date:20180209

Re: [PATCH] Improve pow (C, x) -> exp (log (C) * x) optimization (PR middle-end/84309)

2018-02-09 Thread Richard Biener

On February 10, 2018 12:37:38 AM GMT+01:00, Jakub Jelinek  
wrote:
>Hi!
>
>Apparently the new pow(C,x) -> exp(log(C)*x) if C > 0 optimization
>breaks some package (Marek should know which), as it has 7ulp error.
>Generally one should be prepared for some errors with -ffast-math.
>
>Though, in this case, if the target has c99 runtime and C is
>a positive 0x1pNN it seems much better to use exp2 over exp, for
>C being 2 pow (2, x) is optimized into exp2 (x) and even for other
>values log2(C) will still be some positive or negative integer, so
>in many cases there won't be any rounding errors in the multiplication.
>
>Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?

OK. I wonder whether there are vectorized variants in libmvec? 

Richard. 

>Perhaps we should do something similar if C is a power of 10 (use exp10
>and log10).
>
>2018-02-10  Jakub Jelinek  
>
>   PR middle-end/84309
>   * match.pd (pow(C,x) -> exp(log(C)*x)): Optimize instead into
>   exp2(log2(C)*x) if C is a power of 2 and c99 runtime is available.
>
>   * gcc.dg/pr84309.c: New test.
> 
>--- gcc/match.pd.jj2018-01-26 12:43:23.208922420 +0100
>+++ gcc/match.pd   2018-02-09 18:48:26.412021408 +0100
>@@ -3992,15 +3992,33 @@ DEFINE_INT_AND_FLOAT_ROUND_FN (RINT)
>(logs (pows @0 @1))
>(mult @1 (logs @0
> 
>- /* pow(C,x) -> exp(log(C)*x) if C > 0.  */
>+ /* pow(C,x) -> exp(log(C)*x) if C > 0,
>+or if C is a positive power of 2,
>+pow(C,x) -> exp2(log2(C)*x).  */
>  (for pows (POW)
>   exps (EXP)
>   logs (LOG)
>+  exp2s (EXP2)
>+  log2s (LOG2)
>   (simplify
>(pows REAL_CST@0 @1)
>-(if (real_compare (GT_EXPR, TREE_REAL_CST_PTR (@0), &dconst0)
>-   && real_isfinite (TREE_REAL_CST_PTR (@0)))
>- (exps (mult (logs @0) @1)
>+   (if (real_compare (GT_EXPR, TREE_REAL_CST_PTR (@0), &dconst0)
>+  && real_isfinite (TREE_REAL_CST_PTR (@0)))
>+(with {
>+   const REAL_VALUE_TYPE *const value = TREE_REAL_CST_PTR (@0);
>+   bool use_exp2 = false;
>+   if (targetm.libc_has_function (function_c99_misc)
>+ && value->cl == rvc_normal)
>+   {
>+ REAL_VALUE_TYPE frac_rvt = *value;
>+ SET_REAL_EXP (&frac_rvt, 1);
>+ if (real_equal (&frac_rvt, &dconst1))
>+   use_exp2 = true;
>+   }
>+ }
>+ (if (use_exp2)
>+   (exp2s (mult (log2s @0) @1))
>+   (exps (mult (logs @0) @1)))
> 
>  (for sqrts (SQRT)
>   cbrts (CBRT)
>--- gcc/testsuite/gcc.dg/pr84309.c.jj  2018-02-09 18:54:52.254787678
>+0100
>+++ gcc/testsuite/gcc.dg/pr84309.c 2018-02-09 18:59:02.343636178 +0100
>@@ -0,0 +1,14 @@
>+/* PR middle-end/84309 */
>+/* { dg-do run { target c99_runtime } } */
>+/* { dg-options "-O2 -ffast-math" } */
>+
>+int
>+main ()
>+{
>+  unsigned long a = 1024;
>+  unsigned long b = 16 * 1024;
>+  unsigned long c = __builtin_pow (2, (__builtin_log2 (a) +
>__builtin_log2 (b)) / 2);
>+  if (c != 4096)
>+__builtin_abort ();
>+  return 0;
>+}
>
>   Jakub

Re: [SFN+LVU+IEPM v4 9/9] [IEPM] Introduce inline entry point markers

2018-02-09 Thread Jeff Law

On 02/09/2018 09:39 PM, Alexandre Oliva wrote:
> On Feb  9, 2018, Alexandre Oliva  wrote:
> 
>> On Feb  9, 2018, Jeff Law  wrote:
>>> On 02/08/2018 08:53 PM, Alan Modra wrote:
 On Fri, Feb 09, 2018 at 01:21:27AM -0200, Alexandre Oliva wrote:
> Here's what I checked in, right after the LVU patch.
>
> [IEPM] Introduce inline entry point markers

 One of these two patches breaks ppc64le bootstrap with the assembler
 complaining "Error: view number mismatch" when compiling
 libdecnumber.

>>> I've just passed along a similar failure (.i, .s and command line
>>> options) to Alex for ppc64 (be) building glibc.
> 
>> Thanks.  So, I'm told there are more such issues, that non-asm insn
>> length attrs can't be relied on at this time to be nonzero only when the
>> actual length is not zero.
> 
> I wonder...  In the previously-posted patch, we still regard call insns
> are advancing PC.  I think that's a safe assumption, but...  are there
> any other kinds of patterns we could recognize that would certainly
> generate an actual PC-changing insn?  How about jump insns?  How about
> insns that are SETs, or PARALLELs containing at least one SET?  Does
> anyone see any risk in recognizing those when the length attr is,
> conservatively or not, deemed unreliable?  Any other paterns we could
> recognize to that end?
So given what I've seen in the ARM port, I don't think we can generally
assume any insn advances the PC.

Here's why.

On the ARM there's a little state machine that's used to implement
conditional execution.   When we're in certain states the backend will
generate no code for certain JUMP_INSNs and change the state.

That's fine and dandy.  THe problem is we can't actually tell outside
the ARM backend when that's happened!

You might think we could embed tests of the FSM within the length
computation.  But the state of the FSM is only valid during assembly
output (e.g. final).  But insn lengths are set up during branch
shortening and if you query the length attribute you get value computed
by branch shortening.

The only way around this would be to do what I would consider some
interface abuse and query insn_min_length (not to be confused with
get_attr_min_length).  Then we wouldn't get the cached version and we
could do queries of the state machine on the fly in
dwarf2out_var_location.  But it seems rather icky and I haven't even
been able to convince myself it's really safe.

And while this is specific to the ARM and JUMP_INSNs, I can easily
envision scenarios where other ports could use the same kind of little
state machine for standard INSNs (PA to generate add,tr) or even
CALL_INSNs (target dependent optimization of tail calls).

You also have to worry about ports that try to optimize away nop-insns
that snuck through the optimizers.   I once worked on a port that
couldn't encode a register self copy.  So when one snuck through the
optimizers, we had to deal with it in the output code and I know I've
seen other instances where ports tried to compensate for nop-insns that
snuck through.

So in the end I don't think you can assume that any given insn advances
the PC.  The closest we have is the length attribute, but it has always
supposed to have been conservatively correct for the purposes of branch
shortening.  ie, it can never return a length less than the actual
length, but it is allowed to return a length longer than the actual length.

Jeff

Re: [SFN+LVU+IEPM v4 9/9] [IEPM] Introduce inline entry point markers

2018-02-09 Thread Alexandre Oliva

On Feb  9, 2018, Alexandre Oliva  wrote:

> On Feb  9, 2018, Jeff Law  wrote:
>> On 02/08/2018 08:53 PM, Alan Modra wrote:
>>> On Fri, Feb 09, 2018 at 01:21:27AM -0200, Alexandre Oliva wrote:
 Here's what I checked in, right after the LVU patch.

 [IEPM] Introduce inline entry point markers
>>> 
>>> One of these two patches breaks ppc64le bootstrap with the assembler
>>> complaining "Error: view number mismatch" when compiling
>>> libdecnumber.
>>> 
>> I've just passed along a similar failure (.i, .s and command line
>> options) to Alex for ppc64 (be) building glibc.

> Thanks.  So, I'm told there are more such issues, that non-asm insn
> length attrs can't be relied on at this time to be nonzero only when the
> actual length is not zero.

I wonder...  In the previously-posted patch, we still regard call insns
are advancing PC.  I think that's a safe assumption, but...  are there
any other kinds of patterns we could recognize that would certainly
generate an actual PC-changing insn?  How about jump insns?  How about
insns that are SETs, or PARALLELs containing at least one SET?  Does
anyone see any risk in recognizing those when the length attr is,
conservatively or not, deemed unreliable?  Any other paterns we could
recognize to that end?

-- 
Alexandre Oliva, freedom fighterhttp://FSFLA.org/~lxoliva/
You must be the change you wish to see in the world. -- Gandhi
Be Free! -- http://FSFLA.org/   FSF Latin America board member
Free Software Evangelist|Red Hat Brasil GNU Toolchain Engineer

Re: PR84300, ICE in dwarf2cfi on ppc64le

2018-02-09 Thread Alan Modra

On Fri, Feb 09, 2018 at 08:11:44AM -0600, Segher Boessenkool wrote:
> On Fri, Feb 09, 2018 at 04:12:47PM +1030, Alan Modra wrote:
> >  ;; Use r0 to stop regrename twiddling with lr restore insns emitted
> >  ;; after the call to __morestack.
> >  (define_insn "split_stack_return"
> > -  [(unspec_volatile [(use (reg:SI 0))] UNSPECV_SPLIT_STACK_RETURN)]
> > +  [(unspec_volatile [(use (reg:SI 0)) (use (reg:SI LR_REGNO))]
> > +   UNSPECV_SPLIT_STACK_RETURN)]
> 
> I'm not sure what a USE as input of an UNSPEC means -- it should work
> without the USEs?

Hmm, yes, plain [(reg:SI 0) (reg:SI LR_REGNO)] ought to work.  A sniff
test says it's OK but I'll do the whole bootstrap/regtest cycle before
committing.  I'm not sure why I put the USE there for r0, probably
because I had the r0 dependency outside the unspec initially then
decided it could replace (const_int 0) inside the unspec vector to
save on useless RTL.  I didn't go far enough in trimming the RTL..

-- 
Alan Modra
Australia Development Lab, IBM

Re: [SFN+LVU+IEPM v4 9/9] [IEPM] Introduce inline entry point markers

2018-02-09 Thread Joseph Myers

On Fri, 9 Feb 2018, Joseph Myers wrote:

> I'm seeing regressions from my glibc bot for all of arm, mips, s390 and 
> sh.
> 
> https://sourceware.org/ml/libc-testresults/2018-q1/msg00283.html
> 
> arm and mips are "view number mismatch" building glibc and s390 is the 
> same error but building libgcc, so presumably those are the present issue.  
> sh is GCC segfaults building glibc, so it's less clearly associated with 
> this patch (but GCC mainline r257539 is bad and r257480 is good, in any 
> case).

sh4 is:

during RTL pass: final
In file included from strtof_l.c:45:
strtod_l.c: In function 'strtof_l_internal':
strtod_l.c:1769:1: internal compiler error: Segmentation fault
 }
 ^
0xb98e3f crash_signal
/scratch/jmyers/glibc-bot/src/gcc/gcc/toplev.c:325
0x856b18 maybe_output_next_view
/scratch/jmyers/glibc-bot/src/gcc/gcc/final.c:1726
0x856b18 final_scan_insn(rtx_insn*, _IO_FILE*, int, int, int*)
/scratch/jmyers/glibc-bot/src/gcc/gcc/final.c:2714
0xef36bc print_slot
/scratch/jmyers/glibc-bot/src/gcc/gcc/config/sh/sh.c:2557
0xef6520 output_far_jump(rtx_insn*, rtx_def*)
/scratch/jmyers/glibc-bot/src/gcc/gcc/config/sh/sh.c:2617
0x857098 final_scan_insn(rtx_insn*, _IO_FILE*, int, int, int*)
/scratch/jmyers/glibc-bot/src/gcc/gcc/final.c:3101
0x856c26 final_scan_insn(rtx_insn*, _IO_FILE*, int, int, int*)
/scratch/jmyers/glibc-bot/src/gcc/gcc/final.c:2806
0x858469 final_1
/scratch/jmyers/glibc-bot/src/gcc/gcc/final.c:2088
0x8591d8 rest_of_handle_final
/scratch/jmyers/glibc-bot/src/gcc/gcc/final.c:4637
0x8591d8 execute
/scratch/jmyers/glibc-bot/src/gcc/gcc/final.c:4711

Since it's in maybe_output_next_view, I think it probably comes from these 
patches.

> (It's possible more failures might appear on other architectures once the 
> bot builds the glibc testsuite, that's just failures from building GCC and 
> glibc.)

hppa has a similar ICE building glibc tests, also in 
maybe_output_next_view, so probably also from these patches:

during RTL pass: final
test-tgmath2.c: In function 'test_fma_3':
test-tgmath2.c:421:1: internal compiler error: Segmentation fault
 }
 ^
0xb7525f crash_signal
/scratch/jmyers/glibc-bot/src/gcc/gcc/toplev.c:325
0x835f14 maybe_output_next_view
/scratch/jmyers/glibc-bot/src/gcc/gcc/final.c:1726
0x835f14 final_scan_insn(rtx_insn*, _IO_FILE*, int, int, int*)
/scratch/jmyers/glibc-bot/src/gcc/gcc/final.c:2714
0xed7761 pa_output_call(rtx_insn*, rtx_def*, int)
/scratch/jmyers/glibc-bot/src/gcc/gcc/config/pa/pa.c:8019
0x83642b final_scan_insn(rtx_insn*, _IO_FILE*, int, int, int*)
/scratch/jmyers/glibc-bot/src/gcc/gcc/final.c:3101
0x836026 final_scan_insn(rtx_insn*, _IO_FILE*, int, int, int*)
/scratch/jmyers/glibc-bot/src/gcc/gcc/final.c:2806
0x837629 final_1
/scratch/jmyers/glibc-bot/src/gcc/gcc/final.c:2088
0x8389d6 rest_of_handle_final
/scratch/jmyers/glibc-bot/src/gcc/gcc/final.c:4637
0x8389d6 execute
/scratch/jmyers/glibc-bot/src/gcc/gcc/final.c:4711

-- 
Joseph S. Myers
jos...@codesourcery.com

[Fwd: [PATCH][Bug target/84266] mmintrin.h intrinsic headers for PowerPC code fails on power9]

2018-02-09 Thread Steven Munroe

--- Begin Message ---
This has a simple fix that I have tested on power8 and Seurer are
tested on power9.

While there may be a more elegent coding for the require casts, this is
the simplest change, considering the current stage.

2018-02-09  Steven Munroe  

* config/rs6000/mmintrin.h (_mm_cmpeq_pi32 [_ARCH_PWR9]):
Cast vec_cmpeq result to correct type.
* config/rs6000/mmintrin.h (_mm_cmpgt_pi32 [_ARCH_PWR9]):
Cast vec_cmpgt result to correct type.

Index: gcc/config/rs6000/mmintrin.h
===
--- gcc/config/rs6000/mmintrin.h(revision 257533)
+++ gcc/config/rs6000/mmintrin.h(working copy)
@@ -854,7 +854,7 @@
 
   a = (__vector signed int)vec_splats (__m1);
   b = (__vector signed int)vec_splats (__m2);
-  c = (__vector signed short)vec_cmpeq (a, b);
+  c = (__vector signed int)vec_cmpeq (a, b);
   return (__builtin_unpack_vector_int128 ((__vector __int128_t)c, 0));
 #else
   __m64_union m1, m2, res;
@@ -883,7 +883,7 @@
 
   a = (__vector signed int)vec_splats (__m1);
   b = (__vector signed int)vec_splats (__m2);
-  c = (__vector signed short)vec_cmpgt (a, b);
+  c = (__vector signed int)vec_cmpgt (a, b);
   return (__builtin_unpack_vector_int128 ((__vector __int128_t)c, 0));
 #else
   __m64_union m1, m2, res;

ready to commit?--- End Message ---

Re: [SFN+LVU+IEPM v4 9/9] [IEPM] Introduce inline entry point markers

2018-02-09 Thread Alexandre Oliva

On Feb  9, 2018, Jakub Jelinek  wrote:

> On Fri, Feb 09, 2018 at 07:01:25PM -0200, Alexandre Oliva wrote:
>> So, as discussed on IRC, I'm trying to use a target hook to allow
>> targets to indicate that their length attrs have been assessed for this
>> purpose, and a param to make that overridable, but I'm having trouble
>> initializing the param from the target hook.  How does one do that?

> Better in the default version of the target hook check the param
> whether it should return true or false, and for analyzed targets
> just use an always true (or false, depending on what the hook is)
> as the hook.

I want it to be overridable, so here's what I ended up with.
Testing underway; ok to install if it succeeds?


We need accurate length information to be able to optimize away
locviews that would contain only zero-numbered views, and even to
output correct view numbers.  E.g., if we assume an insn has length
nonzero, i.e., that it will advance PC, we conclude the view number
after it is zero.  If the assumption turns out to be false, the
assembler view zero assert will catch the mistake, but only if we're
using a view-capable assembler.  Otherwise, view numbers will silently
go out of sync between the compiler and the assembler, and thus the
debug info consumer.

Unfortunately, a number of targets seem to fail the essential property
for this to work, namely, that get_min_attr_length() be zero for every
insn that might have length zero.

We thus introduce a target hook that indicates whether we can rely on
the length attribute for this purpose, and a param that can be used to
override it, for testing purposes or to work around errors.

for  gcc/ChangeLog

* params.def (PARAM_USE_ATTR_LENGTH_IN_VIEW_COUNTING): New.
* target.def (attr_length_reliable_for_view_count): New.
* dwarf2out.c: Include params.h.
(unreliable_attr_length_p): New.
(dwarf2out_var_location): Use it.
* doc/tm.texi.in (TARGET_ATTR_LENGTH_RELIABLE_FOR_VIEW_COUNT):
New.
* doc/tm.texi: Rebuilt.
---
 gcc/doc/tm.texi|   13 +
 gcc/doc/tm.texi.in |2 ++
 gcc/dwarf2out.c|   28 +++-
 gcc/params.def |5 +
 gcc/target.def |   14 ++
 5 files changed, 61 insertions(+), 1 deletion(-)

diff --git a/gcc/doc/tm.texi b/gcc/doc/tm.texi
index ddf48cb4b4d2..50fa0d387f32 100644
--- a/gcc/doc/tm.texi
+++ b/gcc/doc/tm.texi
@@ -9986,6 +9986,19 @@ following it should not be run.  Usually true only for 
virtual assembler
 targets.
 @end deftypevr
 
+@deftypevr {Target Hook} bool TARGET_ATTR_LENGTH_RELIABLE_FOR_VIEW_COUNT
+TRUE if get_min_attr_length returns
+zero for any non-asm insn that might have length zero, during final.
+It's ok if it's conservative and always returns zero.
+If, in addition to the essential property, when an insn is known to have
+nonzero length, get_min_attr_length returns a positive number, that will
+enable loclist optimizations.  There is little point in making this TRUE
+otherwise.  Enabling this when the essential property is not met while
+using GNU as 2.30 or newer may cause the assembler to detect the errors
+and fail to assemble; other assemblers will silently let the errors
+through, with incorrect view numbers in debug information.
+@end deftypevr
+
 @defmac ASM_OUTPUT_DWARF_DELTA (@var{stream}, @var{size}, @var{label1}, 
@var{label2})
 A C statement to issue assembly directives that create a difference
 @var{lab1} minus @var{lab2}, using an integer of the given @var{size}.
diff --git a/gcc/doc/tm.texi.in b/gcc/doc/tm.texi.in
index 0aab45f4992c..26b32db77f74 100644
--- a/gcc/doc/tm.texi.in
+++ b/gcc/doc/tm.texi.in
@@ -6929,6 +6929,8 @@ tables, and hence is desirable if it works.
 
 @hook TARGET_NO_REGISTER_ALLOCATION
 
+@hook TARGET_ATTR_LENGTH_RELIABLE_FOR_VIEW_COUNT
+
 @defmac ASM_OUTPUT_DWARF_DELTA (@var{stream}, @var{size}, @var{label1}, 
@var{label2})
 A C statement to issue assembly directives that create a difference
 @var{lab1} minus @var{lab2}, using an integer of the given @var{size}.
diff --git a/gcc/dwarf2out.c b/gcc/dwarf2out.c
index 749c7e3b9bbc..c48c117c8a16 100644
--- a/gcc/dwarf2out.c
+++ b/gcc/dwarf2out.c
@@ -96,6 +96,7 @@ along with GCC; see the file COPYING3.  If not see
 #include "stringpool.h"
 #include "attribs.h"
 #include "file-prefix-map.h" /* remap_debug_filename()  */
+#include "params.h"
 
 static void dwarf2out_source_line (unsigned int, unsigned int, const char *,
   int, bool);
@@ -26860,6 +26861,30 @@ dwarf2out_next_real_insn (rtx_insn *loc_note)
   return next_real;
 }
 
+/* Return TRUE if we should NOT use ATTR_length to optimize locview
+   lists, and FALSE otherwise.
+
+   If ATTR_length is always zero for insns that could have length
+   zero, we can use it, then we'll know when we're guaranteed to have
+   a PC change that implies a view reset.  locview lists whose views
+   are all zero can be optimized out completely.
+
+

Re: RFA: Fix PR 68028: LTO error when compiling PowerPC binaries with single precision floating point

2018-02-09 Thread Segher Boessenkool

Hi!

On Fri, Feb 09, 2018 at 09:37:28AM +, Nick Clifton wrote:
> > I thought you were going to do a patch like the following, to make the
> > e500 cores less special (they are not):
> 
> Sorry - my bad.  I defer to your patch then.  Whatever it takes to get
> the BZ fixed and off the books... :-)

So does it work?  :-)  (I cannot reproduce the problem, there must be
something in your configuration I'm overlooking).


Segher


[ Here it is again, with changelog this time: ]


>From 44c3b661ef75e57415b10ca2cec5ac6427bbed8f Mon Sep 17 00:00:00 2001
Message-Id: 
<44c3b661ef75e57415b10ca2cec5ac6427bbed8f.1518109128.git.seg...@kernel.crashing.org>
From: Segher Boessenkool 
Date: Thu, 8 Feb 2018 16:58:33 +
Subject: [PATCH] Do not mess with rs6000_{single,double}_float (PR68028)

For e500 family cores we do some questionable things with those flags,
which does not work with LTO.  So don't.


2018-02-10  Segher Boessenkool  

* config/rs6000/rs6000.c (rs6000_option_override_internal): Don't
handle rs6000_single_float and rs6000_double_float specially for
e500 family CPUs.

---
 gcc/config/rs6000/rs6000.c | 19 ---
 1 file changed, 19 deletions(-)

diff --git a/gcc/config/rs6000/rs6000.c b/gcc/config/rs6000/rs6000.c
index 5adccd2..73de617 100644
--- a/gcc/config/rs6000/rs6000.c
+++ b/gcc/config/rs6000/rs6000.c
@@ -4809,25 +4809,6 @@ rs6000_option_override_internal (bool global_init_p)
   if (TARGET_DEBUG_REG || TARGET_DEBUG_TARGET)
 rs6000_print_isa_options (stderr, 0, "after subtarget", rs6000_isa_flags);
 
-  /* For the E500 family of cores, reset the single/double FP flags to let us
- check that they remain constant across attributes or pragmas.  */
-
-  switch (rs6000_cpu)
-{
-case PROCESSOR_PPC8540:
-case PROCESSOR_PPC8548:
-case PROCESSOR_PPCE500MC:
-case PROCESSOR_PPCE500MC64:
-case PROCESSOR_PPCE5500:
-case PROCESSOR_PPCE6500:
-  rs6000_single_float = 0;
-  rs6000_double_float = 0;
-  break;
-
-default:
-  break;
-}
-
   if (main_target_opt)
 {
   if (main_target_opt->x_rs6000_single_float != rs6000_single_float)
-- 
1.8.3.1

Re: [RFC PATCH] avoid applying attributes to explicit specializations (PR 83871)

2018-02-09 Thread Martin Sebor


On 02/09/2018 12:52 PM, Jason Merrill wrote:

On 02/08/2018 04:52 PM, Martin Sebor wrote:

I took me a while to find DECL_TEMPLATE_RESULT.  Hopefully
that's the right way to get the primary from a TEMPLATE_DECL.


Yes.


Attached is an updated patch.  It hasn't gone through full
testing yet but please let me know if you'd like me to make
some changes.



+  const char* const whitelist[] = {
+"error", "noreturn", "warning"
+  };


Why whitelist noreturn?  I would expect to want that to be consistent.


I expect noreturn to be used on a primary whose definition
is provided but that's not meant to be used the way the API
is otherwise expected to be.  As in:

   template 
   T [[noreturn]] foo () { throw "not implemented"; }

   template <> int foo();   // implemented elsewhere


Marking that template as noreturn seems pointless, and possibly harmful;
the deprecated, warning, or error attributes would be better for this
situation.


I meant either:

  template 
  T __attribute__ ((noreturn)) foo () { throw "not implemented"; }

  template <> int foo();   // implemented elsewhere

or (sigh)

  template 
  [[noreturn]] T foo () { throw "not implemented"; }

  template <> int foo();   // implemented elsewhere

It lets code like this

  int bar ()
  {
 return foo();
  }

be diagnosed because it's likely a bug (as Clang does with
-Wunreachable-code).  It doesn't stop code like the following
from compiling (which is good) but it instead lets them throw
at runtime which is what foo's author wants.

  void bar ()
  {
 foo();
  }

It's the same as having an "unimplemented" base virtual function
throw an exception when it's called rather than making it pure
and having calls to it abort.  Declaring the base virtual function
noreturn is useful for the same reason (and also diagnosed by
Clang).  I should remember to add the same warning in GCC 9.

I actually had some misgivings about both warning and deprecated
for the white-listing, but not for noreturn.  My (only mild)
concern is that both warning and deprecated functions can and
likely will in some cases still be called, and so using them to
suppress the warning runs the risk that their calls might be
wrong and no one will notice.  Warning cannot be suppressed
so it seems unlikely to be ignored, but deprecated can be.
So I wonder if the white-listing for deprecated should be
conditional on -Wdeprecated being enabled.


-  /* Merge the type qualifiers.  */
-  if (TREE_READONLY (newdecl))
-TREE_READONLY (olddecl) = 1;
   if (TREE_THIS_VOLATILE (newdecl))
 TREE_THIS_VOLATILE (olddecl) = 1;
-  if (TREE_NOTHROW (newdecl))
-TREE_NOTHROW (olddecl) = 1;
+
+  if (merge_attr)
+{
+  /* Merge the type qualifiers.  */
+  if (TREE_READONLY (newdecl))
+TREE_READONLY (olddecl) = 1;
+}
+  else
+{
+  /* Set the bits that correspond to the const function
attributes.  */
+  TREE_READONLY (olddecl) = TREE_READONLY (newdecl);
+}


Let's limit the const/volatile handling here to non-functions, and
handle the const/noreturn attributes for functions in the later hunk
along with nothrow/malloc/pure.


I had to keep the TREE_HAS_VOLATILE handling as is since it
applies to functions too (has side-effects).  Otherwise the
attr-nothrow-2.C test fails.



I try to avoid reformatting code that I'm not actually changing, to
avoid noise in e.g. git blame.


Okay.

Attached is an updated patch.  Besides the changes above
I also used a plural "attributes" in the name of the option
for consistency with other attribute options, expanded
the documentation, and enhanced one of the tests.

Bootstrapped on x86_64-linux (tests still running).

Martin
PR c++/83871 - wrong code for attribute const and pure on distinct template specializations
PR c++/83503 - [8 Regression] bogus -Wattributes for const and pure on function template specialization

gcc/ChangeLog:

	PR c++/83871
	* gcc/doc/invoke.texi (-Wmissing-attribute): New option.

gcc/c-family/ChangeLog:

	PR c++/83871
	* c.opt (-Wmissing-attribute): New option.

gcc/cp/ChangeLog:

	PR c++/83871
	PR c++/83503
	* cp-tree.h (warn_spec_missing_attributes): New function.
	((check_explicit_specialization): Add an argument.  Call the above
	function.
	* decl.c (duplicate_decls): Avoid applying primary function template's
	attributes to its explicit specializations.

gcc/testsuite/ChangeLog:

	PR c++/83871
	PR c++/83503
	* g++.dg/ext/attr-const-pure.C: New test.
	* g++.dg/ext/attr-malloc.C: New test.
	* g++.dg/ext/attr-nonnull.C: New test.
	* g++.dg/ext/attr-nothrow.C: New test.
	* g++.dg/ext/attr-nothrow-2.C: New test.
	* g++.dg/ext/attr-returns-nonnull.C: New test.
	* g++.dg/Wmissing-attribute.C: New test.

diff --git a/gcc/c-family/c.opt b/gcc/c-family/c.opt
index 9c71726..1039274 100644
--- a/gcc/c-family/c.opt
+++ b/gcc/c-family/c.opt
@@ -781,6 +781,11 @@ Wtemplates
 C++ ObjC++ Var(warn_templates) Warning
 Warn on primary template declaration.
 
+Wmissing-attributes
+C ObjC C+

Re: [SFN+LVU+IEPM v4 9/9] [IEPM] Introduce inline entry point markers

2018-02-09 Thread Jakub Jelinek

On Fri, Feb 09, 2018 at 07:01:25PM -0200, Alexandre Oliva wrote:
> So, as discussed on IRC, I'm trying to use a target hook to allow
> targets to indicate that their length attrs have been assessed for this
> purpose, and a param to make that overridable, but I'm having trouble
> initializing the param from the target hook.  How does one do that?

Better in the default version of the target hook check the param
whether it should return true or false, and for analyzed targets
just use an always true (or false, depending on what the hook is)
as the hook.

> By disabling it altogether, we won't get the assembler checks or
> incorrect view numbers in debug info, but we will get plenty of all-zero
> locview lists.  Oh well...  I guess at this point that's better than
> wrong debug info or assembler failures.

For the debugging of the target issues, you can also use a hack:
in final.c for instructions that have minimum length longer than zero
emit a label right before emitting the insn and after it:
.Lhacke1:
.if (.Lhacke1 - .Lhackb1) == 0
.error "zero length"
.endif
(or perhaps just:
.if (. - .Lhack1) == 0
.error "zero length"
.endif
), then make check various targets with that.

Jakub

[PATCH] Improve pow (C, x) -> exp (log (C) * x) optimization (PR middle-end/84309)

2018-02-09 Thread Jakub Jelinek

Hi!

Apparently the new pow(C,x) -> exp(log(C)*x) if C > 0 optimization
breaks some package (Marek should know which), as it has 7ulp error.
Generally one should be prepared for some errors with -ffast-math.

Though, in this case, if the target has c99 runtime and C is
a positive 0x1pNN it seems much better to use exp2 over exp, for
C being 2 pow (2, x) is optimized into exp2 (x) and even for other
values log2(C) will still be some positive or negative integer, so
in many cases there won't be any rounding errors in the multiplication.

Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?

Perhaps we should do something similar if C is a power of 10 (use exp10
and log10).

2018-02-10  Jakub Jelinek  

PR middle-end/84309
* match.pd (pow(C,x) -> exp(log(C)*x)): Optimize instead into
exp2(log2(C)*x) if C is a power of 2 and c99 runtime is available.

* gcc.dg/pr84309.c: New test.
 
--- gcc/match.pd.jj 2018-01-26 12:43:23.208922420 +0100
+++ gcc/match.pd2018-02-09 18:48:26.412021408 +0100
@@ -3992,15 +3992,33 @@ DEFINE_INT_AND_FLOAT_ROUND_FN (RINT)
(logs (pows @0 @1))
(mult @1 (logs @0
 
- /* pow(C,x) -> exp(log(C)*x) if C > 0.  */
+ /* pow(C,x) -> exp(log(C)*x) if C > 0,
+or if C is a positive power of 2,
+pow(C,x) -> exp2(log2(C)*x).  */
  (for pows (POW)
   exps (EXP)
   logs (LOG)
+  exp2s (EXP2)
+  log2s (LOG2)
   (simplify
(pows REAL_CST@0 @1)
-(if (real_compare (GT_EXPR, TREE_REAL_CST_PTR (@0), &dconst0)
-&& real_isfinite (TREE_REAL_CST_PTR (@0)))
- (exps (mult (logs @0) @1)
+   (if (real_compare (GT_EXPR, TREE_REAL_CST_PTR (@0), &dconst0)
+   && real_isfinite (TREE_REAL_CST_PTR (@0)))
+(with {
+   const REAL_VALUE_TYPE *const value = TREE_REAL_CST_PTR (@0);
+   bool use_exp2 = false;
+   if (targetm.libc_has_function (function_c99_misc)
+  && value->cl == rvc_normal)
+{
+  REAL_VALUE_TYPE frac_rvt = *value;
+  SET_REAL_EXP (&frac_rvt, 1);
+  if (real_equal (&frac_rvt, &dconst1))
+use_exp2 = true;
+}
+ }
+ (if (use_exp2)
+   (exp2s (mult (log2s @0) @1))
+   (exps (mult (logs @0) @1)))
 
  (for sqrts (SQRT)
   cbrts (CBRT)
--- gcc/testsuite/gcc.dg/pr84309.c.jj   2018-02-09 18:54:52.254787678 +0100
+++ gcc/testsuite/gcc.dg/pr84309.c  2018-02-09 18:59:02.343636178 +0100
@@ -0,0 +1,14 @@
+/* PR middle-end/84309 */
+/* { dg-do run { target c99_runtime } } */
+/* { dg-options "-O2 -ffast-math" } */
+
+int
+main ()
+{
+  unsigned long a = 1024;
+  unsigned long b = 16 * 1024;
+  unsigned long c = __builtin_pow (2, (__builtin_log2 (a) + __builtin_log2 
(b)) / 2);
+  if (c != 4096)
+__builtin_abort ();
+  return 0;
+}

Jakub

[committed] Better fix for the -fsanitize=undefined -fopemp -flto ICEs (PR sanitizer/83987)

2018-02-09 Thread Jakub Jelinek

Hi!

Apparently the cp_free_lang_data fix wasn't sufficient, the
DECL_OMP_PRIVATIZED_MEMBER vars can be copied to child functions
but without the lang specific data so it doesn't trigger any longer,
and even trying to free it in cp_free_lang_data if it satisfies
omp_member_access_dummy_var doesn't work.  So, this patch instead
looks for these variables at the end of the omp lowering phase
in methods, and removes them from GIMPLE_BIND bind vars as well as
BLOCK_VARS in the block tree.

Bootstrapped/regtested on x86_64-linux and i686-linux, committed to trunk.

2018-02-10  Jakub Jelinek  

PR sanitizer/83987
* omp-low.c (maybe_remove_omp_member_access_dummy_vars,
remove_member_access_dummy_vars): New functions.
(lower_omp_for, lower_omp_taskreg, lower_omp_target,
lower_omp_1, execute_lower_omp): Use them.

* tree.c (cp_free_lang_data): Revert 2018-01-23 change.

* g++.dg/ubsan/pr83987-2.C: New test.

--- gcc/omp-low.c.jj2018-01-24 17:26:25.216357997 +0100
+++ gcc/omp-low.c   2018-02-09 15:26:25.653403123 +0100
@@ -3208,6 +3208,43 @@ scan_omp (gimple_seq *body_p, omp_contex
 
 /* Re-gimplification and code generation routines.  */
 
+/* Remove omp_member_access_dummy_var variables from gimple_bind_vars
+   of BIND if in a method.  */
+
+static void
+maybe_remove_omp_member_access_dummy_vars (gbind *bind)
+{
+  if (DECL_ARGUMENTS (current_function_decl)
+  && DECL_ARTIFICIAL (DECL_ARGUMENTS (current_function_decl))
+  && (TREE_CODE (TREE_TYPE (DECL_ARGUMENTS (current_function_decl)))
+ == POINTER_TYPE))
+{
+  tree vars = gimple_bind_vars (bind);
+  for (tree *pvar = &vars; *pvar; )
+   if (omp_member_access_dummy_var (*pvar))
+ *pvar = DECL_CHAIN (*pvar);
+   else
+ pvar = &DECL_CHAIN (*pvar);
+  gimple_bind_set_vars (bind, vars);
+}
+}
+
+/* Remove omp_member_access_dummy_var variables from BLOCK_VARS of
+   block and its subblocks.  */
+
+static void
+remove_member_access_dummy_vars (tree block)
+{
+  for (tree *pvar = &BLOCK_VARS (block); *pvar; )
+if (omp_member_access_dummy_var (*pvar))
+  *pvar = DECL_CHAIN (*pvar);
+else
+  pvar = &DECL_CHAIN (*pvar);
+
+  for (block = BLOCK_SUBBLOCKS (block); block; block = BLOCK_CHAIN (block))
+remove_member_access_dummy_vars (block);
+}
+
 /* If a context was created for STMT when it was scanned, return it.  */
 
 static omp_context *
@@ -6961,6 +6998,7 @@ lower_omp_for (gimple_stmt_iterator *gsi
   pop_gimplify_context (new_stmt);
 
   gimple_bind_append_vars (new_stmt, ctx->block_vars);
+  maybe_remove_omp_member_access_dummy_vars (new_stmt);
   BLOCK_VARS (block) = gimple_bind_vars (new_stmt);
   if (BLOCK_VARS (block))
 TREE_USED (block) = 1;
@@ -7413,6 +7451,7 @@ lower_omp_taskreg (gimple_stmt_iterator
   /* Declare all the variables created by mapping and the variables
  declared in the scope of the parallel body.  */
   record_vars_into (ctx->block_vars, child_fn);
+  maybe_remove_omp_member_access_dummy_vars (par_bind);
   record_vars_into (gimple_bind_vars (par_bind), child_fn);
 
   if (ctx->record_type)
@@ -7781,6 +7820,7 @@ lower_omp_target (gimple_stmt_iterator *
   /* Declare all the variables created by mapping and the variables
 declared in the scope of the target body.  */
   record_vars_into (ctx->block_vars, child_fn);
+  maybe_remove_omp_member_access_dummy_vars (tgt_bind);
   record_vars_into (gimple_bind_vars (tgt_bind), child_fn);
 }
 
@@ -8772,6 +8812,7 @@ lower_omp_1 (gimple_stmt_iterator *gsi_p
   break;
 case GIMPLE_BIND:
   lower_omp (gimple_bind_body_ptr (as_a  (stmt)), ctx);
+  maybe_remove_omp_member_access_dummy_vars (as_a  (stmt));
   break;
 case GIMPLE_OMP_PARALLEL:
 case GIMPLE_OMP_TASK:
@@ -8976,6 +9017,16 @@ execute_lower_omp (void)
   all_contexts = NULL;
 }
   BITMAP_FREE (task_shared_vars);
+
+  /* If current function is a method, remove artificial dummy VAR_DECL created
+ for non-static data member privatization, they aren't needed for
+ debuginfo nor anything else, have been already replaced everywhere in the
+ IL and cause problems with LTO.  */
+  if (DECL_ARGUMENTS (current_function_decl)
+  && DECL_ARTIFICIAL (DECL_ARGUMENTS (current_function_decl))
+  && (TREE_CODE (TREE_TYPE (DECL_ARGUMENTS (current_function_decl)))
+ == POINTER_TYPE))
+remove_member_access_dummy_vars (DECL_INITIAL (current_function_decl));
   return 0;
 }
 
--- gcc/cp/tree.c.jj2018-02-09 06:44:24.855812073 +0100
+++ gcc/cp/tree.c   2018-02-09 13:44:22.938670431 +0100
@@ -5273,16 +5273,6 @@ cp_free_lang_data (tree t)
 /* We do not need the leftover chaining of namespaces from the
binding level.  */
 DECL_CHAIN (t) = NULL_TREE;
-  /* Set DECL_VALUE_EXPRs of OpenMP privatized member artificial
- decls to error_mark_node.  These are DECL_IGNORED_P and after
- OpenMP lowering they

Re: [PATCH] Character length cleanup for Coarray Fortran library

2018-02-09 Thread Steve Kargl

On Fri, Feb 09, 2018 at 06:10:46PM +0200, Janne Blomqvist wrote:
> Following the change to use size_t for Fortran character lengths (PR
> 78534), this patch modifies the Coarray ABI in a similar way. The
> single-image implementation that is included in libgfortran is
> updated, but this needs corresponding work in the OpenCoarray library
> as well for multi-image support. Damian, can you look into that?
> 
> Regtested on x86_64-pc-linux-gnu, Ok for trunk?
> 

Janne, I've scanned the patch and it appears to a
fairly straight foward mechanical s/int/size_t
(and similar for nodes).  OK to commit (although
you may want to wait for a response from Damain
or Izaak Beekman).

As this breaks an ABI and we've broken it for 8.0,
I think this should go into trunk.  You may need
to ping Jakub or Richard for final approval.

-- 
Steve

Re: [Patch, Fortran] PR 84273: Reject allocatable passed-object dummy argument (proc_ptr_47.f90)

2018-02-09 Thread Steve Kargl

On Fri, Feb 09, 2018 at 06:13:34PM +0100, Janus Weil wrote:
> 
> the attached patch fixes some checking code for PASS arguments in
> procedure-pointer components, which does not properly account for the
> fact that the PASS argument needs to be polymorphic.
> 
> [The reason for this issue is probably that PPCs were mostly
> implemented before polymorphism was available. The corresponding
> pass-arg checks for TBPs are ok.]
> 
> The patch also fixes an invalid test case (which was detected thanks
> to Neil Carlson). It regtests cleanly on x86_64-linux-gnu. Ok for
> trunk?

Janus,

The patch looks ok to me.  Trunk is in regression and doc
fixes only mode, so you'll probably need to ping Jakub or
Richard (ie., release engineer) for an ok. 

PS: Welcome back!  We can definitely use your talent.

-- 
Steve

Re: [PATCH rs6000] Fix for builtins-4-runnable.c testcase FAIL on power7/BE 32-bit

2018-02-09 Thread Segher Boessenkool

Hi Carl,

On Fri, Feb 09, 2018 at 08:22:57AM -0800, Carl Love wrote:
> The following patch contains fixes for the builtins4-runnable.c test in
> 32-bit mode for the current GCC 8 branch.  This is issue #290 in Bill
> Schmidt's issues.  The int128 and uint128 variable types are not
> supported in 32-bit mode.  The tests were moved to a new test file and
> restricted to run only when int128 type is supported.

> 2018-02-09  Carl Love  
> 
>   * gcc.target/powerpc/builtins-4-runnable.c (main): Move 
> int128 and
>   uint128 tests to new testfile.
>   * gcc.target/powerpc/builtins-4-int128-runnable.c: New testfile
> for
>   int128 and uint128 tests.
>   * gcc.target/powerpc/powerpc.exp: Add builtins-4-int128-
> runnable.c to
>   list oftorture tests.

Missing space (and the layout is messed up a bit, that's your mailer
I guess -- the patch is wrapped as well, that's not good).

Okay for trunk.  Thanks!


Segher

Re: [RFC][PATCH v2] PR preprocessor/83173: Additional check before decrementing highest_location

2018-02-09 Thread Mike Gulick

Hi David,

Here is a new version of the linemap patch (see my earlier emails for an updated
version of the test code).  I spent several days poring over the line map code,
and I think I understand it a little better now.  I also discovered
-fdump-internal-locations, and it was a big help.  If I use this switch with the
code from the test I posted with an initial offset of 0x6001, I get the
following (I have removed some unnecessary fields for brevity):

ORDINARY MAP: 21
  source_location interval: 1610612805 <= loc < 1610612809
  file: location-overflow-test-pr83173.c
  starting at line: 3
  reason: 2 (LC_RENAME)
location-overflow-test-pr83173.c:  3|loc:1610612805|  { dg-do preprocess }
location-overflow-test-pr83173.c:  4|loc:1610612806|*/
location-overflow-test-pr83173.c:  5|loc:1610612807|
location-overflow-test-pr83173.c:  6|loc:1610612808|#include 
"location-overflow-test-pr83173.h"

ORDINARY MAP: 22
  source_location interval: 1610612809 <= loc < 1610612810
  file: location-overflow-test-pr83173.h
  starting at line: 1
  reason: 0 (LC_ENTER)
location-overflow-test-pr83173.h:  1|loc:1610612809|#include 
"location-overflow-test-pr83173-1.h"

ORDINARY MAP: 23
  source_location interval: 1610612810 <= loc < 1610612812
  file: location-overflow-test-pr83173-1.h
  starting at line: 1
  reason: 0 (LC_ENTER)
location-overflow-test-pr83173-1.h:  1|loc:1610612810|#pragma once
location-overflow-test-pr83173-1.h:  2|loc:1610612811|#define 
LOCATION_OVERFLOW_TEST_PR83173_1_H

ORDINARY MAP: 24
  source_location interval: 1610612812 <= loc < 1610612812
  file: location-overflow-test-pr83173.h
  starting at line: 2
  reason: 1 (LC_LEAVE)

ORDINARY MAP: 25
  source_location interval: 1610612812 <= loc < 1610612814
  file: location-overflow-test-pr83173-2.h
  starting at line: 1
  reason: 0 (LC_ENTER)
location-overflow-test-pr83173-2.h:  1|loc:1610612812|#pragma once
location-overflow-test-pr83173-2.h:  2|loc:1610612813|#define 
LOCATION_OVERFLOW_TEST_PR83173_2_H

ORDINARY MAP: 26
  source_location interval: 1610612814 <= loc < 1610612815
  file: location-overflow-test-pr83173.h
  starting at line: 2
  reason: 1 (LC_LEAVE)
location-overflow-test-pr83173.h:  2|loc:1610612814|#include 
"location-overflow-test-pr83173-2.h"

ORDINARY MAP: 27
  source_location interval: 1610612815 <= loc < 1610612823
  file: location-overflow-test-pr83173.c
  starting at line: 7
  reason: 1 (LC_LEAVE)

ORDINARY MAP: 28
  source_location interval: 1610612823 <= loc < 1610612825
  file: location-overflow-test-pr83173.c
  starting at line: 15
  reason: 2 (LC_RENAME)

Notice that map 24 and 25 have the same start_location.  This is because
line_table->highest_location is being decremented when it should not be.  This
new patch uses a more robust method to check whether the source line that
highest_location refers to is greater than loc (which is the source_location of
the #include).  Instead of comparing the source_location values directly, I use
linemap_get_expansion_line to map the source location to a line number, and then
compare the line numbers.  This handles the case that loc is an adhoc location.
I noticed that after updating the filenames in the test to be
'location-overflow-test-pr83173-1.h', the source locations passed into
_cpp_stack_include were adhoc locations when I wasn't using the location offset
plugin.  This is because these filenames are now 34 characters long, which is
longer than the 5 range bits that an ordinary source_location allows.

I spent a lot of time comparing the -fdump-internal-locations output when this
patch is in use to the unpatched version.  The only differences occur when a
#include is the last line in the file.  For the case where the source location >
LINE_MAP_MAX_SOURCE_LOCATION, the ordinary map 25 (see above) now starts at
1610612813 instead of 1610612812.  For the case where the source location <
LINE_MAP_MAX_LOCATION_WITH_COLS, the start location of that map has increased by
32 (which is effectively one column).  For the case where the source location is
between LINE_MAP_MAX_LOCATION_WITH_COLS and LINE_MAP_MAX_SOURCE_LOCATION (so
that there are no range bits), the output was unchanged.  I can post these
location map dumps if you would like to see them.

There are no regressions in the gcc test suite from this patch.

The other addition in this version if the patch is a small improvement to the
-fdump-internal-locations output to include the reason and included_from fields
from the line_map_ordinary data structure.  I aos fixed an issue with the output
alignment.  I found this useful when evaluating these changes.

Thanks,
Mike
From e3f23f66ad18598fc21a9223b7f29572a14d1588 Mon Sep 17 00:00:00 2001
From: Mike Gulick 
Date: Fri, 1 Dec 2017 09:43:22 -0500
Subject: [PATCH v2] PR preprocessor/83173: Additional check before
 decrementing highest_location

2017-12-01  Mike Gulick  

	PR preprocessor/83173
	* libcpp/files.c (_cpp_stack_include): Check if
	line_table->highest_location is past current lin

Re: [SFN+LVU+IEPM v4 9/9] [IEPM] Introduce inline entry point markers

2018-02-09 Thread Joseph Myers

I'm seeing regressions from my glibc bot for all of arm, mips, s390 and 
sh.

https://sourceware.org/ml/libc-testresults/2018-q1/msg00283.html

arm and mips are "view number mismatch" building glibc and s390 is the 
same error but building libgcc, so presumably those are the present issue.  
sh is GCC segfaults building glibc, so it's less clearly associated with 
this patch (but GCC mainline r257539 is bad and r257480 is good, in any 
case).

(It's possible more failures might appear on other architectures once the 
bot builds the glibc testsuite, that's just failures from building GCC and 
glibc.)

-- 
Joseph S. Myers
jos...@codesourcery.com

Re: [PATCH rs6000] Fix for builtins-4-runnable.c testcase FAIL on power7/BE 32-bit

2018-02-09 Thread Carl Love

GCC maintainers:

As pointed out, the dg arguments in new test file was missing the
{target 128}.  I updated the arguments to be 

{ dg-do run { target { int128 && powerpc64*-*-* } } }

Without the powerpc64*-*-* the test was still tried to compiled the
test case in 32-bit mode on BE and failed.

I reran the tests manually to verify the tests run as expected.  On a
Power 8 BE machine with GCC configured for -m32 and -m64 I ran:
 
make -k check-gcc RUNTESTFLAGS="--target_board=unix'{-m32,-m64}' 
powerpc.exp=builtins-4-int128-runnable.c" > out

The pertinent output was:

Test Run By carll on Fri Feb  9 16:42:34 2018   
   
Native configuration is powerpc64-unknown-linux-gnu 
 === gcc tests ===  
  

   
Schedule of variations: 
   
unix/-m32   
   
unix/-m64   
   

   
Running target unix/-m32
   
Using /usr/share/dejagnu/baseboards/unix.exp as board description file for 
target. 
Using /usr/share/dejagnu/config/unix.exp as generic interface file for 
target. 
Using /home/carll/GCC/gcc-builtin4/gcc/testsuite/config/default.exp as 
tool-and-target-specific interface file.
Running 
/home/carll/GCC/gcc-builtin4/gcc/testsuite/gcc.target/powerpc/powerpc.exp ...   
   

   
  === gcc Summary for unix/-m32 === 
   

   
# of unsupported tests 6
   
Running target unix/-m64
   
Using /usr/share/dejagnu/baseboards/unix.exp as board description file for 
target. 
Using /usr/share/dejagnu/config/unix.exp as generic interface file for 
target. 
Using /home/carll/GCC/gcc-builtin4/gcc/testsuite/config/default.exp as 
tool-and-target-specific interface file.
Running 
/home/carll/GCC/gcc-builtin4/gcc/testsuite/gcc.target/powerpc/powerpc.exp ...   
   

   
  === gcc Summary for unix/-m64 === 
   

   
# of expected passes12  
   

   
  === gcc Summary ===   
   

   
# of expected passes12  
   
# of unsupported tests 6 
/home/carll/GCC/build/gcc-builtin4/gcc/xgcc  version 8.0.1 20180209 
(experimental) (GCC) 

On a Power 8 LE machine I ran:
   
make -k check-gcc RUNTESTFLAGS="powerpc.exp=builtins-4-int128-runnable.c" > out 
  

The pertinent output was:

Test Run By carll on Fri Feb  9 15:56:29 2018

C++ PATCH for c++/81917, ICE with self-referential partial specialization

2018-02-09 Thread Jason Merrill

In this testcase we were managing to instantiate the same class twice,
leading to an ICE.  Fixed by setting TYPE_BEING_DEFINED before we
consider what partial specialization to use.  This also causes several
existing testcases to error about incomplete types rather than excess
template instantiation depth, which seems appropriate.

Tested x86_64-pc-linux-gnu, applying to trunk.
commit 46063f6cb79e4f37040a326a2da1378e0e98cb38
Author: Jason Merrill 
Date:   Fri Feb 9 14:26:03 2018 -0500

PR c++/81917 - ICE with void_t and partial specialization.

* pt.c (instantiate_class_template_1): Set TYPE_BEING_DEFINED before
calling most_specialized_partial_spec.

diff --git a/gcc/cp/pt.c b/gcc/cp/pt.c
index 9c57709e7a7..281604594ad 100644
--- a/gcc/cp/pt.c
+++ b/gcc/cp/pt.c
@@ -10347,14 +10347,14 @@ instantiate_class_template_1 (tree type)
   templ = most_general_template (CLASSTYPE_TI_TEMPLATE (type));
   gcc_assert (TREE_CODE (templ) == TEMPLATE_DECL);
 
+  /* Mark the type as in the process of being defined.  */
+  TYPE_BEING_DEFINED (type) = 1;
+
   /* Determine what specialization of the original template to
  instantiate.  */
   t = most_specialized_partial_spec (type, tf_warning_or_error);
   if (t == error_mark_node)
-{
-  TYPE_BEING_DEFINED (type) = 1;
-  return error_mark_node;
-}
+return error_mark_node;
   else if (t)
 {
   /* This TYPE is actually an instantiation of a partial
@@ -10379,16 +10379,16 @@ instantiate_class_template_1 (tree type)
   /* If the template we're instantiating is incomplete, then clearly
  there's nothing we can do.  */
   if (!COMPLETE_TYPE_P (pattern))
-return type;
+{
+  /* We can try again later.  */
+  TYPE_BEING_DEFINED (type) = 0;
+  return type;
+}
 
   /* If we've recursively instantiated too many templates, stop.  */
   if (! push_tinst_level (type))
 return type;
 
-  /* Now we're really doing the instantiation.  Mark the type as in
- the process of being defined.  */
-  TYPE_BEING_DEFINED (type) = 1;
-
   /* We may be in the middle of deferred access check.  Disable
  it now.  */
   push_deferring_access_checks (dk_no_deferred);
diff --git a/gcc/testsuite/g++.dg/cpp0x/alias-decl-62.C 
b/gcc/testsuite/g++.dg/cpp0x/alias-decl-62.C
new file mode 100644
index 000..6f1fa4584ac
--- /dev/null
+++ b/gcc/testsuite/g++.dg/cpp0x/alias-decl-62.C
@@ -0,0 +1,22 @@
+// PR c++/81917
+// { dg-do compile { target c++11 } }
+
+template  using a = void;
+template  struct b
+{
+  typedef int c;
+};
+template  class b>;
+template ::c> class f;
+template  class g { };
+template  class h
+{
+  class i;
+  typedef g> j;
+  class i
+  {
+j k;   // { dg-error "incomplete" }
+  };
+};
+h H;
+
diff --git a/gcc/testsuite/g++.dg/cpp0x/initlist-template2.C 
b/gcc/testsuite/g++.dg/cpp0x/initlist-template2.C
index 40e3075c162..0df0d4e89c3 100644
--- a/gcc/testsuite/g++.dg/cpp0x/initlist-template2.C
+++ b/gcc/testsuite/g++.dg/cpp0x/initlist-template2.C
@@ -1,6 +1,5 @@
 // PR c++/71747
 // { dg-do compile { target c++11 } }
-// { dg-options -ftemplate-depth=20 }
 
 template < bool > struct A
 {
@@ -14,10 +13,8 @@ template < bool > struct A
 template < bool, typename = int > struct F; 
 template < bool X > 
 // should be: struct F < X, typename A < A < X > {} () >::type > 
-struct F < X, typename A < F < X > {} () >::type > // { dg-error "" }
+struct F < X, typename A < F < X > {} () >::type >
 {
 };
 
-F < true > f;
-
-// { dg-prune-output "compilation terminated" }
+F < true > f;  // { dg-error "" }
diff --git a/gcc/testsuite/g++.dg/template/crash125.C 
b/gcc/testsuite/g++.dg/template/crash125.C
index 448f74682d4..de41b99a927 100644
--- a/gcc/testsuite/g++.dg/template/crash125.C
+++ b/gcc/testsuite/g++.dg/template/crash125.C
@@ -13,6 +13,4 @@ struct TraitCheckImpl > {
 typedef void Complete;
 };
 
-Swappable s;  // { dg-error "depth" }
-
-// { dg-prune-output "compilation terminated" }
+Swappable s;  // { dg-error "" }
diff --git a/gcc/testsuite/g++.dg/template/pr51488.C 
b/gcc/testsuite/g++.dg/template/pr51488.C
index 4979a22787c..794a6cfe067 100644
--- a/gcc/testsuite/g++.dg/template/pr51488.C
+++ b/gcc/testsuite/g++.dg/template/pr51488.C
@@ -2,6 +2,4 @@
 
 template struct s;
 template struct s::a> {};
-s ca;  // { dg-error "depth" }
-
-// { dg-prune-output "compilation terminated" }
+s ca;  // { dg-error "" }
diff --git a/gcc/testsuite/g++.dg/template/pr55843.C 
b/gcc/testsuite/g++.dg/template/pr55843.C
index 467dd827218..04079ed1175 100644
--- a/gcc/testsuite/g++.dg/template/pr55843.C
+++ b/gcc/testsuite/g++.dg/template/pr55843.C
@@ -1,5 +1,3 @@
-// { dg-options "-ftemplate-depth-8" }
-
 template< typename T > struct type_wrapper {
 };
 typedef char (&yes_tag)[2];
@@ -7,11 +5,11 @@ template struct if_c {
 };
 template< typename T > struct has_type {
   struct gcc_3_2_wknd {
-template< typename U > static yes_tag test( type_wrapper cons

Re: [SFN+LVU+IEPM v4 9/9] [IEPM] Introduce inline entry point markers

2018-02-09 Thread Alexandre Oliva

On Feb  9, 2018, Jeff Law  wrote:

> On 02/08/2018 08:53 PM, Alan Modra wrote:
>> On Fri, Feb 09, 2018 at 01:21:27AM -0200, Alexandre Oliva wrote:
>>> Here's what I checked in, right after the LVU patch.
>>> 
>>> [IEPM] Introduce inline entry point markers
>> 
>> One of these two patches breaks ppc64le bootstrap with the assembler
>> complaining "Error: view number mismatch" when compiling
>> libdecnumber.
>> 
> I've just passed along a similar failure (.i, .s and command line
> options) to Alex for ppc64 (be) building glibc.

Thanks.  So, I'm told there are more such issues, that non-asm insn
length attrs can't be relied on at this time to be nonzero only when the
actual length is not zero.  When we fail that and regard a zero-length
insn as nonzero and that's all we have between two subsequent views, the
assembler (GNU as 2.30) will catch and report the error.  With other
assemblers, the incorrect view reset will go unnoticed and result in
incorrect debug info.

So, as discussed on IRC, I'm trying to use a target hook to allow
targets to indicate that their length attrs have been assessed for this
purpose, and a param to make that overridable, but I'm having trouble
initializing the param from the target hook.  How does one do that?

Meanwhile, here's the (WIP) patch that introduces the param defaulting
to no locview optimizations (which can only be performed at points in
which the compiler knows there are PC changes); I'd like it to default
to targetm.attr_length_reliable_for_view_count, but I couldn't figure
out how to do so.  Help?


By disabling it altogether, we won't get the assembler checks or
incorrect view numbers in debug info, but we will get plenty of all-zero
locview lists.  Oh well...  I guess at this point that's better than
wrong debug info or assembler failures.




disable locview optimizations for now

---
 gcc/doc/tm.texi|   13 +
 gcc/doc/tm.texi.in |2 ++
 gcc/dwarf2out.c|4 +++-
 gcc/opts.c |6 ++
 gcc/params.def |5 +
 gcc/target.def |   13 +
 6 files changed, 42 insertions(+), 1 deletion(-)

diff --git a/gcc/doc/tm.texi b/gcc/doc/tm.texi
index ddf48cb4b4d2..50fa0d387f32 100644
--- a/gcc/doc/tm.texi
+++ b/gcc/doc/tm.texi
@@ -9986,6 +9986,19 @@ following it should not be run.  Usually true only for 
virtual assembler
 targets.
 @end deftypevr
 
+@deftypevr {Target Hook} bool TARGET_ATTR_LENGTH_RELIABLE_FOR_VIEW_COUNT
+TRUE if get_min_attr_length returns
+zero for any non-asm insn that might have length zero, during final.
+It's ok if it's conservative and always returns zero.
+If, in addition to the essential property, when an insn is known to have
+nonzero length, get_min_attr_length returns a positive number, that will
+enable loclist optimizations.  There is little point in making this TRUE
+otherwise.  Enabling this when the essential property is not met while
+using GNU as 2.30 or newer may cause the assembler to detect the errors
+and fail to assemble; other assemblers will silently let the errors
+through, with incorrect view numbers in debug information.
+@end deftypevr
+
 @defmac ASM_OUTPUT_DWARF_DELTA (@var{stream}, @var{size}, @var{label1}, 
@var{label2})
 A C statement to issue assembly directives that create a difference
 @var{lab1} minus @var{lab2}, using an integer of the given @var{size}.
diff --git a/gcc/doc/tm.texi.in b/gcc/doc/tm.texi.in
index 0aab45f4992c..26b32db77f74 100644
--- a/gcc/doc/tm.texi.in
+++ b/gcc/doc/tm.texi.in
@@ -6929,6 +6929,8 @@ tables, and hence is desirable if it works.
 
 @hook TARGET_NO_REGISTER_ALLOCATION
 
+@hook TARGET_ATTR_LENGTH_RELIABLE_FOR_VIEW_COUNT
+
 @defmac ASM_OUTPUT_DWARF_DELTA (@var{stream}, @var{size}, @var{label1}, 
@var{label2})
 A C statement to issue assembly directives that create a difference
 @var{lab1} minus @var{lab2}, using an integer of the given @var{size}.
diff --git a/gcc/dwarf2out.c b/gcc/dwarf2out.c
index 749c7e3b9bbc..6a6520a05e8b 100644
--- a/gcc/dwarf2out.c
+++ b/gcc/dwarf2out.c
@@ -96,6 +96,7 @@ along with GCC; see the file COPYING3.  If not see
 #include "stringpool.h"
 #include "attribs.h"
 #include "file-prefix-map.h" /* remap_debug_filename()  */
+#include "params.h"
 
 static void dwarf2out_source_line (unsigned int, unsigned int, const char *,
   int, bool);
@@ -26926,7 +26927,8 @@ dwarf2out_var_location (rtx_insn *loc_note)
   || GET_CODE (loc_note) == ASM_INPUT
   || asm_noperands (loc_note) >= 0)
;
-  else if (get_attr_min_length (loc_note) > 0)
+  else if (PARAM_VALUE (PARAM_USE_ATTR_LENGTH_IN_VIEW_COUNTING)
+  && get_attr_min_length (loc_note) > 0)
RESET_NEXT_VIEW (cur_line_info_table->view);
 
   return;
diff --git a/gcc/opts.c b/gcc/opts.c
index f2795f98bf44..d4b3065caad6 100644
--- a/gcc/opts.c
+++ b/gcc/opts.c
@@ -1039,6 +1039,12 @@ finish_options (struct gcc_options *opts, struct 
gcc_options *opts_set,
   if ((opts->x_flag_saniti

Re: [PATCH, rs6000] (v2) Update vsx-vector-6-le.c tests

2018-02-09 Thread Segher Boessenkool

Hi Will,

On Fri, Feb 09, 2018 at 09:44:28AM -0600, Will Schmidt wrote:
> Noted during review of test results on P9.   Due to changes and improvements,
> our codegen is different for this test on power9.
> Modified the existing test to target P8 and added a P9 variant with updated
> counts.
> 
> (v2) Updated/added dg-requires and dg-skip-if stanzas pre review and re-check
> of the test results.
> 
> Sniff-tested good, now runs clean on P9 where it did not before. Runs clean on
> P8LE, P8BE, P7.
> OK for trunk?

Yes, thanks!


Segher


> 2018-02-09  Will Schmidt  
> 
>   * gcc.target/powerpc/vsx-vector-6-le.c:  Update dg-* stanzas.
>   * gcc.target/powerpc/vsx-vector-6-le.p9.c:  New.

Re: [PATCH] Improve dead code elimination with -fsanitize=address (PR84307)

2018-02-09 Thread Richard Biener

On February 9, 2018 7:07:45 PM GMT+01:00, Jakub Jelinek  
wrote:
>On Fri, Feb 09, 2018 at 07:01:08PM +0100, Richard Biener wrote:
>> >which indeed fixes the testcase and seems not to break asan.exp.
>> 
>> Huh. Need to double check why that makes sense ;) 
>
>I think it does, for both ASAN_CHECK and ASAN_MARK the pointer argument
>is the second one, the first one is an integer argument with flags.
>And ASAN_MARK, both poison and unpoison, works kind like a clobber on
>the
>referenced variable, before unpoison it is generally inaccessible and
>after
>poison too.

Ah, indeed. 

Richard. 

>   Jakub

Re: [RFC PATCH] avoid applying attributes to explicit specializations (PR 83871)

2018-02-09 Thread Jason Merrill


On 02/08/2018 04:52 PM, Martin Sebor wrote:

I took me a while to find DECL_TEMPLATE_RESULT.  Hopefully
that's the right way to get the primary from a TEMPLATE_DECL.


Yes.


Attached is an updated patch.  It hasn't gone through full
testing yet but please let me know if you'd like me to make
some changes.



+  const char* const whitelist[] = {
+    "error", "noreturn", "warning"
+  };


Why whitelist noreturn?  I would expect to want that to be consistent.


I expect noreturn to be used on a primary whose definition
is provided but that's not meant to be used the way the API
is otherwise expected to be.  As in:

   template 
   T [[noreturn]] foo () { throw "not implemented"; }

   template <> int foo();   // implemented elsewhere


Marking that template as noreturn seems pointless, and possibly harmful; 
the deprecated, warning, or error attributes would be better for this 
situation.



-  /* Merge the type qualifiers.  */
-  if (TREE_READONLY (newdecl))
-   TREE_READONLY (olddecl) = 1;
   if (TREE_THIS_VOLATILE (newdecl))
TREE_THIS_VOLATILE (olddecl) = 1;
-  if (TREE_NOTHROW (newdecl))
-   TREE_NOTHROW (olddecl) = 1;
+
+  if (merge_attr)
+   {
+ /* Merge the type qualifiers.  */
+ if (TREE_READONLY (newdecl))
+   TREE_READONLY (olddecl) = 1;
+   }
+  else
+   {
+ /* Set the bits that correspond to the const function attributes.  */
+ TREE_READONLY (olddecl) = TREE_READONLY (newdecl);
+   }


Let's limit the const/volatile handling here to non-functions, and 
handle the const/noreturn attributes for functions in the later hunk 
along with nothrow/malloc/pure.



-  /* Merge parameter attributes. */
+  /* Merge or assign parameter attributes. */
   tree oldarg, newarg;
-  for (oldarg = DECL_ARGUMENTS(olddecl), 
-   newarg = DECL_ARGUMENTS(newdecl);

-   oldarg && newarg;
-   oldarg = DECL_CHAIN(oldarg), newarg = DECL_CHAIN(newarg)) {
-  DECL_ATTRIBUTES (newarg)
-  = (*targetm.merge_decl_attributes) (oldarg, newarg);
-  DECL_ATTRIBUTES (oldarg) = DECL_ATTRIBUTES (newarg);
-  }
-  
+  for (oldarg = DECL_ARGUMENTS (olddecl),

+newarg = DECL_ARGUMENTS (newdecl);
+  oldarg && newarg;
+  oldarg = DECL_CHAIN (oldarg), newarg = DECL_CHAIN (newarg))
+   {
+ DECL_ATTRIBUTES (newarg)
+   = (*targetm.merge_decl_attributes) (oldarg, newarg);
+ DECL_ATTRIBUTES (oldarg) = DECL_ATTRIBUTES (newarg);
+   }
+


I try to avoid reformatting code that I'm not actually changing, to 
avoid noise in e.g. git blame.


Jason

Go patch committed: Track //go:nointerface in export data

2018-02-09 Thread Ian Lance Taylor

In the Go frontend, the magic //go:nointerface comment, used for field
tracking, was only implemented for conversions to interface types in
the same package.  This patch records it in the export data, so that
it works as expected for types imported from a different package.
Bootstrapped and ran Go testsuite on x86_64-pc-linux-gnu.  Committed
to mainline.

Ian
Index: gcc/go/gofrontend/MERGE
===
--- gcc/go/gofrontend/MERGE (revision 257527)
+++ gcc/go/gofrontend/MERGE (working copy)
@@ -1,4 +1,4 @@
-7e94bac5676afc8188677c98ecb263c78c1a7f8d
+89105404f94005ffa8e2b08df78015dc9ac91362
 
 The first line of this file holds the git revision number of the last
 merge done from the gofrontend repository.
Index: gcc/go/gofrontend/gogo.cc
===
--- gcc/go/gofrontend/gogo.cc   (revision 257527)
+++ gcc/go/gofrontend/gogo.cc   (working copy)
@@ -5189,17 +5189,24 @@ Function::defer_stack(Location location)
 void
 Function::export_func(Export* exp, const std::string& name) const
 {
-  Function::export_func_with_type(exp, name, this->type_);
+  Function::export_func_with_type(exp, name, this->type_,
+ this->is_method() && this->nointerface());
 }
 
 // Export a function with a type.
 
 void
 Function::export_func_with_type(Export* exp, const std::string& name,
-   const Function_type* fntype)
+   const Function_type* fntype, bool nointerface)
 {
   exp->write_c_string("func ");
 
+  if (nointerface)
+{
+  go_assert(fntype->is_method());
+  exp->write_c_string("/*nointerface*/ ");
+}
+
   if (fntype->is_method())
 {
   exp->write_c_string("(");
@@ -5280,10 +5287,21 @@ Function::import_func(Import* imp, std::
  Typed_identifier** preceiver,
  Typed_identifier_list** pparameters,
  Typed_identifier_list** presults,
- bool* is_varargs)
+ bool* is_varargs,
+ bool* nointerface)
 {
   imp->require_c_string("func ");
 
+  *nointerface = false;
+  if (imp->match_c_string("/*"))
+{
+  imp->require_c_string("/*nointerface*/ ");
+  *nointerface = true;
+
+  // Only a method can be nointerface.
+  go_assert(imp->peek_char() == '(');
+}
+
   *preceiver = NULL;
   if (imp->peek_char() == '(')
 {
@@ -6213,6 +6231,32 @@ Bindings_snapshot::check_goto_defs(Locat
 
 // Class Function_declaration.
 
+// Whether this declares a method.
+
+bool
+Function_declaration::is_method() const
+{
+  return this->fntype_->is_method();
+}
+
+// Whether this method should not be included in the type descriptor.
+
+bool
+Function_declaration::nointerface() const
+{
+  go_assert(this->is_method());
+  return (this->pragmas_ & GOPRAGMA_NOINTERFACE) != 0;
+}
+
+// Record that this method should not be included in the type
+// descriptor.
+
+void
+Function_declaration::set_nointerface()
+{
+  this->pragmas_ |= GOPRAGMA_NOINTERFACE;
+}
+
 // Return the function descriptor.
 
 Expression*
Index: gcc/go/gofrontend/gogo.h
===
--- gcc/go/gofrontend/gogo.h(revision 257527)
+++ gcc/go/gofrontend/gogo.h(working copy)
@@ -1476,13 +1476,14 @@ class Function
   // Export a function with a type.
   static void
   export_func_with_type(Export*, const std::string& name,
-   const Function_type*);
+   const Function_type*, bool nointerface);
 
   // Import a function.
   static void
   import_func(Import*, std::string* pname, Typed_identifier** receiver,
  Typed_identifier_list** pparameters,
- Typed_identifier_list** presults, bool* is_varargs);
+ Typed_identifier_list** presults, bool* is_varargs,
+ bool* nointerface);
 
  private:
   // Type for mapping from label names to Label objects.
@@ -1607,6 +1608,10 @@ class Function_declaration
   location() const
   { return this->location_; }
 
+  // Return whether this function declaration is a method.
+  bool
+  is_method() const;
+
   const std::string&
   asm_name() const
   { return this->asm_name_; }
@@ -1628,6 +1633,16 @@ class Function_declaration
 this->pragmas_ = pragmas;
   }
 
+  // Whether this method should not be included in the type
+  // descriptor.
+  bool
+  nointerface() const;
+
+  // Record that this method should not be included in the type
+  // descriptor.
+  void
+  set_nointerface();
+
   // Return an expression for the function descriptor, given the named
   // object for this function.  This may only be called for functions
   // without a closure.  This will be an immutable struct with one
@@ -1652,7 +1667,10 @@ class Function_declaration
   // Export a function declaration.
   void
   export_func(Export* exp, const std::string& name) const
-  { Function

Re: [PATCH][C++] Fix PR84281

2018-02-09 Thread Jason Merrill

OK.

On Fri, Feb 9, 2018 at 7:44 AM, Richard Biener  wrote:
>
> The following patch optimizes the equal element case of
> cxx_eval_vec_init_1 to use a RANGE_EXPR in the built CONSTRUCTOR
> instead of repeating the same element over and over.  This
> cuts down memory use of the invalid testcase in the PR from
> tens of GB to nothing and makes us rejct it instantanously
> instead of by going OOM.
>
> Bootstrapped and tested on x86_64-unknown-linux-gnu.
>
> Ok for trunk?
>
> Thanks,
> Richard.
>
> 2018-02-09  Richard Biener  
>
> PR c++/84281
> * constexpr.c (cxx_eval_vec_init_1): Use a RANGE_EXPR to compact
> uniform constructors and delay allocating them fully.
>
> Index: gcc/cp/constexpr.c
> ===
> --- gcc/cp/constexpr.c  (revision 257491)
> +++ gcc/cp/constexpr.c  (working copy)
> @@ -2885,7 +2885,6 @@ cxx_eval_vec_init_1 (const constexpr_ctx
>unsigned HOST_WIDE_INT max = tree_to_uhwi (array_type_nelts_top (atype));
>verify_ctor_sanity (ctx, atype);
>vec **p = &CONSTRUCTOR_ELTS (ctx->ctor);
> -  vec_alloc (*p, max + 1);
>bool pre_init = false;
>unsigned HOST_WIDE_INT i;
>
> @@ -2978,13 +2977,14 @@ cxx_eval_vec_init_1 (const constexpr_ctx
> {
>   if (new_ctx.ctor != ctx->ctor)
> eltinit = new_ctx.ctor;
> - for (i = 1; i < max; ++i)
> -   {
> - idx = build_int_cst (size_type_node, i);
> - CONSTRUCTOR_APPEND_ELT (*p, idx, unshare_constructor (eltinit));
> -   }
> + tree range = build2 (RANGE_EXPR, size_type_node,
> +  build_int_cst (size_type_node, 1),
> +  build_int_cst (size_type_node, max - 1));
> + CONSTRUCTOR_APPEND_ELT (*p, range, unshare_constructor (eltinit));
>   break;
> }
> +  else
> +   vec_safe_reserve (*p, max + 1);
>  }
>
>if (!*non_constant_p)

C++ PATCH for c++/84296, ICE with qualified-id

2018-02-09 Thread Jason Merrill

In my patch to fix template/inherit4.C after 83714, I made all
SCOPE_REFs instantiation-dependent if the current class has dependent
bases.  This was excessive; only a reference to a non-static member of
an unknown base needs to be treated as instantiation-dependent.

Tested x86_64-pc-linux-gnu, applying to trunk.
commit 6d409191cfcd993863e791672ae6de78444796b1
Author: Jason Merrill 
Date:   Fri Feb 9 13:31:27 2018 -0500

PR c++/84296 - ICE with qualified-id in template.

PR c++/83714
* pt.c (unknown_base_ref_p): New.
(instantiation_dependent_scope_ref_p): Use it instead of
any_dependent_bases_p.

diff --git a/gcc/cp/pt.c b/gcc/cp/pt.c
index a9e47701527..9c57709e7a7 100644
--- a/gcc/cp/pt.c
+++ b/gcc/cp/pt.c
@@ -24012,6 +24012,30 @@ dependent_scope_p (tree scope)
  && !currently_open_class (scope));
 }
 
+/* T is a SCOPE_REF.  Return whether it represents a non-static member of
+   an unknown base of 'this' (and is therefore instantiation-dependent).  */
+
+static bool
+unknown_base_ref_p (tree t)
+{
+  if (!current_class_ptr)
+return false;
+
+  tree mem = TREE_OPERAND (t, 1);
+  if (shared_member_p (mem))
+return false;
+
+  tree cur = current_nonlambda_class_type ();
+  if (!any_dependent_bases_p (cur))
+return false;
+
+  tree ctx = TREE_OPERAND (t, 0);
+  if (DERIVED_FROM_P (ctx, cur))
+return false;
+
+  return true;
+}
+
 /* T is a SCOPE_REF; return whether we need to consider it
 instantiation-dependent so that we can check access at instantiation
 time even though we know which member it resolves to.  */
@@ -24021,9 +24045,7 @@ instantiation_dependent_scope_ref_p (tree t)
 {
   if (DECL_P (TREE_OPERAND (t, 1))
   && CLASS_TYPE_P (TREE_OPERAND (t, 0))
-  /* A dependent base could make a member inaccessible in the current
-class.  */
-  && !any_dependent_bases_p ()
+  && !unknown_base_ref_p (t)
   && accessible_in_template_p (TREE_OPERAND (t, 0),
   TREE_OPERAND (t, 1)))
 return false;
diff --git a/gcc/testsuite/g++.dg/template/scope5.C 
b/gcc/testsuite/g++.dg/template/scope5.C
new file mode 100644
index 000..629225cd556
--- /dev/null
+++ b/gcc/testsuite/g++.dg/template/scope5.C
@@ -0,0 +1,66 @@
+// PR c++/84296
+
+namespace b {}
+namespace c {
+using namespace b;
+}
+namespace b {
+template  struct e { static const int f = d; };
+}
+template  struct g;
+template 
+struct g : h::template ab {};
+struct k {
+  template  struct m { typedef typename g::n o; };
+};
+template  struct ac;
+struct r {
+  typedef ac p;
+};
+template  struct s : k {
+  template 
+  struct ab : q::template t::template ab {};
+};
+struct ad {
+  typedef int u;
+};
+template  struct ae;
+template  struct ah {
+  typedef ae ai;
+  typedef typename ai::template w::o n;
+};
+struct x {
+  template  struct ab : ah {};
+};
+struct y {
+  struct z {
+template  struct t : x {};
+  };
+  struct aj : s {};
+};
+template  struct ak {
+  typedef y::aj al;
+  typedef typename al::m::o o;
+};
+struct am {
+  enum { an };
+};
+template  struct ao {};
+template  struct ap : af::aq {};
+template <> struct ae {
+  template  struct w;
+  template  struct w {
+typedef typename as::p o;
+  };
+};
+enum { a = b::e<0>::f };
+template  class au;
+template  struct ac : ao { typedef c::e aq; };
+template  void ay(aw, i, ax) {
+  au::o>::f> > az();
+}
+void v() {
+  ad a;
+  void az();
+  ay(az, a, v);
+}

patch to fix PR57193

2018-02-09 Thread Vladimir Makarov


  The following patch fixes

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=57193

  The patch introduces a new heuristic to change order of coloring.  
Allocnos conflicting with other allocnos preferring hard registers are 
colored first when other higher-level heuristics can not differ them.  
On x86_64 this new heuristic results in practically the same SPEC2000 
performance and code size.


  The patch was successfully bootstrapped and tested on x86-64.

  Committed as rev. 257537.


Index: ChangeLog
===
--- ChangeLog	(revision 257536)
+++ ChangeLog	(working copy)
@@ -1,3 +1,15 @@
+2018-02-09  Vladimir Makarov  
+
+	PR rtl-optimization/57193
+	* ira-color.c (struct allocno_color_data): Add member
+	conflict_allocno_hard_prefs.
+	(update_conflict_allocno_hard_prefs): New.
+	(bucket_allocno_compare_func): Add a preference based on
+	conflict_allocno_hard_prefs.
+	(push_allocno_to_stack): Update conflict_allocno_hard_prefs.
+	(color_allocnos): Remove a dead code.  Initiate
+	conflict_allocno_hard_prefs.  Call update_costs_from_prefs.
+
 2018-02-09  Jakub Jelinek  
 
 	PR target/84226
Index: testsuite/ChangeLog
===
--- testsuite/ChangeLog	(revision 257536)
+++ testsuite/ChangeLog	(working copy)
@@ -1,3 +1,8 @@
+2018-02-09  Vladimir Makarov  
+
+	PR rtl-optimization/57193
+	* gcc.target/i386/57193.c: New.
+
 2018-02-09  Jakub Jelinek  
 
 	PR target/84226
Index: ira-color.c
===
--- ira-color.c	(revision 257157)
+++ ira-color.c	(working copy)
@@ -112,6 +112,9 @@ struct allocno_color_data
  available for the allocno allocation.  It is number of the
  profitable hard regs.  */
   int available_regs_num;
+  /* Sum of frequencies of hard register preferences of all
+ conflicting allocnos which are not the coloring stack yet.  */
+  int conflict_allocno_hard_prefs;
   /* Allocnos in a bucket (used in coloring) chained by the following
  two members.  */
   ira_allocno_t next_bucket_allocno;
@@ -1435,6 +1438,36 @@ update_costs_from_copies (ira_allocno_t
   update_costs_from_allocno (allocno, hard_regno, 1, decr_p, record_p);
 }
 
+/* Update conflict_allocno_hard_prefs of allocnos conflicting with
+   ALLOCNO.  */
+static void
+update_conflict_allocno_hard_prefs (ira_allocno_t allocno)
+{
+  int l, nr = ALLOCNO_NUM_OBJECTS (allocno);
+  
+  for (l = 0; l < nr; l++)
+{
+  ira_object_t conflict_obj, obj = ALLOCNO_OBJECT (allocno, l);
+  ira_object_conflict_iterator oci;
+  
+  FOR_EACH_OBJECT_CONFLICT (obj, conflict_obj, oci)
+	{
+	  ira_allocno_t conflict_a = OBJECT_ALLOCNO (conflict_obj);
+	  allocno_color_data_t conflict_data = ALLOCNO_COLOR_DATA (conflict_a);
+	  ira_pref_t pref;
+
+	  if (!(hard_reg_set_intersect_p
+		(ALLOCNO_COLOR_DATA (allocno)->profitable_hard_regs,
+		 conflict_data->profitable_hard_regs)))
+	continue;
+	  for (pref = ALLOCNO_PREFS (allocno);
+	   pref != NULL;
+	   pref = pref->next_pref)
+	conflict_data->conflict_allocno_hard_prefs += pref->freq;
+	}
+}
+}
+
 /* Restore costs of allocnos connected to ALLOCNO by copies as it was
before updating costs of these allocnos from given allocno.  This
is a wise thing to do as if given allocno did not get an expected
@@ -2223,7 +2256,7 @@ bucket_allocno_compare_func (const void
 {
   ira_allocno_t a1 = *(const ira_allocno_t *) v1p;
   ira_allocno_t a2 = *(const ira_allocno_t *) v2p;
-  int diff, freq1, freq2, a1_num, a2_num;
+  int diff, freq1, freq2, a1_num, a2_num, pref1, pref2;
   ira_allocno_t t1 = ALLOCNO_COLOR_DATA (a1)->first_thread_allocno;
   ira_allocno_t t2 = ALLOCNO_COLOR_DATA (a2)->first_thread_allocno;
   int cl1 = ALLOCNO_CLASS (a1), cl2 = ALLOCNO_CLASS (a2);
@@ -2253,6 +2286,11 @@ bucket_allocno_compare_func (const void
   a2_num = ALLOCNO_COLOR_DATA (a2)->available_regs_num;
   if ((diff = a2_num - a1_num) != 0)
 return diff;
+  /* Push allocnos with minimal conflict_allocno_hard_prefs first.  */
+  pref1 = ALLOCNO_COLOR_DATA (a1)->conflict_allocno_hard_prefs;
+  pref2 = ALLOCNO_COLOR_DATA (a2)->conflict_allocno_hard_prefs;
+  if ((diff = pref1 - pref2) != 0)
+return diff;
   return ALLOCNO_NUM (a2) - ALLOCNO_NUM (a1);
 }
 
@@ -2339,7 +2377,8 @@ delete_allocno_from_bucket (ira_allocno_
 /* Put allocno A onto the coloring stack without removing it from its
bucket.  Pushing allocno to the coloring stack can result in moving
conflicting allocnos from the uncolorable bucket to the colorable
-   one.  */
+   one.  Update conflict_allocno_hard_prefs of the conflicting
+   allocnos which are not on stack yet.  */
 static void
 push_allocno_to_stack (ira_allocno_t a)
 {
@@ -2369,15 +2408,19 @@ push_allocno_to_stack (ira_allocno_t a)
   FOR_EACH_OBJECT_CONFLICT (obj, conflict_obj, oci)
 	{
 	  ira_allocno_t conflict_a = OBJECT_ALLOCNO (conflict_obj);
-	  
+	  ira_pref_t p

Re: PR84239, Reimplement CET intrinsics for rdssp/incssp insn

2018-02-09 Thread Sandra Loosemore


On 02/09/2018 05:50 AM, Tsimbalist, Igor V wrote:

Introduce a couple of new CET intrinsics for reading and updating a shadow stack
pointer (_get_ssp and _inc_ssp), which are more user friendly. They replace the 
existing
_rdssp[d|q] and _incssp[d|q] instrinsics. The _get_ssp intrinsic has more 
deterministic
semantic: it returns a value of the shadow stack pointer if HW is CET capable 
and
0 otherwise.

Ok for trunk?


Just reviewing the documentation part:


diff --git a/gcc/doc/extend.texi b/gcc/doc/extend.texi
index cb9df97..9f25dd9 100644
--- a/gcc/doc/extend.texi
+++ b/gcc/doc/extend.texi
@@ -12461,6 +12461,7 @@ instructions, but allow the compiler to schedule those 
calls.
 * TILEPro Built-in Functions::
 * x86 Built-in Functions::
 * x86 transactional memory intrinsics::
+* x86 control-flow protection intrinsics::
 @end menu
 
 @node AArch64 Built-in Functions

@@ -21772,13 +21773,17 @@ void __builtin_ia32_wrpkru (unsigned int)
 unsigned int __builtin_ia32_rdpkru ()
 @end smallexample
 
-The following built-in functions are available when @option{-mcet} is used.

-They are used to support Intel Control-flow Enforcment Technology (CET).
-Each built-in function generates the  machine instruction that is part of the
-function's name.
+The following built-in functions are available when @option{-mcet} or
+@option{-mshstk} option is used.  They support shadow stack
+machine instructions from Intel Control-flow Enforcment Technology (CET).


s/Enforcment/Enforcement/


+Each built-in function generates the  machine instruction that is part
+of the function's name.  These are the internal low level functions.


s/low level/low-level/


+Normally the functions in @ref{x86 control-flow protection intrinsics}
+should be used instead.
+
 @smallexample
-unsigned int __builtin_ia32_rdsspd (unsigned int)
-unsigned long long __builtin_ia32_rdsspq (unsigned long long)
+unsigned int __builtin_ia32_rdsspd (void)
+unsigned long long __builtin_ia32_rdsspq (void)
 void __builtin_ia32_incsspd (unsigned int)
 void __builtin_ia32_incsspq (unsigned long long)
 void __builtin_ia32_saveprevssp(void);
@@ -21885,6 +21890,51 @@ else
 Note that, in most cases, the transactional and non-transactional code
 must synchronize together to ensure consistency.
 
+@node x86 control-flow protection intrinsics

+@subsection x86 Control-Flow Protection Intrinsics
+
+@deftypefn {CET Function} {ret_type} _get_ssp (void)
+The @code{ret_type} is @code{unsigned long long} for x86-64 platform
+and @code{unsigned int} for x86 pltform.


I'd prefer the sentence about the return type be placed after the 
description of what the function does.  And please fix typos:

s/x86-64 platform/64-bit targets/
s/x86 pltform/32-bit targets/


+Get the current value of shadow stack pointer if shadow stack support
+from Intel CET is enabled in the HW or @code{0} otherwise.


s/HW/hardware,/


+@end deftypefn
+
+@deftypefn {CET Function} void _inc_ssp (unsigned int)
+Increment the current shadow stack pointer by the size specified by the
+function argument.  For security reason only unsigned byte value is used
+from the argument.  Therefore for the size greater than @code{255} the
+function should be called several times.


How about rephrasing the last two sentences:

The argument is masked to a byte value for security reasons, so to 
increment by more than 255 bytes you must call the function multiple times.



+@end deftypefn
+
+The shadow stack unwind code looks like:
+
+@smallexample
+#include 
+
+/* Unwind the shadow stack for EH.  */
+#define _Unwind_Frames_Extra(x)\
+  do   \
+@{ \
+  _Unwind_Word ssp = _get_ssp ();  \
+  if (ssp != 0)\
+   @{  \
+ _Unwind_Word tmp = (x);   \
+ while (tmp > 255)  \
+   @{  \
+ _inc_ssp (tmp);   \
+ tmp -= 255;   \
+   @}  \
+ _inc_ssp (tmp);   \
+   @}  \
+@} \
+while (0)
+@end smallexample


Tabs in Texinfo input don't work well.  Please use spaces to format code 
environments.



+
+@noindent
+This code runs unconditionally on all x86-64 processors and all x86
+processors that support multi-byte NOP instructions.


s/x86-64 and all x86/32-bit and 64-bit/


+
 @node Target Format Checks
 @section Format Checks Specific to Particular Target Machines
 


-Sandra

Re: [PATCH] Improve dead code elimination with -fsanitize=address (PR84307)

2018-02-09 Thread Jakub Jelinek

On Fri, Feb 09, 2018 at 07:01:08PM +0100, Richard Biener wrote:
> >which indeed fixes the testcase and seems not to break asan.exp.
> 
> Huh. Need to double check why that makes sense ;) 

I think it does, for both ASAN_CHECK and ASAN_MARK the pointer argument
is the second one, the first one is an integer argument with flags.
And ASAN_MARK, both poison and unpoison, works kind like a clobber on the
referenced variable, before unpoison it is generally inaccessible and after
poison too.

Jakub

Re: [PATCH] Improve dead code elimination with -fsanitize=address (PR84307)

2018-02-09 Thread Jakub Jelinek

On Fri, Feb 09, 2018 at 05:40:09PM +0100, Richard Biener wrote:
> On February 9, 2018 5:08:24 PM GMT+01:00, Paolo Bonzini  
> wrote:
> >Hi all,
> >
> >in this PR, a dead reference to a function pointer cannot be optimized
> >out by the compiler because some ASAN_MARK UNPOISON calls, which are
> >placed before the store, cause the containing struct to escape.
> >(Without -fsanitize=address, the dead code is eliminated by the first
> >DSE pass).
> >
> >The fix, which works at least for this testcase, is to copy part of the
> >sanopt dead code elimination pass early, so that the compiler can see
> >fewer UNPOISON calls.  I am not sure this is general enough, due to
> >the very limited data flow analysis done by
> >sanitize_asan_mark_unpoison.
> >Another possibility which I considered but did not implement is to mark
> >the UNPOISON calls so that they do not cause the parameter to escape.
> 
> I'd do this, thus assign proper fnspec attributes to the asan functions. 

It already uses ".R.." "fn spec".

Jakub

Re: [PATCH] Avoid BSWAP with floating point modes on rs6000 (PR target/84226)

2018-02-09 Thread Segher Boessenkool

Hi Jakub,

On Fri, Feb 09, 2018 at 07:43:16AM +0100, Jakub Jelinek wrote:
> BSWAP is documented as:
> 
> @item (bswap:@var{m} @var{x})
> Represents the value @var{x} with the order of bytes reversed, carried out
> in mode @var{m}, which must be a fixed-point machine mode.
> The mode of @var{x} must be @var{m} or @code{VOIDmode}.
> 
> The fixed-point is something used widely in rtl.texi and is very confusing
> now that we have FIXED_POINT_TYPE floats, I assume it talks about integral
> modes or, because it is also used with vector modes, about INTEGRAL_MODE_P
> modes.  My understanding is that bswap on a vector integral mode is meant to
> be bswap of each element individually.

Right.  The documentation could use some improvement ;-)

> The rs6000 backend uses bswap not just on scalar integral modes and
> vector integral modes, but also on V4SF and V2DF, but ICEs in simplify-rtx.c
> where we don't expect bswap to be used on SF/DFmode (vector bswap is handled
> as bswap of each element).
> 
> The following patch adjusts the rs6000 backend to use well defined bswaps
> on corresponding integral modes instead (and also does what we've done in
> i386 backend years ago, avoid subreg on the lhs because it breaks combine
> attempts to optimize it).

> Bootstrapped/regtested on powerpc64{,le}-linux, on powerpc64-linux including
> -m64/-m32, ok for trunk?
> 
> 2018-02-09  Jakub Jelinek  
> 
>   PR target/84226
>   * config/rs6000/vsx.md (p9_xxbrq_v16qi): Change input operand
>   constraint from =wa to wa.  Avoid a subreg on the output operand,
>   instead use a pseudo and subreg it in a move.
>   (p9_xxbrd_): Changed to ...
>   (p9_xxbrd_v2di): ... this insn, without VSX_D iterator.
>   (p9_xxbrd_v2df): New expander.
>   (p9_xxbrw_): Changed to ...
>   (p9_xxbrw_v4si): ... this insn, without VSX_W iterator.
>   (p9_xxbrw_v4sf): New expander.
> 
>   * gcc.target/powerpc/pr84226.c: New test.

> +(define_expand "p9_xxbrd_v2df"
> +  [(use (match_operand:V2DF 0 "vsx_register_operand" "=wa"))
> +   (use (match_operand:V2DF 1 "vsx_register_operand" "wa"))]
> +  "TARGET_P9_VECTOR"
> +{
> +  rtx op0 = gen_reg_rtx (V2DImode);
> +  rtx op1 = gen_lowpart (V2DImode, operands[1]);
> +  emit_insn (gen_p9_xxbrd_v2di (op0, op1));
> +  emit_move_insn (operands[0], gen_lowpart (V2DFmode, op0));
> +  DONE;
> +})

Okay for trunk with or without such an improvement.  Thanks!


Segher

Re: [PATCH] Improve dead code elimination with -fsanitize=address (PR84307)

2018-02-09 Thread Richard Biener

On February 9, 2018 6:23:37 PM GMT+01:00, Paolo Bonzini  wrote:
>On 09/02/2018 17:40, Richard Biener wrote:
>> On February 9, 2018 5:08:24 PM GMT+01:00, Paolo Bonzini
> wrote:
>>> Another possibility which I considered but did not implement is to
>mark
>>> the UNPOISON calls so that they do not cause the parameter to
>escape.
>> 
>> I'd do this, thus assign proper fnspec attributes to the asan
>functions. 
>
>Hmm, actually that might be as simple as fixing a typo:
>
>diff --git a/gcc/internal-fn.def b/gcc/internal-fn.def
>index 5970d0e..15d6151 100644
>--- a/gcc/internal-fn.def
>+++ b/gcc/internal-fn.def
>@@ -255,8 +255,8 @@ DEF_INTERNAL_FN (UBSAN_PTR, ECF_LEAF | ECF_NOTHROW,
>".R.")
> DEF_INTERNAL_FN (UBSAN_OBJECT_SIZE, ECF_LEAF | ECF_NOTHROW, NULL)
> DEF_INTERNAL_FN (ABNORMAL_DISPATCHER, ECF_NORETURN, NULL)
>DEF_INTERNAL_FN (BUILTIN_EXPECT, ECF_CONST | ECF_LEAF | ECF_NOTHROW,
>NULL)
>-DEF_INTERNAL_FN (ASAN_CHECK, ECF_TM_PURE | ECF_LEAF | ECF_NOTHROW,
>".R...")
>-DEF_INTERNAL_FN (ASAN_MARK, ECF_LEAF | ECF_NOTHROW, ".R..")
>+DEF_INTERNAL_FN (ASAN_CHECK, ECF_TM_PURE | ECF_LEAF | ECF_NOTHROW,
>"..R..")
>+DEF_INTERNAL_FN (ASAN_MARK, ECF_LEAF | ECF_NOTHROW, "..W.")
>DEF_INTERNAL_FN (ASAN_POISON, ECF_LEAF | ECF_NOTHROW | ECF_NOVOPS,
>NULL)
>DEF_INTERNAL_FN (ASAN_POISON_USE, ECF_LEAF | ECF_NOTHROW | ECF_NOVOPS,
>NULL)
>DEF_INTERNAL_FN (ADD_OVERFLOW, ECF_CONST | ECF_LEAF | ECF_NOTHROW,
>NULL)
>
>which indeed fixes the testcase and seems not to break asan.exp.

Huh. Need to double check why that makes sense ;) 

>'W' is needed to avoid breaking the pr78541.c testcase, and I think it
>makes sense since ASAN_MARK is "writing" the state of the object
>(in the test case FRE moves a dereference across a poisoning).
>
>I'll look at it next week.  Someone maybe should take a look at ubsan
>fnspecs too.
>
>Paolo

Re: [PR c/84293] Unexpected strict-alias warning

2018-02-09 Thread Joseph Myers

On Fri, 9 Feb 2018, Nathan Sidwell wrote:

> Joseph, are the C bits ok?

Yes.

-- 
Joseph S. Myers
jos...@codesourcery.com

[PATCH] cortex-a57-fma-steering.c fixes (PR target/84272)

2018-02-09 Thread Jakub Jelinek

Hi!

As mentioned in the PR, in the cortex-a57-fma-steering.c optimization pass
we may use fma_node/fma_root_node data after they were deleted and leak
leaf nodes because of an early continue; (the loop with
to_process.safe_push (*child_iter); in the body will do nothing if list
is empty, but freeing it is important).

The following patch fixes it, Thomas as the pass author looked at the patch
and provided feedback in the PR and Kyrill tested it on aarch64-linux.

Ok for trunk?

2018-02-09  Jakub Jelinek  

PR target/84272
* config/aarch64/cortex-a57-fma-steering.c (fma_forest::merge_forest):
Use ++iter rather than iter++ for std::list iterators.
(func_fma_steering::dfs): Likewise.  Don't delete nodes right away,
defer deleting them until all nodes in the forest are processed.  Do
free even leaf nodes.  Change to_process into auto_vec.

* g++.dg/opt/pr84272.C: New test.

--- gcc/config/aarch64/cortex-a57-fma-steering.c.jj 2018-02-09 
06:44:24.990811997 +0100
+++ gcc/config/aarch64/cortex-a57-fma-steering.c2018-02-09 
13:03:51.741447714 +0100
@@ -406,7 +406,7 @@ fma_forest::merge_forest (fma_forest *ot
 
   /* Update root nodes' pointer to forest.  */
   for (other_root_iter = other_roots->begin ();
-   other_root_iter != other_roots->end (); other_root_iter++)
+   other_root_iter != other_roots->end (); ++other_root_iter)
 (*other_root_iter)->set_forest (this);
 
   /* Remove other_forest from the list of forests and move its tree roots in
@@ -847,14 +847,13 @@ func_fma_steering::dfs (void (*process_f
void (*process_node) (fma_forest *, fma_node *),
bool free)
 {
-  vec to_process;
+  auto_vec to_process;
+  auto_vec to_free;
   std::list::iterator forest_iter;
 
-  to_process.create (0);
-
   /* For each forest.  */
   for (forest_iter = this->m_fma_forests.begin ();
-   forest_iter != this->m_fma_forests.end (); forest_iter++)
+   forest_iter != this->m_fma_forests.end (); ++forest_iter)
 {
   std::list::iterator root_iter;
 
@@ -863,7 +862,7 @@ func_fma_steering::dfs (void (*process_f
 
   /* For each tree root in this forest.  */
   for (root_iter = (*forest_iter)->get_roots ()->begin ();
-  root_iter != (*forest_iter)->get_roots ()->end (); root_iter++)
+  root_iter != (*forest_iter)->get_roots ()->end (); ++root_iter)
{
  if (process_root)
process_root (*forest_iter, *root_iter);
@@ -881,28 +880,30 @@ func_fma_steering::dfs (void (*process_f
  if (process_node)
process_node (*forest_iter, node);
 
- /* Absence of children might indicate an alternate root of a *chain*.
-It's ok to skip it here as the chain will be renamed when
-processing the canonical root for that chain.  */
- if (node->get_children ()->empty ())
-   continue;
-
  for (child_iter = node->get_children ()->begin ();
-  child_iter != node->get_children ()->end (); child_iter++)
+  child_iter != node->get_children ()->end (); ++child_iter)
to_process.safe_push (*child_iter);
+
+ /* Defer freeing so that the process_node callback can access the
+parent and children of the node being processed.  */
  if (free)
+   to_free.safe_push (node);
+   }
+
+  if (free)
+   {
+ delete *forest_iter;
+
+ while (!to_free.is_empty ())
{
+ fma_node *node = to_free.pop ();
  if (node->root_p ())
delete static_cast (node);
  else
delete node;
}
}
-  if (free)
-   delete *forest_iter;
 }
-
-  to_process.release ();
 }
 
 /* Build the dependency trees of FMUL and FMADD/FMSUB instructions.  */
--- gcc/testsuite/g++.dg/opt/pr84272.C.jj   2018-02-09 12:59:28.215636324 
+0100
+++ gcc/testsuite/g++.dg/opt/pr84272.C  2018-02-09 12:59:28.215636324 +0100
@@ -0,0 +1,23 @@
+// PR target/84272
+// { dg-do compile }
+// { dg-options "-O2" }
+// { dg-additional-options "-march=armv8-a -mtune=cortex-a57" { target 
aarch64-*-* } }
+
+struct A
+{
+  float b, c;
+  A ();
+  A (float, float, float);
+  float operator * (A)
+  {
+float d = b * b + c * c;
+return d;
+  }
+};
+
+void
+foo ()
+{
+  A g[1];
+  A h (0, 0, h * g[2]);
+}

Jakub

Re: [PATCH] Improve dead code elimination with -fsanitize=address (PR84307)

2018-02-09 Thread Paolo Bonzini

On 09/02/2018 17:40, Richard Biener wrote:
> On February 9, 2018 5:08:24 PM GMT+01:00, Paolo Bonzini  
> wrote:
>> Another possibility which I considered but did not implement is to mark
>> the UNPOISON calls so that they do not cause the parameter to escape.
> 
> I'd do this, thus assign proper fnspec attributes to the asan functions. 

Hmm, actually that might be as simple as fixing a typo:

diff --git a/gcc/internal-fn.def b/gcc/internal-fn.def
index 5970d0e..15d6151 100644
--- a/gcc/internal-fn.def
+++ b/gcc/internal-fn.def
@@ -255,8 +255,8 @@ DEF_INTERNAL_FN (UBSAN_PTR, ECF_LEAF | ECF_NOTHROW, ".R.")
 DEF_INTERNAL_FN (UBSAN_OBJECT_SIZE, ECF_LEAF | ECF_NOTHROW, NULL)
 DEF_INTERNAL_FN (ABNORMAL_DISPATCHER, ECF_NORETURN, NULL)
 DEF_INTERNAL_FN (BUILTIN_EXPECT, ECF_CONST | ECF_LEAF | ECF_NOTHROW, NULL)
-DEF_INTERNAL_FN (ASAN_CHECK, ECF_TM_PURE | ECF_LEAF | ECF_NOTHROW, ".R...")
-DEF_INTERNAL_FN (ASAN_MARK, ECF_LEAF | ECF_NOTHROW, ".R..")
+DEF_INTERNAL_FN (ASAN_CHECK, ECF_TM_PURE | ECF_LEAF | ECF_NOTHROW, "..R..")
+DEF_INTERNAL_FN (ASAN_MARK, ECF_LEAF | ECF_NOTHROW, "..W.")
 DEF_INTERNAL_FN (ASAN_POISON, ECF_LEAF | ECF_NOTHROW | ECF_NOVOPS, NULL)
 DEF_INTERNAL_FN (ASAN_POISON_USE, ECF_LEAF | ECF_NOTHROW | ECF_NOVOPS, NULL)
 DEF_INTERNAL_FN (ADD_OVERFLOW, ECF_CONST | ECF_LEAF | ECF_NOTHROW, NULL)

which indeed fixes the testcase and seems not to break asan.exp.
'W' is needed to avoid breaking the pr78541.c testcase, and I think it
makes sense since ASAN_MARK is "writing" the state of the object
(in the test case FRE moves a dereference across a poisoning).

I'll look at it next week.  Someone maybe should take a look at ubsan
fnspecs too.

Paolo

Re: Please accept this commit for the trunk

2018-02-09 Thread Segher Boessenkool

Hi Mike,

On Thu, Feb 08, 2018 at 06:06:05PM -0800, Mike Stump wrote:
> On Feb 8, 2018, at 12:36 PM, Segher Boessenkool  
> wrote:
> > 
> > On Wed, Feb 07, 2018 at 03:52:27PM -0800, Mike Stump wrote:
> >> I dusted the pointed to patch off and check it in.  Let us know how it 
> >> goes.
> > 
> > I wanted to test this on the primary and secondary powerpc targets as
> > well, but okay.
> 
> I reviewed it, and it seemed to only trigger for darwin.  Certainly doesn't 
> hurt to run a regression run and ensure that is the case.

We'll find out if it regresses :-)

> >> Does this resolve all of PR84113?  If so, I can push the bug along.
> > 
> > It makes bootstrap work.  We don't know if it is correct otherwise.
> 
> So, would be nice if someone could run a regression test.  I'd do it by the 
> version just before the breakage, and then drop in the patch, and test again. 
>  This minimizes all the other changes.

Douglas has done a test (C family languages only it seems) at
https://gcc.gnu.org/ml/gcc-testresults/2018-02/msg00374.html
but it is hard to compare to previous results (not the same config,
long ago, etc.)

> >> What PR was the attachment url from?
> > 
> > It is not from a PR, and it has never been sent to gcc-patches; it is
> > from https://gcc.gnu.org/ml/gcc-testresults/2017-01/msg02971.html
> > (attachment #2).
> 
> Ah, that explains it.
> 
> Sounds like 1, 3 and 4 also likely need to go it to make things nice.  If 
> someone could regression test and let us know, that's likely the gating 
> factor.

Yeah, it's not easy to accept those patches without proper regression
testing, it's stage 4...  But the patches probably are good.


Segher

[Patch, Fortran] PR 84273: Reject allocatable passed-object dummy argument (proc_ptr_47.f90)

2018-02-09 Thread Janus Weil

Hi all,

the attached patch fixes some checking code for PASS arguments in
procedure-pointer components, which does not properly account for the
fact that the PASS argument needs to be polymorphic.

[The reason for this issue is probably that PPCs were mostly
implemented before polymorphism was available. The corresponding
pass-arg checks for TBPs are ok.]

The patch also fixes an invalid test case (which was detected thanks
to Neil Carlson). It regtests cleanly on x86_64-linux-gnu. Ok for
trunk?

Cheers,
Janus



2018-02-09  Janus Weil  

PR fortran/84273
* resolve.c (resolve_component): Fix checks of passed argument in
procedure-pointer components.


2018-02-09  Janus Weil  

PR fortran/84273
* gfortran.dg/proc_ptr_47.f90: Fix invalid test case.
* gfortran.dg/proc_ptr_comp_pass_4.f90: Fix and extend test case.
Index: gcc/fortran/resolve.c
===
--- gcc/fortran/resolve.c   (revision 257498)
+++ gcc/fortran/resolve.c   (working copy)
@@ -13703,8 +13703,8 @@ resolve_component (gfc_component *c, gfc_symbol *s
   return false;
 }
 
-  /* Check for C453.  */
-  if (me_arg->attr.dimension)
+  /* Check for F03:C453.  */
+  if (CLASS_DATA (me_arg)->attr.dimension)
 {
   gfc_error ("Argument %qs of %qs with PASS(%s) at %L "
  "must be scalar", me_arg->name, c->name, me_arg->name,
@@ -13713,7 +13713,7 @@ resolve_component (gfc_component *c, gfc_symbol *s
   return false;
 }
 
-  if (me_arg->attr.pointer)
+  if (CLASS_DATA (me_arg)->attr.class_pointer)
 {
   gfc_error ("Argument %qs of %qs with PASS(%s) at %L "
  "may not have the POINTER attribute", me_arg->name,
@@ -13722,7 +13722,7 @@ resolve_component (gfc_component *c, gfc_symbol *s
   return false;
 }
 
-  if (me_arg->attr.allocatable)
+  if (CLASS_DATA (me_arg)->attr.allocatable)
 {
   gfc_error ("Argument %qs of %qs with PASS(%s) at %L "
  "may not be ALLOCATABLE", me_arg->name, c->name,
Index: gcc/testsuite/gfortran.dg/proc_ptr_47.f90
===
--- gcc/testsuite/gfortran.dg/proc_ptr_47.f90   (revision 257498)
+++ gcc/testsuite/gfortran.dg/proc_ptr_47.f90   (working copy)
@@ -21,13 +21,9 @@
 
 contains
   function foo(A)
-class(AA), allocatable :: A
+class(AA) :: A
 type(AA) foo
 
-if (.not.allocated (A)) then
-  allocate (A, source = AA (2, foo))
-endif
-
 select type (A)
   type is (AA)
 foo = AA (3, foo)
Index: gcc/testsuite/gfortran.dg/proc_ptr_comp_pass_4.f90
===
--- gcc/testsuite/gfortran.dg/proc_ptr_comp_pass_4.f90  (revision 257498)
+++ gcc/testsuite/gfortran.dg/proc_ptr_comp_pass_4.f90  (working copy)
@@ -37,22 +37,23 @@ module m
 
  type :: t8
procedure(foo8), pass, pointer :: f8  ! { dg-error "must be of the derived 
type" }
+   procedure(foo9), pass, pointer :: f9  ! { dg-error "Non-polymorphic 
passed-object dummy argument" }
  end type
 
 contains
 
  subroutine foo1 (x1,y1)
-  type(t1) :: x1(:)
+  class(t1) :: x1(:)
   type(t1) :: y1
  end subroutine
 
  subroutine foo2 (x2,y2)
-  type(t2),pointer :: x2
+  class(t2),pointer :: x2
   type(t2) :: y2
  end subroutine
 
  subroutine foo3 (x3,y3)
-  type(t3),allocatable :: x3
+  class(t3),allocatable :: x3
   type(t3) :: y3
  end subroutine
 
@@ -69,4 +70,8 @@ contains
integer :: i
  end function
 
+ subroutine foo9(x)
+   type(t8) :: x
+ end subroutine
+
 end module m

Re: [PATCH, rs6000] Fix PR83926, ICE using __builtin_vsx_{div,udiv,mul}_2di builtins

2018-02-09 Thread Peter Bergner

On 2/9/18 10:51 AM, Segher Boessenkool wrote:
>> +/* { dg-final { scan-assembler-times {\mvclzb\M} 2 } } */
>> +/* { dg-final { scan-assembler-times {\mvclzb\M} 2 } } */

This was a typo in the original test case I didn't notice.  Looking closer,
they meant one of these to be vclzd.  I'll fix that, thanks.

Peter

Re: [PATCH] Adjust PR84278 testcase

2018-02-09 Thread Richard Sandiford

Richard Biener  writes:
> Got quoting of parens wrong.
>
> Tested on x86_64-unknown-linux-gnu, applied.
>
> Richard.
>
> 2018-02-09  Richard Biener  
>
>   PR tree-optimization/84278
>   * gcc.target/i386/pr84278.c: Adjust regex.
>
> Index: gcc/testsuite/gcc.target/i386/pr84278.c
> ===
> --- gcc/testsuite/gcc.target/i386/pr84278.c   (revision 257491)
> +++ gcc/testsuite/gcc.target/i386/pr84278.c   (working copy)
> @@ -15,4 +15,4 @@ void foo(void)
>  }
>  }
>  
> -/* { dg-final { scan-assembler-not "\(%.sp\)" } } */
> +/* { dg-final { scan-assembler-not "\\\(%.sp\\\)" } } */

FWIW:

/* { dg-final { scan-assembler-not {\(%.sp\)} } } */

works too and has saner escaping rules.

Richard

Re: [PATCH, rs6000] Fix PR83926, ICE using __builtin_vsx_{div,udiv,mul}_2di builtins

2018-02-09 Thread Segher Boessenkool

Hi!

On Thu, Feb 08, 2018 at 12:28:05PM -0600, Peter Bergner wrote:
> On 2/8/18 10:38 AM, Peter Bergner wrote:
> > * gcc.target/powerpc/builtins-1-be.c: Filter out gimple folding disabled
> > message.  Fix test for running in 32-bit mode.
> 
> As we talked about offline, here's a bigger change to builtins-1-be.c that
> cleans up the test a little more, since we generate xxlor in more cases
> than just the __builtin_vec_or() call, so this change adds the -dp option
> and we match the pattern name to verify we are getting as many as we expect
> from that and that alone.  This also splits the xxland and xxlandc into
> their own matches, which match the source test cases use of vec_and() and
> vec_andc().

> +/* { dg-final { scan-assembler-times {\mvclzb\M} 2 } } */
> +/* { dg-final { scan-assembler-times {\mvclzb\M} 2 } } */

Duplicate line.

Fine otherwise, thanks!


Segher

Re: [PATCH] Improve dead code elimination with -fsanitize=address (PR84307)

2018-02-09 Thread Richard Biener

On February 9, 2018 5:08:24 PM GMT+01:00, Paolo Bonzini  wrote:
>Hi all,
>
>in this PR, a dead reference to a function pointer cannot be optimized
>out by the compiler because some ASAN_MARK UNPOISON calls, which are
>placed before the store, cause the containing struct to escape.
>(Without -fsanitize=address, the dead code is eliminated by the first
>DSE pass).
>
>The fix, which works at least for this testcase, is to copy part of the
>sanopt dead code elimination pass early, so that the compiler can see
>fewer UNPOISON calls.  I am not sure this is general enough, due to
>the very limited data flow analysis done by
>sanitize_asan_mark_unpoison.
>Another possibility which I considered but did not implement is to mark
>the UNPOISON calls so that they do not cause the parameter to escape.

I'd do this, thus assign proper fnspec attributes to the asan functions. 

Richard. 

>Any suggestions on how to proceed here (or approval is welcome too :))?
>
>Paolo
>
>2018-02-09  Paolo Bonzini  
>
>   * passes.def: add pass_sandce before first DSE pass.
>   * sanopt.c (pass_data_sandce, pass_sandce, make_pass_sandce): New.
>   * tree-pass.h (make_pass_sandce): Declare it.
>
>testsuite:
>2018-02-09  Paolo Bonzini  
>
>   PR sanitizer/84307
>   * gcc.dg/asan/pr84307.c: New test.
>
>diff --git a/gcc/passes.def b/gcc/passes.def
>index 9802f08..180df50 100644
>--- a/gcc/passes.def
>+++ b/gcc/passes.def
>@@ -244,6 +244,7 @@ along with GCC; see the file COPYING3.  If not see
>only examines PHIs to discover const/copy propagation
>opportunities.  */
>   NEXT_PASS (pass_phi_only_cprop);
>+  NEXT_PASS (pass_sandce);
>   NEXT_PASS (pass_dse);
>   NEXT_PASS (pass_reassoc, true /* insert_powi_p */);
>   NEXT_PASS (pass_dce);
>diff --git a/gcc/sanopt.c b/gcc/sanopt.c
>index cd94638..23b8e6e 100644
>--- a/gcc/sanopt.c
>+++ b/gcc/sanopt.c
>@@ -906,6 +906,32 @@ sanopt_optimize (function *fun, bool
>*contains_asan_mark)
> 
> namespace {
> 
>+const pass_data pass_data_sandce =
>+{
>+  GIMPLE_PASS, /* type */
>+  "sandce", /* name */
>+  OPTGROUP_NONE, /* optinfo_flags */
>+  TV_NONE, /* tv_id */
>+  ( PROP_ssa | PROP_cfg | PROP_gimple_leh ), /* properties_required */
>+  0, /* properties_provided */
>+  0, /* properties_destroyed */
>+  0, /* todo_flags_start */
>+  TODO_update_ssa, /* todo_flags_finish */
>+};
>+
>+class pass_sandce : public gimple_opt_pass
>+{
>+public:
>+  pass_sandce (gcc::context *ctxt)
>+: gimple_opt_pass (pass_data_sandce, ctxt)
>+  {}
>+
>+  /* opt_pass methods: */
>+  virtual bool gate (function *) { return flag_sanitize &
>SANITIZE_ADDRESS; }
>+  virtual unsigned int execute (function *);
>+
>+}; // class pass_sanopt
>+
> const pass_data pass_data_sanopt =
> {
>   GIMPLE_PASS, /* type */
>@@ -1244,6 +1270,31 @@ sanitize_rewrite_addressable_params (function
>*fun)
> }
> 
> unsigned int
>+pass_sandce::execute (function *fun)
>+{
>+  basic_block bb;
>+  bool contains_asan_mark = false;
>+
>+  /* Try to remove redundant unpoisoning.  */
>+  gimple_stmt_iterator gsi;
>+  FOR_EACH_BB_FN (bb, fun)
>+for (gsi = gsi_start_bb (bb); !gsi_end_p (gsi); gsi_next (&gsi))
>+  {
>+  gimple *stmt = gsi_stmt (gsi);
>+  if (gimple_call_internal_p (stmt, IFN_ASAN_MARK))
>+{
>+  contains_asan_mark = true;
>+  break;
>+}
>+  }
>+
>+  if (contains_asan_mark)
>+sanitize_asan_mark_unpoison ();
>+
>+  return 0;
>+}
>+
>+unsigned int
> pass_sanopt::execute (function *fun)
> {
>   basic_block bb;
>@@ -1367,6 +1418,12 @@ pass_sanopt::execute (function *fun)
> } // anon namespace
> 
> gimple_opt_pass *
>+make_pass_sandce (gcc::context *ctxt)
>+{
>+  return new pass_sandce (ctxt);
>+}
>+
>+gimple_opt_pass *
> make_pass_sanopt (gcc::context *ctxt)
> {
>   return new pass_sanopt (ctxt);
>diff --git a/gcc/tree-pass.h b/gcc/tree-pass.h
>index 93a6a99..d5eb055 100644
>--- a/gcc/tree-pass.h
>+++ b/gcc/tree-pass.h
>@@ -350,6 +350,7 @@ extern gimple_opt_pass *make_pass_tsan
>(gcc::context *ctxt);
> extern gimple_opt_pass *make_pass_tsan_O0 (gcc::context *ctxt);
> extern gimple_opt_pass *make_pass_sancov (gcc::context *ctxt);
> extern gimple_opt_pass *make_pass_sancov_O0 (gcc::context *ctxt);
>+extern gimple_opt_pass *make_pass_sandce (gcc::context *ctxt);
> extern gimple_opt_pass *make_pass_lower_cf (gcc::context *ctxt);
> extern gimple_opt_pass *make_pass_refactor_eh (gcc::context *ctxt);
> extern gimple_opt_pass *make_pass_lower_eh (gcc::context *ctxt);
>diff --git a/gcc/testsuite/gcc.dg/asan/pr84307.c
>b/gcc/testsuite/gcc.dg/asan/pr84307.c
>new file mode 100644
>index 000..6e1a197
>--- /dev/null
>+++ b/gcc/testsuite/gcc.dg/asan/pr84307.c
>@@ -0,0 +1,21 @@
>+/* PR middle-end/83185 */
>+/* { dg-do link } */
>+/* { dg-options "-O1" } */
>+
>+struct f {
>+  void (*func)(void);
>+};
>+
>+extern void link_error(void);
>+extern int printf(const char *f, ...);
>+
>+static inline struct f *gimme_null(struct f *result)
>+{
>+  return 0;
>

Re: [PATCH, rs6000] Fix PR83926, ICE using __builtin_vsx_{div,udiv,mul}_2di builtins

2018-02-09 Thread Peter Bergner

On 2/9/18 10:17 AM, Segher Boessenkool wrote:
> On Thu, Feb 08, 2018 at 10:38:09AM -0600, Peter Bergner wrote:
>> -/* { dg-final { scan-assembler-times "divd" 4 } } */
> 
>> +/* { dg-final { scan-assembler-times {\mdivd\M} 2 { target lp64 } } } */
> 
>> +/* { dg-final { scan-assembler-times {\mbl __divdi3\M} 2 { target ilp32 } } 
>> } */
> 
> Why does divd change from 4 to 2 times?  Is it changed to muls?

No, it's because "divd" was matching both divd and divdu insns, so it was
overcounting before.  Now that we use \m and \M word delimiters, we don't.
There are only 2 vec_div() and 2 vec_udiv() calls, so we should only expect
2 for both.

> Okay for trunk.  Thanks!

Thanks, committed.

Peter

Re: PATCH to fix bogus warning with -Wstringop-truncation -g (PR tree-optimization/84228)

2018-02-09 Thread Martin Sebor


On 02/09/2018 03:54 AM, Jakub Jelinek wrote:

On Fri, Feb 09, 2018 at 11:40:29AM +0100, Richard Biener wrote:

I.e., having to track all pointers to d between the call to
strncpy and the assignment of the nul and make sure none of
them ends up used in a string function.  It didn't seem
the additional complexity would have been worth the effort
(or the likely false negatives).


Well, I'd just walk immediate uses of the VDEF of the
strncpy call, not of the pointer argument.  There will be exactly _one_ possible
store (gimple_vdef () is non-NULL) that you need to verify (with using
the current matching
logic).  But it'll skip non-store statements for you.


Well, it should also punt on the immediate uses of the VDEF that
have NULL gimple_vdef and the alias oracle says that might alias with that,
i.e. warn about say
  strncpy (p, ...);
  foo (p);
  p[whatever] = '\0';
where foo is pure, because it might read the unterminated string, but don't
warn about
  strncpy (p, ...);
  x = *q;
  p[whatever] = '\0';
if q[0] can't alias with p.

Or just warn if there are any immediate uses of the strncpy VDEF that have
gimple_vdef NULL, or non-NULL and aren't the zero store you are looking for.


Thank you both for the suggestions.  I'll try to remember
to revisit this in stage 1, if only to get more experience
with this sort of thing (gimple_vdef etc.)

Martin

[PATCH rs6000] Fix for builtins-4-runnable.c testcase FAIL on power7/BE 32-bit

2018-02-09 Thread Carl Love

GCC maintainers:

The following patch contains fixes for the builtins4-runnable.c test in
32-bit mode for the current GCC 8 branch.  This is issue #290 in Bill
Schmidt's issues.  The int128 and uint128 variable types are not
supported in 32-bit mode.  The tests were moved to a new test file and
restricted to run only when int128 type is supported.

The patch passes the regression testing on Power 8 BE enabled for 64-
bit and 32-bit modes.  Additionally, the patch passes regression
testing on Power 8 LE.

Let me know if the patch looks OK or not. 

   Carl Love
 

gcc/testsuite/ChangeLog:

2018-02-09  Carl Love  

* gcc.target/powerpc/builtins-4-runnable.c (main): Move 
int128 and
uint128 tests to new testfile.
* gcc.target/powerpc/builtins-4-int128-runnable.c: New testfile
for
int128 and uint128 tests.
* gcc.target/powerpc/powerpc.exp: Add builtins-4-int128-
runnable.c to
list oftorture tests.
---
 .../powerpc/builtins-4-int128-runnable.c   | 108
+
 .../gcc.target/powerpc/builtins-4-runnable.c   |  84 ---
-
 gcc/testsuite/gcc.target/powerpc/powerpc.exp   |   1 +
 3 files changed, 109 insertions(+), 84 deletions(-)
 create mode 100644 gcc/testsuite/gcc.target/powerpc/builtins-4-int128-
runnable.c

diff --git a/gcc/testsuite/gcc.target/powerpc/builtins-4-int128-
runnable.c b/gcc/testsuite/gcc.target/powerpc/builtins-4-int128-
runnable.c
new file mode 100644
index 000..a6973f7
--- /dev/null
+++ b/gcc/testsuite/gcc.target/powerpc/builtins-4-int128-runnable.c
@@ -0,0 +1,108 @@
+/* { dg-do run } */
+/* { dg-require-effective-target vsx_hw } */
+/* { dg-options "-maltivec -mvsx" } */
+
+#include 
+#include  // vector
+
+#ifdef DEBUG
+#include 
+#endif
+
+void abort (void);
+
+int main() {
+  int i;
+  __uint128_t data_u128[100];
+  __int128_t data_128[100];
+
+  vector __int128_t vec_128_expected1, vec_128_result1;
+  vector __uint128_t vec_u128_expected1, vec_u128_result1;
+  signed long long zero = (signed long long) 0;
+
+  for (i = 0; i < 100; i++)
+{
+  data_128[i] = i + 1280;
+  data_u128[i] = i + 1281;
+}
+
+  /* vec_xl() tests */
+
+  vec_128_expected1 = (vector __int128_t){1280};
+  vec_128_result1 = vec_xl (zero, data_128);
+
+  if (vec_128_expected1[0] != vec_128_result1[0])
+{
+#ifdef DEBUG
+   printf("Error: vec_xl(), vec_128_result1[0] = %lld %llu; ",
+      vec_128_result1[0] >> 64,
+      vec_128_result1[0] & (__int128_t)0x);
+   printf("vec_128_expected1[0] = %lld %llu\n",
+      vec_128_expected1[0] >> 64,
+      vec_128_expected1[0] & (__int128_t)0x);
+#else
+   abort ();
+#endif
+}
+
+  vec_u128_result1 = vec_xl (zero, data_u128);
+  vec_u128_expected1 = (vector __uint128_t){1281};
+  if (vec_u128_expected1[0] != vec_u128_result1[0])
+{
+#ifdef DEBUG
+   printf("Error: vec_xl(), vec_u128_result1[0] = %lld; ",
+      vec_u128_result1[0] >> 64,
+      vec_u128_result1[0] & (__int128_t)0x);
+   printf("vec_u128_expected1[0] = %lld\n",
+      vec_u128_expected1[0] >> 64,
+      vec_u128_expected1[0] &
(__int128_t)0x);
+#else
+   abort ();
+#endif
+}
+
+  /* vec_xl_be() tests */
+
+  vec_128_result1 = vec_xl_be (zero, data_128);
+#ifdef __BIG_ENDIAN__
+  vec_128_expected1 = (vector __int128_t){ (__int128_t)1280 };
+#else
+  vec_128_expected1 = (vector __int128_t){ (__int128_t)1280 };
+#endif
+
+  if (vec_128_expected1[0] != vec_128_result1[0])
+{
+#ifdef DEBUG
+   printf("Error: vec_xl_be(), vec_128_result1[0] = %llu %llu;",
+      vec_128_result1[0] >> 64,
+      vec_128_result1[0] & 0x);
+   printf(" vec_128_expected1[0] = %llu %llu\n",
+      vec_128_expected1[0] >> 64,
+      vec_128_expected1[0] & 0x);
+#else
+  abort ();
+#endif
+}
+
+#ifdef __BIG_ENDIAN__
+  vec_u128_expected1 = (vector __uint128_t){ (__uint128_t)1281 };
+#else
+  vec_u128_expected1 = (vector __uint128_t){ (__uint128_t)1281 };
+#endif
+
+  vec_u128_result1 = vec_xl_be (zero, data_u128);
+
+  if (vec_u128_expected1[0] != vec_u128_result1[0])
+{
+#ifdef DEBUG
+   printf("Error: vec_xl_be(), vec_u128_result1[0] = %llu %llu;",
+      vec_u128_result1[0] >> 64,
+      vec_u128_result1[0] & 0x);
+   printf(" vec_u128_expected1[0] = %llu %llu\n",
+      vec_u128_expected1[0] >> 64,
+      vec_u128_expected1[0] & 0x);
+#else
+  abort ();
+#endif
+}
+}
diff --git a/gcc/testsuite/gcc.target/powerpc/builtins-4-runnable.c
b/gcc/testsuite/gcc.target/powerpc/builtins-4-runnable.c
index de9b916..35884b5 100644
--- a/gcc/testsuite/gcc.target/powerpc/builtins-4-runnable.c
+++ b/gcc/testsuite/gcc.target/powerpc/builtins-4-runnable.c
@@ -27,9 +27,6

Re: [PATCH, rs6000] Fix PR83926, ICE using __builtin_vsx_{div,udiv,mul}_2di builtins

2018-02-09 Thread Segher Boessenkool

On Thu, Feb 08, 2018 at 10:38:09AM -0600, Peter Bergner wrote:
> On 2/6/18 10:36 AM, Peter Bergner wrote:
> > On 2/6/18 10:20 AM, David Edelsohn wrote:
> >> Do the gen_XXXdi3 calls work if you use SDI iterator instead of GPR
> >> iterator, as Segher suggested?
> > 
> > Well it works _if_ we use the first patch that changes the gen_*
> > patterns.  If we go this route, I agree we should use the SDI
> > iterator instead of GPR.
> 
> Actually, my bad.  While bootstrapping this on a BE system, we get an
> error when we attempt a 64-bit multiply in 32-bit mode.  In this case,
> the gen_muldi3() pattern calls expand_mult(DImode, ...) and the automatic
> expand machinery notices the gen_muldi3() now allows DImode in the
> !TARGET_POWERPC64 case and then calls gen_muldi3() to emit the multiply
> and we go into infinite recursion.  We don't have that problem in the
> div/udiv case, because we call out to the lib routines, so no recursion.
> Given this, I think we should probably go with the patch that modifies
> vsx.md and guards the calls to gen_{div,udiv,mul}di3() with a TARGET_POWERPC64
> test.

Okay, let's go with this, at least for now.  Thanks for investigating.

> -/* { dg-final { scan-assembler-times "divd" 4 } } */

> +/* { dg-final { scan-assembler-times {\mdivd\M} 2 { target lp64 } } } */

> +/* { dg-final { scan-assembler-times {\mbl __divdi3\M} 2 { target ilp32 } } 
> } */

Why does divd change from 4 to 2 times?  Is it changed to muls?

Okay for trunk.  Thanks!


Segher

Re: [PATCH, rs6000] PR84220 fix altivec_vec_sld and vec_sldw intrinsic definitions

2018-02-09 Thread Will Schmidt

On Thu, 2018-02-08 at 17:48 -0600, Segher Boessenkool wrote:
> Hi!
> 
> On Wed, Feb 07, 2018 at 09:14:59AM -0600, Will Schmidt wrote:
> >   Our VEC_SLD definitions were mistakenly allowing the third argument to be
> > of an invalid type, triggering an ICE (on invalid code) later in the build
> > process.  This fixes those definitions.  The nearby VEC_SLDW definitions 
> > have
> > the same issue, those have been fixed as part of this patch too.
> > Testcases have been added to ensure we generate the 'invalid intrinsic'
> > message as is appropriate, instead of ICEing.
> > Giving proper credit, this was found by Peter Bergner while working a
> > different issue. :-)
> > 
> > Sniff-tests passed on P8.  Doing larger reg-test across power systems now.
> > OK for trunk?
> > And,.. do we want this one backported too?
> 
> > diff --git a/gcc/config/rs6000/rs6000-c.c b/gcc/config/rs6000/rs6000-c.c
> > index a68be51..26f9990 100644
> > --- a/gcc/config/rs6000/rs6000-c.c
> > +++ b/gcc/config/rs6000/rs6000-c.c
> > @@ -3654,39 +3654,39 @@ const struct altivec_builtin_types 
> > altivec_overloaded_builtins[] = {
> >{ ALTIVEC_BUILTIN_VEC_SEL, ALTIVEC_BUILTIN_VSEL_16QI,
> >  RS6000_BTI_bool_V16QI, RS6000_BTI_bool_V16QI, RS6000_BTI_bool_V16QI, 
> > RS6000_BTI_bool_V16QI },
> >{ ALTIVEC_BUILTIN_VEC_SEL, ALTIVEC_BUILTIN_VSEL_16QI,
> >  RS6000_BTI_bool_V16QI, RS6000_BTI_bool_V16QI, RS6000_BTI_bool_V16QI, 
> > RS6000_BTI_unsigned_V16QI },
> >{ ALTIVEC_BUILTIN_VEC_SLD, ALTIVEC_BUILTIN_VSLDOI_4SF,
> > -RS6000_BTI_V4SF, RS6000_BTI_V4SF, RS6000_BTI_V4SF, 
> > RS6000_BTI_NOT_OPAQUE },
> > +RS6000_BTI_V4SF, RS6000_BTI_V4SF, RS6000_BTI_V4SF, RS6000_BTI_INTSI },
> 
> It isn't clear to me what RS6000_BTI_NOT_OPAQUE means...  rs6000-c.c says:
> 
> /* For arguments after the last, we have RS6000_BTI_NOT_OPAQUE in
>the opX fields.  */
> 
> (whatever that means!), and the following code seems to allow anything in
> such args?  If you understand it, please update some comments somewhere?

I do not...  only got as far as figuring out it wasn't right for
vec_sld*().
I'll poke around a bit to see what I can figure out.  May need to punt
to (???) to understand the intent.Mike?/Peter?

A bit more context around that usage is:
[rs6000-c.c: altivec_resolve_overloaded_builtin() ]
...
/* For arguments after the last, we have RS6000_BTI_NOT_OPAQUE in
   the opX fields.  */
for (; desc->code == fcode; desc++)
  {
if ((desc->op1 == RS6000_BTI_NOT_OPAQUE
 || rs6000_builtin_type_compatible (types[0], desc->op1))
...
with that check repeated against types[1], desc->op2,.. etc.

> >{ VSX_BUILTIN_VEC_XXPERMDI, VSX_BUILTIN_XXPERMDI_2DF,

> XXPERMDI is the only other builtin that uses NOT_OPAQUE, does that suffer
> from the same problem?  If so, you can completely delete NOT_OPAQUE it
> seems?

Not sure.

> So what is/was it for, that is what I wonder.
> 
> Your patch looks fine if you can clear that up :-)
Heh, Ok.

> 
> 
> Segher
>

[PATCH] Character length cleanup for Coarray Fortran library

2018-02-09 Thread Janne Blomqvist

Following the change to use size_t for Fortran character lengths (PR
78534), this patch modifies the Coarray ABI in a similar way. The
single-image implementation that is included in libgfortran is
updated, but this needs corresponding work in the OpenCoarray library
as well for multi-image support. Damian, can you look into that?

Regtested on x86_64-pc-linux-gnu, Ok for trunk?

gcc/fortran/ChangeLog:

2018-02-09  Janne Blomqvist  

* gfortran.texi: Update Coarray API description.
* trans-decl.c (gfc_build_builtin_function_decls): Use size_t for
character lengths, int for exit codes.
(generate_coarray_sym_init): Use size_t for character length.
* trans-intrinsic.c (conv_co_collective): Likewise.
* trans-stmt.c (gfc_trans_lock_unlock): Likewise.
(gfc_trans_event_post_wait): Likewise.
(gfc_trans_sync): Likewise.

libgfortran/ChangeLog:

2018-02-09  Janne Blomqvist  

* caf/libcaf.h: Remove stdint.h include.
(_gfortran_caf_register): Use size_t for character length.
(_gfortran_caf_deregister): Likewise.
(_gfortran_caf_sync_all): Likewise.
(_gfortran_caf_sync_memory): Likewise.
(_gfortran_caf_sync_images): Likewise.
(_gfortran_caf_stop_numeric): Use int for exit code.
(_gfortran_caf_stop_str): Use size_t for character length.
(_gfortran_caf_error_stop_str): Likewise.
(_gfortran_caf_error_stop): Use int for exit code.
(_gfortran_caf_co_broadcast): Use size_t for character length.
(_gfortran_caf_co_sum): Likewise.
(_gfortran_caf_co_min): Likewise.
(_gfortran_caf_co_max): Likewise.
(_gfortran_caf_co_reduce): Likewise.
(_gfortran_caf_lock): Likewise.
(_gfortran_caf_unlock): Likewise.
(_gfortran_caf_event_post): Likewise.
(_gfortran_caf_event_wait): Likewise.
* caf/mpi.c (_gfortran_caf_register): Update implementation to
match prototype.
(_gfortran_caf_deregister): Likewise.
(_gfortran_caf_sync_all): Likewise.
(_gfortran_caf_sync_images): Likewise.
(_gfortran_caf_error_stop_str): Likewise.
(_gfortran_caf_error_stop): Likewise.
* caf/single.c (caf_internal_error): Likewise.
(_gfortran_caf_register): Likewise.
(_gfortran_caf_deregister): Likewise.
(_gfortran_caf_sync_all): Likewise.
(_gfortran_caf_sync_memory): Likewise.
(_gfortran_caf_sync_images): Likewise.
(_gfortran_caf_stop_numeric): Likewise.
(_gfortran_caf_stop_str): Likewise.
(_gfortran_caf_error_stop_str): Likewise.
(_gfortran_caf_error_stop): Likewise.
(_gfortran_caf_co_broadcast): Likewise.
(_gfortran_caf_co_sum): Likewise.
(_gfortran_caf_co_min): Likewise.
(_gfortran_caf_co_max): Likewise.
(_gfortran_caf_co_reduce): Likewise.
(_gfortran_caf_event_post): Likewise.
(_gfortran_caf_event_wait): Likewise.
(_gfortran_caf_lock): Likewise.
(_gfortran_caf_unlock): Likewise.
---
 gcc/fortran/gfortran.texi | 32 -
 gcc/fortran/trans-decl.c  | 38 +++---
 gcc/fortran/trans-intrinsic.c |  4 ++--
 gcc/fortran/trans-stmt.c  | 12 +-
 libgfortran/caf/libcaf.h  | 37 ++---
 libgfortran/caf/mpi.c | 28 +++---
 libgfortran/caf/single.c  | 55 ++-
 7 files changed, 103 insertions(+), 103 deletions(-)

diff --git a/gcc/fortran/gfortran.texi b/gcc/fortran/gfortran.texi
index 1124669..7c09bc4 100644
--- a/gcc/fortran/gfortran.texi
+++ b/gcc/fortran/gfortran.texi
@@ -4458,7 +4458,7 @@ NULL.
 
 @item @emph{Syntax}:
 @code{void caf_register (size_t size, caf_register_t type, caf_token_t *token,
-gfc_descriptor_t *desc, int *stat, char *errmsg, int errmsg_len)}
+gfc_descriptor_t *desc, int *stat, char *errmsg, size_t errmsg_len)}
 
 @item @emph{Arguments}:
 @multitable @columnfractions .15 .70
@@ -4510,7 +4510,7 @@ during a call to @code{_gfortran_caf_register}.
 
 @item @emph{Syntax}:
 @code{void caf_deregister (caf_token_t *token, caf_deregister_t type,
-int *stat, char *errmsg, int errmsg_len)}
+int *stat, char *errmsg, size_t errmsg_len)}
 
 @item @emph{Arguments}:
 @multitable @columnfractions .15 .70
@@ -4936,7 +4936,7 @@ which has already been locked by the same image is an 
error.
 
 @item @emph{Syntax}:
 @code{void _gfortran_caf_lock (caf_token_t token, size_t index, int 
image_index,
-int *aquired_lock, int *stat, char *errmsg, int errmsg_len)}
+int *aquired_lock, int *stat, char *errmsg, size_t errmsg_len)}
 
 @item @emph{Arguments}:
 @multitable @columnfractions .15 .70
@@ -4971,7 +4971,7 @@ which is unlocked or has been locked by a different image 
is an error.
 
 @item @emph{Syntax}:
 @code{void _gfortran_caf_unlock (caf_token_t token, size_t index, int 
image_index,
-int *stat, char *errmsg, int

[PATCH] Improve dead code elimination with -fsanitize=address (PR84307)

2018-02-09 Thread Paolo Bonzini

Hi all,

in this PR, a dead reference to a function pointer cannot be optimized
out by the compiler because some ASAN_MARK UNPOISON calls, which are
placed before the store, cause the containing struct to escape.
(Without -fsanitize=address, the dead code is eliminated by the first
DSE pass).

The fix, which works at least for this testcase, is to copy part of the
sanopt dead code elimination pass early, so that the compiler can see
fewer UNPOISON calls.  I am not sure this is general enough, due to
the very limited data flow analysis done by sanitize_asan_mark_unpoison.
Another possibility which I considered but did not implement is to mark
the UNPOISON calls so that they do not cause the parameter to escape.

Any suggestions on how to proceed here (or approval is welcome too :))?

Paolo

2018-02-09  Paolo Bonzini  

* passes.def: add pass_sandce before first DSE pass.
* sanopt.c (pass_data_sandce, pass_sandce, make_pass_sandce): New.
* tree-pass.h (make_pass_sandce): Declare it.

testsuite:
2018-02-09  Paolo Bonzini  

PR sanitizer/84307
* gcc.dg/asan/pr84307.c: New test.

diff --git a/gcc/passes.def b/gcc/passes.def
index 9802f08..180df50 100644
--- a/gcc/passes.def
+++ b/gcc/passes.def
@@ -244,6 +244,7 @@ along with GCC; see the file COPYING3.  If not see
 only examines PHIs to discover const/copy propagation
 opportunities.  */
   NEXT_PASS (pass_phi_only_cprop);
+  NEXT_PASS (pass_sandce);
   NEXT_PASS (pass_dse);
   NEXT_PASS (pass_reassoc, true /* insert_powi_p */);
   NEXT_PASS (pass_dce);
diff --git a/gcc/sanopt.c b/gcc/sanopt.c
index cd94638..23b8e6e 100644
--- a/gcc/sanopt.c
+++ b/gcc/sanopt.c
@@ -906,6 +906,32 @@ sanopt_optimize (function *fun, bool *contains_asan_mark)
 
 namespace {
 
+const pass_data pass_data_sandce =
+{
+  GIMPLE_PASS, /* type */
+  "sandce", /* name */
+  OPTGROUP_NONE, /* optinfo_flags */
+  TV_NONE, /* tv_id */
+  ( PROP_ssa | PROP_cfg | PROP_gimple_leh ), /* properties_required */
+  0, /* properties_provided */
+  0, /* properties_destroyed */
+  0, /* todo_flags_start */
+  TODO_update_ssa, /* todo_flags_finish */
+};
+
+class pass_sandce : public gimple_opt_pass
+{
+public:
+  pass_sandce (gcc::context *ctxt)
+: gimple_opt_pass (pass_data_sandce, ctxt)
+  {}
+
+  /* opt_pass methods: */
+  virtual bool gate (function *) { return flag_sanitize & SANITIZE_ADDRESS; }
+  virtual unsigned int execute (function *);
+
+}; // class pass_sanopt
+
 const pass_data pass_data_sanopt =
 {
   GIMPLE_PASS, /* type */
@@ -1244,6 +1270,31 @@ sanitize_rewrite_addressable_params (function *fun)
 }
 
 unsigned int
+pass_sandce::execute (function *fun)
+{
+  basic_block bb;
+  bool contains_asan_mark = false;
+
+  /* Try to remove redundant unpoisoning.  */
+  gimple_stmt_iterator gsi;
+  FOR_EACH_BB_FN (bb, fun)
+for (gsi = gsi_start_bb (bb); !gsi_end_p (gsi); gsi_next (&gsi))
+  {
+   gimple *stmt = gsi_stmt (gsi);
+   if (gimple_call_internal_p (stmt, IFN_ASAN_MARK))
+ {
+   contains_asan_mark = true;
+   break;
+ }
+  }
+
+  if (contains_asan_mark)
+sanitize_asan_mark_unpoison ();
+
+  return 0;
+}
+
+unsigned int
 pass_sanopt::execute (function *fun)
 {
   basic_block bb;
@@ -1367,6 +1418,12 @@ pass_sanopt::execute (function *fun)
 } // anon namespace
 
 gimple_opt_pass *
+make_pass_sandce (gcc::context *ctxt)
+{
+  return new pass_sandce (ctxt);
+}
+
+gimple_opt_pass *
 make_pass_sanopt (gcc::context *ctxt)
 {
   return new pass_sanopt (ctxt);
diff --git a/gcc/tree-pass.h b/gcc/tree-pass.h
index 93a6a99..d5eb055 100644
--- a/gcc/tree-pass.h
+++ b/gcc/tree-pass.h
@@ -350,6 +350,7 @@ extern gimple_opt_pass *make_pass_tsan (gcc::context *ctxt);
 extern gimple_opt_pass *make_pass_tsan_O0 (gcc::context *ctxt);
 extern gimple_opt_pass *make_pass_sancov (gcc::context *ctxt);
 extern gimple_opt_pass *make_pass_sancov_O0 (gcc::context *ctxt);
+extern gimple_opt_pass *make_pass_sandce (gcc::context *ctxt);
 extern gimple_opt_pass *make_pass_lower_cf (gcc::context *ctxt);
 extern gimple_opt_pass *make_pass_refactor_eh (gcc::context *ctxt);
 extern gimple_opt_pass *make_pass_lower_eh (gcc::context *ctxt);
diff --git a/gcc/testsuite/gcc.dg/asan/pr84307.c 
b/gcc/testsuite/gcc.dg/asan/pr84307.c
new file mode 100644
index 000..6e1a197
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/asan/pr84307.c
@@ -0,0 +1,21 @@
+/* PR middle-end/83185 */
+/* { dg-do link } */
+/* { dg-options "-O1" } */
+
+struct f {
+  void (*func)(void);
+};
+
+extern void link_error(void);
+extern int printf(const char *f, ...);
+
+static inline struct f *gimme_null(struct f *result)
+{
+  return 0;
+}
+
+int main(int argc, char **argv)
+{
+  struct f *x = gimme_null(&(struct f) { .func = link_error });
+  printf("%p", x);
+}

Re: [PATCH, rs6000] Update vsx-vector-6-le.c tests for p9 target

2018-02-09 Thread Will Schmidt

On Wed, 2018-02-07 at 12:28 -0600, Segher Boessenkool wrote:
> Hi!
> 
> On Wed, Feb 07, 2018 at 11:16:12AM -0600, Will Schmidt wrote:
> > Noted during review of test results on P9.   Due to changes and 
> > improvements,
> > our codegen is different for this test on power9.
> > Modified the existing test to target P8, and added a P9 variant with updated
> > counts.
> 
> > diff --git a/gcc/testsuite/gcc.target/powerpc/vsx-vector-6-le.c 
> > b/gcc/testsuite/gcc.target/powerpc/vsx-vector-6-le.c
> > index ddb0089..7fe691b 100644
> > --- a/gcc/testsuite/gcc.target/powerpc/vsx-vector-6-le.c
> > +++ b/gcc/testsuite/gcc.target/powerpc/vsx-vector-6-le.c
> > @@ -1,11 +1,11 @@
> >  /* { dg-do compile { target { powerpc64le-*-* && lp64 } } } */
> >  /* { dg-skip-if "" { powerpc*-*-darwin* } } */
> >  /* { dg-require-effective-target powerpc_vsx_ok } */
> > -/* { dg-options "-mvsx -O2" } */
> > +/* { dg-options "-mvsx -O2 -mcpu=power8" } */
> 
> Why not -mcpu=power7?  And you'll need
> 
> /* { dg-skip-if "do not override -mcpu" { powerpc*-*-* } { "-mcpu=*" } { 
> "-mcpu=power7" } } */
> 
> You could also do instead
> 
> /* { dg-skip-if "this is not for p9" { powerpc_p9vector_ok } } */
> 
> or something like that; a bit neater.

I did try that, it had the effect of disabling the test if the system
was able to generate P9 code, which included my p8 environment. :-)
Ended up specifying "do not overrides" for both tests.  (v2) has been
posted.

> 
> > --- /dev/null
> > +++ b/gcc/testsuite/gcc.target/powerpc/vsx-vector-6-le.p9.c
> > @@ -0,0 +1,32 @@
> > +/* { dg-do compile { target { powerpc64le-*-* && lp64 } } } */
> > +/* { dg-skip-if "" { powerpc*-*-darwin* } } */
> > +/* { dg-require-effective-target powerpc_p9vector_ok } */
> > +/* { dg-options "-mvsx -O2 -mcpu=power9" } */
> 
> This needs the "do not override -mcpu" thing as well.
> 
> 
> Segher
>

[PATCH, rs6000] (v2) Update vsx-vector-6-le.c tests

2018-02-09 Thread Will Schmidt

Hi,
Noted during review of test results on P9.   Due to changes and improvements,
our codegen is different for this test on power9.
Modified the existing test to target P8 and added a P9 variant with updated
counts.

(v2) Updated/added dg-requires and dg-skip-if stanzas pre review and re-check
of the test results.

Sniff-tested good, now runs clean on P9 where it did not before. Runs clean on
P8LE, P8BE, P7.
OK for trunk?

Thanks,
-Will

[testsuite]

2018-02-09  Will Schmidt  

* gcc.target/powerpc/vsx-vector-6-le.c:  Update dg-* stanzas.
* gcc.target/powerpc/vsx-vector-6-le.p9.c:  New.

diff --git a/gcc/testsuite/gcc.target/powerpc/vsx-vector-6-le.c 
b/gcc/testsuite/gcc.target/powerpc/vsx-vector-6-le.c
index ddb0089..c3f795c 100644
--- a/gcc/testsuite/gcc.target/powerpc/vsx-vector-6-le.c
+++ b/gcc/testsuite/gcc.target/powerpc/vsx-vector-6-le.c
@@ -1,11 +1,12 @@
 /* { dg-do compile { target { powerpc64le-*-* && lp64 } } } */
 /* { dg-skip-if "" { powerpc*-*-darwin* } } */
 /* { dg-require-effective-target powerpc_vsx_ok } */
-/* { dg-options "-mvsx -O2" } */
+/* { dg-options "-mvsx -O2 -mcpu=power8" } */
+/* { dg-skip-if "do not override -mcpu" { powerpc*-*-* } { "-mcpu=*" } { 
"-mcpu=power8" } } */
 
-/* Expected instruction counts for Little Endian */
+/* Expected instruction counts for Little Endian targeting Power8. */
 
 /* { dg-final { scan-assembler-times "xvabsdp" 1 } } */
 /* { dg-final { scan-assembler-times "xvadddp" 1 } } */
 /* { dg-final { scan-assembler-times "xxlnor" 8 } } */
 /* { dg-final { scan-assembler-times "xxlor" 30 } } */
diff --git a/gcc/testsuite/gcc.target/powerpc/vsx-vector-6-le.p9.c 
b/gcc/testsuite/gcc.target/powerpc/vsx-vector-6-le.p9.c
new file mode 100644
index 000..290d4b4
--- /dev/null
+++ b/gcc/testsuite/gcc.target/powerpc/vsx-vector-6-le.p9.c
@@ -0,0 +1,33 @@
+/* { dg-do compile { target { powerpc64le-*-* && lp64 } } } */
+/* { dg-skip-if "" { powerpc*-*-darwin* } } */
+/* { dg-require-effective-target powerpc_p9vector_ok } */
+/* { dg-options "-mvsx -O2 -mcpu=power9" } */
+/* { dg-skip-if "do not override -mcpu" { powerpc*-*-* } { "-mcpu=*" } { 
"-mcpu=power9" } } */
+
+/* Expected instruction counts for Little Endian targeting Power9. */
+
+/* { dg-final { scan-assembler-times "xvabsdp" 1 } } */
+/* { dg-final { scan-assembler-times "xvadddp" 1 } } */
+/* { dg-final { scan-assembler-times "xxlnor" 7 } } */
+/* { dg-final { scan-assembler-times "xxlor" 20 } } */
+/* { dg-final { scan-assembler-times "xvcmpeqdp" 5 } } */
+/* { dg-final { scan-assembler-times "xvcmpgtdp" 8 } } */
+/* { dg-final { scan-assembler-times "xvcmpgedp" 8 } } */
+/* { dg-final { scan-assembler-times "xvrdpim" 1 } } */
+/* { dg-final { scan-assembler-times "xvmaddadp" 1 } } */
+/* { dg-final { scan-assembler-times "xvmsubadp" 1 } } */
+/* { dg-final { scan-assembler-times "xvsubdp" 1 } } */
+/* { dg-final { scan-assembler-times "xvmaxdp" 1 } } */
+/* { dg-final { scan-assembler-times "xvmindp" 1 } } */
+/* { dg-final { scan-assembler-times "xvmuldp" 1 } } */
+/* { dg-final { scan-assembler-times "vperm" 1 } } */
+/* { dg-final { scan-assembler-times "xvrdpic" 1 } } */
+/* { dg-final { scan-assembler-times "xvsqrtdp" 1 } } */
+/* { dg-final { scan-assembler-times "xvrdpiz" 1 } } */
+/* { dg-final { scan-assembler-times "xvmsubasp" 1 } } */
+/* { dg-final { scan-assembler-times "xvnmaddasp" 1 } } */
+/* { dg-final { scan-assembler-times "vmsumshs" 1 } } */
+/* { dg-final { scan-assembler-times "xxland" 13 } } */
+
+/* Source code for the test in vsx-vector-6.h */
+#include "vsx-vector-6.h"

Re: [SFN+LVU+IEPM v4 9/9] [IEPM] Introduce inline entry point markers

2018-02-09 Thread Jeff Law

On 02/09/2018 03:34 AM, Alexandre Oliva wrote:
> On Feb  9, 2018, Jeff Law  wrote:
> 
>> On 02/08/2018 08:53 PM, Alan Modra wrote:
>>> On Fri, Feb 09, 2018 at 01:21:27AM -0200, Alexandre Oliva wrote:
 Here's what I checked in, right after the LVU patch.

 [IEPM] Introduce inline entry point markers
>>>
>>> One of these two patches breaks ppc64le bootstrap with the assembler
>>> complaining "Error: view number mismatch" when compiling
>>> libdecnumber.
>>>
>> I've just passed along a similar failure (.i, .s and command line
>> options) to Alex for ppc64 (be) building glibc.
> 
> This fixes at least the testcase Jeff provided me with.  I'm going ahead
> and checking it in as obvious.  I suppose we might need more of these,
> on this and other ports, if they have been sloppy about zero-length
> pseudo insns :-(
> 
> Would you guys please let me know whether you still see a problem, if
> you get a chance to respin?  I was just about to crash in bed when I saw
> your email.
> 
> When I get back up, I'll build the latest binutils release on ppc64,
> ppc64el and aarch64, and then bootstrap gcc with it.  I should have done
> that when I broadened my testing of the SFN+LVU+IEPM patchset to those 
> platforms, but I didn't realize I was failing to test them with an
> assembler with view support, doh!  Sorry about that.
No need for the binutils+gcc bootstrapping test when you get up.  Mine's
already run.  ppc, ppc64, ppc64le, aarch64 all covered.

My tester does have half-dozen or so other failures overnight, but I
haven't looked at them yet.  If any look similar I'll first check if
we're dealing with a zero length pseudo insn (I wouldn't be surprised if
blockage insns are consistently wrong on that).


jeff

gotools patch committed: Increase test timeout

2018-02-09 Thread Ian Lance Taylor

I'm finding that on my laptop the cmd/go tests sometimes time out when
the laptop is heavily loaded because I'm running all the tests in
parallel.  The cmd/go tests are really a whole testsuite, not a single
test.  So bumping the timeout.  Bootstrapped and ran Go testsuite on
x86_64-pc-linux-gnu.  Committed to mainline.

Ian


2018-02-09  Ian Lance Taylor  

* Makefile.am (GOTOOLS_TEST_TIMEOUT): Double value.
Index: Makefile.am
===
--- Makefile.am (revision 257494)
+++ Makefile.am (working copy)
@@ -162,7 +162,7 @@ uninstall-local:
 GOTESTFLAGS =
 
 # Number of seconds before tests time out.
-GOTOOLS_TEST_TIMEOUT = 240
+GOTOOLS_TEST_TIMEOUT = 480
 
 # Run tests using the go tool, and frob the output to look like that
 # generated by DejaGNU.  The main output of this is two files:
Index: Makefile.in
===
--- Makefile.in (revision 257494)
+++ Makefile.in (working copy)
@@ -348,7 +348,7 @@ MOSTLYCLEANFILES = \
 @NATIVE_TRUE@GOTESTFLAGS = 
 
 # Number of seconds before tests time out.
-@NATIVE_TRUE@GOTOOLS_TEST_TIMEOUT = 240
+@NATIVE_TRUE@GOTOOLS_TEST_TIMEOUT = 480
 
 # CHECK_ENV sets up the environment to run the newly built go tool.
 # If you change this, change ECHO_ENV, below.
@@ -635,8 +635,8 @@ distclean-generic:
 maintainer-clean-generic:
@echo "This command is intended for maintainers to use"
@echo "it deletes files that may require special tools to rebuild."
-@NATIVE_FALSE@install-exec-local:
 @NATIVE_FALSE@uninstall-local:
+@NATIVE_FALSE@install-exec-local:
 clean: clean-am
 
 clean-am: clean-binPROGRAMS clean-generic clean-noinstPROGRAMS \

Re: [PATCH][i386][3/3] PR target/84164: Make *cmpqi_ext_ patterns accept more zero_extract modes

2018-02-09 Thread Kyrill Tkachov


Hi Uros,

On 08/02/18 22:54, Uros Bizjak wrote:

On Thu, Feb 8, 2018 at 6:11 PM, Kyrill  Tkachov
 wrote:

Hi all,

This patch fixes some fallout in the i386 testsuite that occurs after the
simplification in patch [1/3] [1].
The gcc.target/i386/extract-2.c FAILs because it expects to match:
(set (reg:CC 17 flags)
 (compare:CC (subreg:QI (zero_extract:SI (reg:HI 98)
 (const_int 8 [0x8])
 (const_int 8 [0x8])) 0)
 (const_int 4 [0x4])))

which is the *cmpqi_ext_2 pattern in i386.md but with the new simplification
the combine/simplify-rtx
machinery produces:
(set (reg:CC 17 flags)
 (compare:CC (subreg:QI (zero_extract:HI (reg:HI 98)
 (const_int 8 [0x8])
 (const_int 8 [0x8])) 0)
 (const_int 4 [0x4])))

Notice that the zero_extract now has HImode like the register source rather
than SImode.
The existing *cmpqi_ext_ patterns however explicitly demand an SImode on
the zero_extract.
I'm not overly familiar with the i386 port but I think that's too
restrictive.
The RTL documentation says:
For (zero_extract:m loc size pos) "The mode m is the same as the mode that
would be used for loc if it were a register."
I'm not sure if that means that the mode of the zero_extract and the source
register must always match (as is the
case after patch [1/3]) but in any case it shouldn't matter semantically
since we're taking a QImode subreg of the whole
thing anyway.

So the proposed solution in this patch is to allow HI, SI and DImode
zero_extracts in these patterns as these are the
modes that the ext_register_operand predicate accepts, so that the patterns
can match the new form above.

With this patch the aforementioned test passes again and bootstrap and
testing on x86_64-unknown-linux-gnu shows
no regressions.

Is this ok for trunk if the first patch is accepted?

Huh, there are many other zero-extract patterns besides cmpqi_ext_*
with QImode subreg of SImode zero_extract in i386.md, used to access
high QImode register of HImode pair. A quick grep shows these that
have _ext_ in their name:

(define_insn "*cmpqi_ext_1"
(define_insn "*cmpqi_ext_2"
(define_expand "cmpqi_ext_3"
(define_insn "*cmpqi_ext_3"
(define_insn "*cmpqi_ext_4"
(define_insn "addqi_ext_1"
(define_insn "*addqi_ext_2"
(define_expand "testqi_ext_1_ccno"
(define_insn "*testqi_ext_1"
(define_insn "*testqi_ext_2"
(define_insn_and_split "*testqi_ext_3"
(define_insn "andqi_ext_1"
(define_insn "*andqi_ext_1_cc"
(define_insn "*andqi_ext_2"
(define_insn "*qi_ext_1"
(define_insn "*qi_ext_2"
(define_expand "xorqi_ext_1_cc"
(define_insn "*xorqi_ext_1_cc"

There are also relevant splitters and peephole2 patterns.


I see. Another approach I've looked at is removing the mode specifier from
the zero_extract in these patterns. This means that they can be of any mode
so they will match all of these modes without creating new patterns through
iterators. That also works for the testcase and passes bootstrap and testing
however there is the snag that the define_insns that don't start with a "*"
are used to generate RTL through the gen_* mechanism and in that context
the absence of a mode on the zero_extract would mean a VOIDmode zero_extract
would be created, which I'm fairly sure is not good. So in my experiments I left
those patterns alone (with an explicit SI on the zero_extract).


IIRC, SImode zero_extract was enough to catch all high-register uses.
There will be a pattern explosion if we want to handle all other
integer modes here. However, I'm not a RTL expert, so someone will
have to say what is the correct RTX form here.


Jeff, Richard, could you please give us some guidance on this issue?
Sorry for the trouble.

Thanks,
Kyrill


Uros.

Re: PR84300, ICE in dwarf2cfi on ppc64le

2018-02-09 Thread Segher Boessenkool

On Fri, Feb 09, 2018 at 04:12:47PM +1030, Alan Modra wrote:
> This PR is one of those with a really obvious cause, and fix.  There's
> nothing in the unspec rtl to say the insn needs lr!
> Bootstrapped and regression tested powerpc64le-linux.  OK?
> 
>   PR target/84300
> gcc/
>   * config/rs6000/rs6000.md (split_stack_return): Use LR.
> gcc/testsuite/
>   * gcc.dg/pr84300.c: New.
> 
> diff --git a/gcc/config/rs6000/rs6000.md b/gcc/config/rs6000/rs6000.md
> index 33f0d95..287461f 100644
> --- a/gcc/config/rs6000/rs6000.md
> +++ b/gcc/config/rs6000/rs6000.md
> @@ -13359,7 +13359,8 @@ (define_insn "load_split_stack_limit_si"
>  ;; Use r0 to stop regrename twiddling with lr restore insns emitted
>  ;; after the call to __morestack.
>  (define_insn "split_stack_return"
> -  [(unspec_volatile [(use (reg:SI 0))] UNSPECV_SPLIT_STACK_RETURN)]
> +  [(unspec_volatile [(use (reg:SI 0)) (use (reg:SI LR_REGNO))]
> + UNSPECV_SPLIT_STACK_RETURN)]

I'm not sure what a USE as input of an UNSPEC means -- it should work
without the USEs?

So please try without; otherwise okay of course.  Thanks!


Segher

Re: [PATCH][PR84213] DWARF: no location for non-definition DECLs with non-trivial DECL_VALUE_EXPR

2018-02-09 Thread Pierre-Marie de Rodat


On 02/09/2018 02:53 PM, Pierre-Marie de Rodat wrote:
Pasto, or maybe I didn’t properly configure Vim on the machine I used. 
;-) I’ll make sure these formatting issues are fixed before committing. 
Thanks!


Committed! That was a Thunderbird pasto. I tried to update the PR. 
Hoping I did that well…


--
Pierre-Marie de Rodat

[PATCH] Fix PR84037 some more

2018-02-09 Thread Richard Biener


This improves SLP detection when swapping of operands is needed to
match up stmts.  Formerly we only considered swapping the not "matched"
set of stmts but if we have a +- pair that might not have worked
(also for other reasons).  The following patch makes us instead see
if we can eventually swap the operands in the set of matched stmts.
This allows us to handle the + in the +- case as it happens in
capacita.

This doesn't get us to fully SLP the important loop but we now detect
three out of four instances compared to just one before.  This
results in a speedup of 9.5% when using AVX2 ... if there were not
the cost model rejecting the hybrid SLP vectorization now
(like it does in the SSE case even without this patch).  This
means we're now using interleaving for this loop which improves
runtime by "only" 7%.

One of the reasons of the not profitable vectorization is the
hybrid SLP which has shared stmts between the SLP instances and
the interleaving instances - so tackling the last missed SLP
group is next on my list.  But using interleaving is also
an improvement here.

Bootstrapped on x86_64-unknown-linux-gnu, testing in progress.

Richard.

2018-02-09  Richard Biener  

PR tree-optimization/84037
* tree-vect-slp.c (vect_build_slp_tree_2): Try swapping the
matched stmts if we cannot swap the non-matched ones.

Index: gcc/tree-vect-slp.c
===
--- gcc/tree-vect-slp.c (revision 257520)
+++ gcc/tree-vect-slp.c (working copy)
@@ -1308,37 +1308,65 @@ vect_build_slp_tree_2 (vec_info *vinfo,
  && nops == 2
  && oprnds_info[1]->first_dt == vect_internal_def
  && is_gimple_assign (stmt)
- && commutative_tree_code (gimple_assign_rhs_code (stmt))
- && ! two_operators
  /* Do so only if the number of not successful permutes was nor more
 than a cut-ff as re-trying the recursive match on
 possibly each level of the tree would expose exponential
 behavior.  */
  && *npermutes < 4)
{
- /* Verify if we can safely swap or if we committed to a specific
-operand order already.  */
- for (j = 0; j < group_size; ++j)
-   if (!matches[j]
-   && (swap[j] != 0
-   || STMT_VINFO_NUM_SLP_USES (vinfo_for_stmt (stmts[j]
- {
-   if (dump_enabled_p ())
- {
-   dump_printf_loc (MSG_MISSED_OPTIMIZATION, vect_location,
-"Build SLP failed: cannot swap operands "
-"of shared stmt ");
-   dump_gimple_stmt (MSG_MISSED_OPTIMIZATION, TDF_SLIM,
- stmts[j], 0);
- }
-   goto fail;
- }
+ /* See whether we can swap the matching or the non-matching
+stmt operands.  */
+ bool swap_not_matching = true;
+ do
+   {
+ for (j = 0; j < group_size; ++j)
+   {
+ if (matches[j] != !swap_not_matching)
+   continue;
+ gimple *stmt = stmts[j];
+ /* Verify if we can swap operands of this stmt.  */
+ if (!is_gimple_assign (stmt)
+ || !commutative_tree_code (gimple_assign_rhs_code (stmt)))
+   {
+ if (!swap_not_matching)
+   goto fail;
+ swap_not_matching = false;
+ break;
+   }
+ /* Verify if we can safely swap or if we committed to a
+specific operand order already.
+???  Instead of modifying GIMPLE stmts here we could
+record whether we want to swap operands in the SLP
+node and temporarily do that when processing it
+(or wrap operand accessors in a helper).  */
+ else if (swap[j] != 0
+  || STMT_VINFO_NUM_SLP_USES (vinfo_for_stmt (stmt)))
+   {
+ if (!swap_not_matching)
+   {
+ if (dump_enabled_p ())
+   {
+ dump_printf_loc (MSG_MISSED_OPTIMIZATION,
+  vect_location,
+  "Build SLP failed: cannot swap "
+  "operands of shared stmt ");
+ dump_gimple_stmt (MSG_MISSED_OPTIMIZATION,
+   TDF_SLIM, stmts[j], 0);
+   }
+ goto fail;
+   }
+ swap_not_matching = false;
+ break;
+   }
+   }
+   }
+ w

Re: [PATCH][PR84213] DWARF: no location for non-definition DECLs with non-trivial DECL_VALUE_EXPR

2018-02-09 Thread Pierre-Marie de Rodat


On 02/09/2018 02:47 PM, Richard Biener wrote:

whitespace fixed here - vertical space missing before the comment
and the first line of the comment (or cut&paste error with the patch?)


Pasto, or maybe I didn’t properly configure Vim on the machine I used. 
;-) I’ll make sure these formatting issues are fixed before committing. 
Thanks!


--
Pierre-Marie de Rodat

Re: [PATCH][PR84213] DWARF: no location for non-definition DECLs with non-trivial DECL_VALUE_EXPR

2018-02-09 Thread Richard Biener

On Fri, 9 Feb 2018, Pierre-Marie de Rodat wrote:

> This patch restricts the set of cases in which we allow the generation of
> location attributes for variables that are not defined in the current unit.
> For such variables with complex DECL_VALUE_EXPR trees, generating a location
> attribute can end up creating relocations to text symbols in the debug section
> of LTO object files, which is not valid.
> 
> Richard originally suggested to revert r248792 and then enhance
> tree_add_const_value_attribute_for_decl to generate a DW_AT_location attribute
> in the specific case of a DECL_VALUE_EXPR that is an
> INDIRECT_REF(NOP_EXPR(INTEGER_CST)). I ended up with this patch because
> changing a function called "tree_add_const_value_attribute*" to generate
> instead a location attribute whereas there already exists a function called
> "add_location_or_const_value_attribute" felt really wrong. ;-) Especially if
> this would either re-implement part of the latter function in the former one.
> 
> What I did instead was to make dwarf2out_late_global_decl call
> add_location_or_const_value_attribute only when we know it's safe to do so
> (i.e. when it won't generate relocations to text symbols). I have the  feeling
> that it is cleaner and from what I could see, it fixes the reported issue with
> no regression.
> 
> Bootstrapped and regtested on x86_64-linux. Ok to commit to trunk?

Ok with ...

> gcc/
>   PR lto/84213
>   * dwarf2out.c (is_trivial_indirect_ref): New function.
>   (dwarf2out_late_global_decl): Do not generate a location attribute for
>   variables that have a non-trivial DECL_VALUE_EXPR and that are not
>   defined in the current unit.
> ---
>  gcc/dwarf2out.c | 31 +++
>  1 file changed, 27 insertions(+), 4 deletions(-)
> 
> diff --git a/gcc/dwarf2out.c b/gcc/dwarf2out.c
> index 948b3cb..6538596 100644
> --- a/gcc/dwarf2out.c
> +++ b/gcc/dwarf2out.c
> @@ -25716,6 +25716,23 @@ dwarf2out_early_global_decl (tree decl)
>symtab->global_info_ready = save;
>  }
>  +/* Return whether EXPR is an expression with the following pattern:
> +   INDIRECT_REF (NOP_EXPR (INTEGER_CST)).  */
> +

whitespace fixed here - vertical space missing before the comment
and the first line of the comment (or cut&paste error with the patch?)

Thanks,
Richard.

> +static bool
> +is_trivial_indirect_ref (tree expr)
> +{
> +  if (expr == NULL_TREE || TREE_CODE (expr) != INDIRECT_REF)
> +return false;
> +
> +  tree nop = TREE_OPERAND (expr, 0);
> +  if (nop == NULL_TREE || TREE_CODE (nop) != NOP_EXPR)
> +return false;
> +
> +  tree int_cst = TREE_OPERAND (nop, 0);
> +  return int_cst != NULL_TREE && TREE_CODE (int_cst) == INTEGER_CST;
> +}
> +
>  /* Output debug information for global decl DECL.  Called from
> toplev.c after compilation proper has finished.  */
>  @@ -25740,11 +25757,17 @@ dwarf2out_late_global_decl (tree decl)
>if (die)
>   {
> /* We get called via the symtab code invoking late_global_decl
> -  for symbols that are optimized out.  Do not add locations
> -  for those, except if they have a DECL_VALUE_EXPR, in which case
> -  they are relevant for debuggers.  */
> +  for symbols that are optimized out.
> +
> +  Do not add locations for those, except if they have a
> +  DECL_VALUE_EXPR, in which case they are relevant for debuggers.
> +  Still don't add a location if the DECL_VALUE_EXPR is not a
> trivial
> +  INDIRECT_REF expression, as this could generate relocations to
> +  text symbols in LTO object files, which is invalid.  */
> varpool_node *node = varpool_node::get (decl);
> -   if ((! node || ! node->definition) && ! DECL_HAS_VALUE_EXPR_P
> (decl))
> +   if ((! node || ! node->definition)
> +   && ! (DECL_HAS_VALUE_EXPR_P (decl)
> + && is_trivial_indirect_ref (DECL_VALUE_EXPR (decl
>   tree_add_const_value_attribute_for_decl (die, decl);
> else
>   add_location_or_const_value_attribute (die, decl, false);
> 

-- 
Richard Biener 
SUSE LINUX GmbH, GF: Felix Imendoerffer, Jane Smithard, Graham Norton, HRB 
21284 (AG Nuernberg)

Reduce inline limits a bit to compensate changes in inlining metrics

2018-02-09 Thread Jan Hubicka

Hi,
this patch addresses the code size regression by reducing 
max-inline-insns-auto 40->30 and increasing inline-min-speedup 8->15.

The main reason why we need retuning is following

 - inline-min-speedup works in a way that if expected runtime 
   of caller+calee combo after inlining reduces by more than 8%
   the inliner is going to bypass inline-insns-auto (because it knows the
   inlining is benefical rather than just inlining in hope it will be).
   The decrease either happens because callee is very lightweight at
   average or because we can track it will optimize well.

   During GCC 8 development I have switched time estimates from int to sreal.
   Original estimates was capping time to about 1000 instructions and thus
   large function rarely saw speedup because that was based comparing caped
   numbers.  With sreals we can now track benefits better

 - We made quite some progress on early optimizations making function
   bodies to appear smaller to inliner which in turn inlines more of them.
   This is reason why we want to decrease inline-min-speedup to gain some code
   size back.

   The code size estimate difference at beggining of inlning is about 6% to
   gcc 6 and about 12% to gcc 4.9.

I have benchmarked patch on Haswell SPEC2000, SPEC2006, polyhedron and our C++
benchmarks.  Here I found no off-noise changes on SPEC2000/2006. I know that
reducing inline-insns-auto to 10  still produces no regressions and even
improves facerec 6600->8000 but that seems bit of effect of good luck (it also
depends on setting of branch predictor weights and needs to be analyzed
independently).  min-speedup can be increased to 30 without measurable effects
as well.

On C++ benchmark suite I know that cray degrades with min-speedup set to 30 (it
needs value of 22). Also there is degradation with profile-generate on tramp3d.

So overall I believe that for Haswell the reduction of inline limits is doing
very consistent code size improvement without perofrmance tradeoffs.

I also tested Itanium and here things are slightly more sensitive. The
reduction of limits affects gzip 337->332 (-1.5%), vpr 1000->980 (-2%), crafty
(925->935) (+2%) and vortex (1165->1180) (+1%). So overall it is specint2000
neutral. Reducing inline-isns-auto to 10 brings off noise overall degradation
by -1% and 20 is in-between.

specfp2000 reacts positively by improving applu 520->525 (+1%) and mgrid
391->397 (+1.3%). It would let me to reduct inline-isns-auto to 10 without
any other regressions.

C++ benchmarks does not show any off-noise changes.

I have also did some limited testing on ppc and arm. They reacted more similarly
to Haswell also showing no important changes for reducing the inlining limits.

Now reducing inline limits triggers failure of testsuite/g++.dg/pr83239.C
which tests that inlining happens.  The reason why it does not happen is
becuae ipa-fnsplit is trying to second guess if inliner will evnetually consider
function for inlining and the test is out of date.  I decided to hack around
it for stage4 and will try to clean these things up next stage1.

Bootstraped/regtested x86_64-linux.  I know it is late in stage4, but would it
be OK to for GCC 8? 

PR middle-end/83665
* params.def (inline-min-speedup): Increase from 8 to 15.
(max-inline-insns-auto): Decrease from 40 to 30.
* ipa-split.c (consider_split): Add some buffer for function to
be considered inlining candidate.
* invoke.texi (max-inline-insns-auto, inline-min-speedup): UPdate
default values.
Index: params.def
===
--- params.def  (revision 257520)
+++ params.def  (working copy)
@@ -52,13 +52,13 @@ DEFPARAM (PARAM_PREDICTABLE_BRANCH_OUTCO
 DEFPARAM (PARAM_INLINE_MIN_SPEEDUP,
  "inline-min-speedup",
  "The minimal estimated speedup allowing inliner to ignore 
inline-insns-single and inline-insns-auto.",
- 8, 0, 0)
+ 15, 0, 0)
 
 /* The single function inlining limit. This is the maximum size
of a function counted in internal gcc instructions (not in
real machine instructions) that is eligible for inlining
by the tree inliner.
-   The default value is 450.
+   The default value is 400.
Only functions marked inline (or methods defined in the class
definition for C++) are affected by this.
There are more restrictions to inlining: If inlined functions
@@ -77,11 +77,11 @@ DEFPARAM (PARAM_MAX_INLINE_INSNS_SINGLE,
that is applied to functions marked inlined (or defined in the
class declaration in C++) given by the "max-inline-insns-single"
parameter.
-   The default value is 40.  */
+   The default value is 30.  */
 DEFPARAM (PARAM_MAX_INLINE_INSNS_AUTO,
  "max-inline-insns-auto",
  "The maximum number of instructions when automatically inlining.",
- 40, 0, 0)
+ 30, 0, 0)
 
 DEFPARAM (PARAM_MAX_INLINE_INSNS_RECURSIVE,
  "max-inline-insns-recursi

[PATCH][PR84213] DWARF: no location for non-definition DECLs with non-trivial DECL_VALUE_EXPR

2018-02-09 Thread Pierre-Marie de Rodat

This patch restricts the set of cases in which we allow the generation 
of location attributes for variables that are not defined in the current 
unit. For such variables with complex DECL_VALUE_EXPR trees, generating 
a location attribute can end up creating relocations to text symbols in 
the debug section of LTO object files, which is not valid.


Richard originally suggested to revert r248792 and then enhance 
tree_add_const_value_attribute_for_decl to generate a DW_AT_location 
attribute in the specific case of a DECL_VALUE_EXPR that is an 
INDIRECT_REF(NOP_EXPR(INTEGER_CST)). I ended up with this patch because 
changing a function called "tree_add_const_value_attribute*" to generate 
instead a location attribute whereas there already exists a function 
called "add_location_or_const_value_attribute" felt really wrong. ;-) 
Especially if this would either re-implement part of the latter function 
in the former one.


What I did instead was to make dwarf2out_late_global_decl call 
add_location_or_const_value_attribute only when we know it's safe to do 
so (i.e. when it won't generate relocations to text symbols). I have the 
 feeling that it is cleaner and from what I could see, it fixes the 
reported issue with no regression.


Bootstrapped and regtested on x86_64-linux. Ok to commit to trunk?

gcc/
PR lto/84213
* dwarf2out.c (is_trivial_indirect_ref): New function.
(dwarf2out_late_global_decl): Do not generate a location attribute for
variables that have a non-trivial DECL_VALUE_EXPR and that are not
defined in the current unit.
---
 gcc/dwarf2out.c | 31 +++
 1 file changed, 27 insertions(+), 4 deletions(-)

diff --git a/gcc/dwarf2out.c b/gcc/dwarf2out.c
index 948b3cb..6538596 100644
--- a/gcc/dwarf2out.c
+++ b/gcc/dwarf2out.c
@@ -25716,6 +25716,23 @@ dwarf2out_early_global_decl (tree decl)
   symtab->global_info_ready = save;
 }
 +/* Return whether EXPR is an expression with the following pattern:
+   INDIRECT_REF (NOP_EXPR (INTEGER_CST)).  */
+
+static bool
+is_trivial_indirect_ref (tree expr)
+{
+  if (expr == NULL_TREE || TREE_CODE (expr) != INDIRECT_REF)
+return false;
+
+  tree nop = TREE_OPERAND (expr, 0);
+  if (nop == NULL_TREE || TREE_CODE (nop) != NOP_EXPR)
+return false;
+
+  tree int_cst = TREE_OPERAND (nop, 0);
+  return int_cst != NULL_TREE && TREE_CODE (int_cst) == INTEGER_CST;
+}
+
 /* Output debug information for global decl DECL.  Called from
toplev.c after compilation proper has finished.  */
 @@ -25740,11 +25757,17 @@ dwarf2out_late_global_decl (tree decl)
   if (die)
{
  /* We get called via the symtab code invoking late_global_decl
-for symbols that are optimized out.  Do not add locations
-for those, except if they have a DECL_VALUE_EXPR, in which case
-they are relevant for debuggers.  */
+for symbols that are optimized out.
+
+Do not add locations for those, except if they have a
+DECL_VALUE_EXPR, in which case they are relevant for debuggers.
+Still don't add a location if the DECL_VALUE_EXPR is not a trivial
+INDIRECT_REF expression, as this could generate relocations to
+text symbols in LTO object files, which is invalid.  */
  varpool_node *node = varpool_node::get (decl);
- if ((! node || ! node->definition) && ! DECL_HAS_VALUE_EXPR_P (decl))
+ if ((! node || ! node->definition)
+ && ! (DECL_HAS_VALUE_EXPR_P (decl)
+   && is_trivial_indirect_ref (DECL_VALUE_EXPR (decl
tree_add_const_value_attribute_for_decl (die, decl);
  else
add_location_or_const_value_attribute (die, decl, false);
--
2.1.4

Re: RFA: Sanitize deprecation messages (PR 84195)

2018-02-09 Thread Nick Clifton

Hi David, Hi Martin,

  OK, take 3 of the patch is attached.  In this version:

  * The escaping is handled by a new class.
  * Self-tests have been added (and they helped find a bug - yay!)
  * The documentation has been extended to mention -fmessage-length's
effect on the #-directives, and to add documentation for #pragma
GCC error and #pragma GCC warning.  (Which appeared to be missing).

  I did try to add backslash characters to the regexp in the new testcase
  but I just could not find a way to persuade dejagnu to accept them.

  OK to apply ?

  (Apologies for the poor C++ coding - it is not my strong point.  I am
  an assembler level programmer at heart).

Cheers
  Nick

gcc/ChangeLog
2018-02-09  Nick Clifton  

PR 84195
* tree.c (escaped_string): New class.  Converts an unescaped
string into its escaped equivalent.
(warn_deprecated_use): Use the new class to convert the
deprecation message, if present.
(test_escaped_strings): New self test.
(test_c_tests): Add test_escaped_strings.

gcc/testsuite/ChangeLog
2018-02-07  Nick Clifton  

* gcc.c-torture/compile/pr84195.c: New test.



pr84195.patch.3
Description: Unix manual page

PR84239, Reimplement CET intrinsics for rdssp/incssp insn

2018-02-09 Thread Tsimbalist, Igor V

Introduce a couple of new CET intrinsics for reading and updating a shadow stack
pointer (_get_ssp and _inc_ssp), which are more user friendly. They replace the 
existing
_rdssp[d|q] and _incssp[d|q] instrinsics. The _get_ssp intrinsic has more 
deterministic
semantic: it returns a value of the shadow stack pointer if HW is CET capable 
and
0 otherwise.

Ok for trunk?

Igor




0001-Reimplement-CET-intrinsics-for-rdssp-incssp-insn.patch
Description: 0001-Reimplement-CET-intrinsics-for-rdssp-incssp-insn.patch

[PR c/84293] Unexpected strict-alias warning

2018-02-09 Thread Nathan Sidwell

This bug comes from python sources.  It has some type-punning macros to 
switch between representation, but these can cause unexpected 
strict-alias warnings, even though the punning's context is a system 
header file.  (I guess that without macro expansion location information 
we'd've thought the context was user code.)


As shown in the bug report, if the pun is spelt '(struct X *)&obj' we do 
not warn, but if it is '(X_t *)&obj' we do.  We also don't warn in 
either case if we go via -save-temps.  The patch changes things to not 
warn in either spelling.


The problem is that strict_aliasing_warning uses input_location to 
identifier the current code point.  But that's not necessarily the 
location of the above pun -- it can be the point where we use the pun.


We cannot just use the incoming EXPR's location, because we may well 
have peeled some pieces off it, and we end up with the '&obj' fragment, 
which in this case is located in the user source.  It also turns out 
that strict_aliasing_warning no longer uses the incoming 'otype' 
argument, immediately doing:

   STRIP_NOPS (expr);
   otype = TREE_TYPE (expr);

So we may as well drop that arg, substituting a location_t from the caller.

That's what this patch does.  The callers in the C & C++ FEs pass the 
EXPR_LOCATION of the pointer being dereferenced, or converted, as 
appropriate.


Joseph, are the C bits ok?

nathan

--
Nathan Sidwell
2018-02-09  Nathan Sidwell  

	PR c/84293
	gcc/c/
	* c-typeck.c (build_indirect_ref, build_c_cast): Pass expr location
	to strict_aliasing_warning.

	gcc/c-family/
	* c-common.h (strict_aliasing_warning): Drop OTYPE arg, insert LOC
	arg.
	* c-warn.c (strict_aliasing_warning): Drop OTYPE arg, require LOC
	arg.  Adjust.

	gcc/cp/
	* typeck.c (cp_build_indirect_ref_1, build_reinterpret_cast_1):
	Pass expr location to strict_aliasing_warning.

	gcc/testsuite/
	* c-c++-common/pr84293.h: New.
	* c-c++-common/pr84293.c: New.

Index: gcc/c/c-typeck.c
===
--- gcc/c/c-typeck.c	(revision 257480)
+++ gcc/c/c-typeck.c	(working copy)
@@ -2524,7 +2524,7 @@ build_indirect_ref (location_t loc, tree
 	 the backend.  This only needs to be done at
 	 warn_strict_aliasing > 2.  */
 	  if (warn_strict_aliasing > 2)
-	if (strict_aliasing_warning (TREE_TYPE (TREE_OPERAND (pointer, 0)),
+	if (strict_aliasing_warning (EXPR_LOCATION (pointer),
 	 type, TREE_OPERAND (pointer, 0)))
 	  TREE_NO_WARNING (pointer) = 1;
 	}
@@ -5696,7 +5696,7 @@ build_c_cast (location_t loc, tree type,
 		"of different size");
 
   if (warn_strict_aliasing <= 2)
-strict_aliasing_warning (otype, type, expr);
+strict_aliasing_warning (EXPR_LOCATION (value), type, expr);
 
   /* If pedantic, warn for conversions between function and object
 	 pointer types, except for converting a null pointer constant
Index: gcc/c-family/c-common.h
===
--- gcc/c-family/c-common.h	(revision 257480)
+++ gcc/c-family/c-common.h	(working copy)
@@ -1260,7 +1260,7 @@ extern void warn_tautological_cmp (locat
 extern void warn_logical_not_parentheses (location_t, enum tree_code, tree,
 	  tree);
 extern bool warn_if_unused_value (const_tree, location_t);
-extern bool strict_aliasing_warning (tree, tree, tree);
+extern bool strict_aliasing_warning (location_t, tree, tree);
 extern void sizeof_pointer_memaccess_warning (location_t *, tree,
 	  vec *, tree *,
 	  bool (*) (tree, tree));
Index: gcc/c-family/c-warn.c
===
--- gcc/c-family/c-warn.c	(revision 257480)
+++ gcc/c-family/c-warn.c	(working copy)
@@ -599,17 +599,21 @@ warn_if_unused_value (const_tree exp, lo
 }
 }
 
-/* Print a warning about casts that might indicate violation
-   of strict aliasing rules if -Wstrict-aliasing is used and
-   strict aliasing mode is in effect. OTYPE is the original
-   TREE_TYPE of EXPR, and TYPE the type we're casting to. */
+/* Print a warning about casts that might indicate violation of strict
+   aliasing rules if -Wstrict-aliasing is used and strict aliasing
+   mode is in effect.  LOC is the location of the expression being
+   cast, EXPR might be from inside it.  TYPE is the type we're casting
+   to.  */
 
 bool
-strict_aliasing_warning (tree otype, tree type, tree expr)
+strict_aliasing_warning (location_t loc, tree type, tree expr)
 {
+  if (loc == UNKNOWN_LOCATION)
+loc = input_location;
+
   /* Strip pointer conversion chains and get to the correct original type.  */
   STRIP_NOPS (expr);
-  otype = TREE_TYPE (expr);
+  tree otype = TREE_TYPE (expr);
 
   if (!(flag_strict_aliasing
 	&& POINTER_TYPE_P (type)
@@ -628,8 +632,9 @@ strict_aliasing_warning (tree otype, tre
 	 if the cast breaks type based aliasing.  */
   if (!COMPLETE_TYPE_P (TREE_TYPE (type)) && warn_strict_aliasing == 2)
 	{
-	  warning (OPT_Wstrict_aliasi

[PATCH][C++] Fix PR84281

2018-02-09 Thread Richard Biener


The following patch optimizes the equal element case of 
cxx_eval_vec_init_1 to use a RANGE_EXPR in the built CONSTRUCTOR
instead of repeating the same element over and over.  This
cuts down memory use of the invalid testcase in the PR from
tens of GB to nothing and makes us rejct it instantanously
instead of by going OOM.

Bootstrapped and tested on x86_64-unknown-linux-gnu.

Ok for trunk?

Thanks,
Richard.

2018-02-09  Richard Biener  

PR c++/84281
* constexpr.c (cxx_eval_vec_init_1): Use a RANGE_EXPR to compact
uniform constructors and delay allocating them fully.

Index: gcc/cp/constexpr.c
===
--- gcc/cp/constexpr.c  (revision 257491)
+++ gcc/cp/constexpr.c  (working copy)
@@ -2885,7 +2885,6 @@ cxx_eval_vec_init_1 (const constexpr_ctx
   unsigned HOST_WIDE_INT max = tree_to_uhwi (array_type_nelts_top (atype));
   verify_ctor_sanity (ctx, atype);
   vec **p = &CONSTRUCTOR_ELTS (ctx->ctor);
-  vec_alloc (*p, max + 1);
   bool pre_init = false;
   unsigned HOST_WIDE_INT i;
 
@@ -2978,13 +2977,14 @@ cxx_eval_vec_init_1 (const constexpr_ctx
{
  if (new_ctx.ctor != ctx->ctor)
eltinit = new_ctx.ctor;
- for (i = 1; i < max; ++i)
-   {
- idx = build_int_cst (size_type_node, i);
- CONSTRUCTOR_APPEND_ELT (*p, idx, unshare_constructor (eltinit));
-   }
+ tree range = build2 (RANGE_EXPR, size_type_node,
+  build_int_cst (size_type_node, 1),
+  build_int_cst (size_type_node, max - 1));
+ CONSTRUCTOR_APPEND_ELT (*p, range, unshare_constructor (eltinit));
  break;
}
+  else
+   vec_safe_reserve (*p, max + 1);
 }
 
   if (!*non_constant_p)

[PATCH] POPCOUNT folding optimizations

2018-02-09 Thread Roger Sayle


The following patch implements a number of __builtin_popcount related
optimizations.
(i) popcount(x) == 0 can be simplified to x==0, and popcount(x) != 0 to
x!=0.
(ii) popcount(x&1) can be simplified to x&1, and for unsigned x,
popcount(x>>31) to x>>31.
(iii) popcount (x&6) + popcount(y&16) can be simplified to
popcount((x&6)|(y&16))

These may seem obscure transformations, but performing these types of
POPCOUNT
operations are often the performance critical steps in some cheminformatics
applications.

To implement the above transformations I've introduced the tree_nonzero_bits
function,
which is a tree-level version of rtlanal's nonzero_bits used by the RTL
optimizers.

The following patch has been tested on x86_64-pc-linux-gnu with a "make
bootstrap"
and "make check" with no regressions, and passes for the four new gcc.dg
test cases.

Many thanks In advance.  Best regards,

Roger
--
Roger Sayle, PhD.
NextMove Software Limited
Innovation Centre (Unit 23), Cambridge Science Park, Cambridge, CB4 0EY

2018-02-09  Roger Sayle  

* fold-const.c (tree_nonzero_bits): New function.
* fold-const.h (tree_nonzero_bits): Likewise.
* match.pd (POPCOUNT): New patterns to fold BUILTIN_POPCOUNT and
friends.  POPCOUNT(x&1) => x&1, POPCOUNT(x)==0 => x==0, etc.

2018-02-09  Roger Sayle  

* gcc.dg/fold-popcount-1.c: New testcase.
* gcc.dg/fold-popcount-2.c: New testcase.
* gcc.dg/fold-popcount-3.c: New testcase.
* gcc.dg/fold-popcount-4.c: New testcase.

/* { dg-do compile } */
/* { dg-options "-O2 -fdump-tree-original" } */

int test_eqzero(unsigned int a)
{
  return __builtin_popcount(a) == 0;
}

int test_eqzerol(unsigned long b)
{
  return __builtin_popcountl(b) == 0;
}

int test_eqzeroll(unsigned long long c)
{
  return __builtin_popcountll(c) == 0;
}

int test_nezero(unsigned int d)
{
  return __builtin_popcount(d) != 0;
}

int test_nezerol(unsigned long e)
{
  return __builtin_popcountl(e) != 0;
}

int test_nezeroll(unsigned long long f)
{
  return __builtin_popcountll(f) != 0;
}

/* { dg-final { scan-tree-dump-times "popcount" 0 "original" } } */

/* { dg-do compile } */
/* { dg-options "-O2 -fdump-tree-cddce1" } */

int test_andone(unsigned int a)
{
  return __builtin_popcount(a&1);
}

int test_andonel(unsigned long b)
{
  return __builtin_popcountl(b&1);
}

int test_andonell(unsigned long long c)
{
  return __builtin_popcountll(c&1);
}

int test_oneand(unsigned int d)
{
  return __builtin_popcount(1&d);
}

int test_oneandl(unsigned long e)
{
  return __builtin_popcountl(1&e);
}

int test_oneandll(unsigned long long f)
{
  return __builtin_popcountll(1&f);
}

/* { dg-final { scan-tree-dump-times "popcount" 0 "cddce1" } } */

/* { dg-do compile } */
/* { dg-options "-O2 -fdump-tree-cddce1" } */

int test_combine(unsigned int a, unsigned int b)
{
  return __builtin_popcount(a&8) + __builtin_popcount(b&2);
}

/* { dg-final { scan-tree-dump-times "popcount" 1 "cddce1" } } */

/* { dg-do compile } */
/* { dg-options "-O2 -fdump-tree-cddce1" } */

int test_shiftmax(unsigned int a)
{
  return __builtin_popcount(a>>(8*sizeof(a)-1));
}

int test_shiftmaxl(unsigned long b)
{
  return __builtin_popcountl(b>>(8*sizeof(b)-1));
}

int test_shiftmaxll(unsigned long long c)
{
  return __builtin_popcountll(c>>(8*sizeof(c)-1));
}

int test_shift7(unsigned char d)
{
  return __builtin_popcount(d>>7);
}

int test_shift7l(unsigned char e)
{
  return __builtin_popcountl(e>>7);
}

int test_shift7ll(unsigned char f)
{
  return __builtin_popcountll(f>>7);
}

int test_shift15(unsigned short g)
{
  return __builtin_popcount(g>>15);
}

int test_shift15l(unsigned short h)
{
  return __builtin_popcountl(h>>15);
}

int test_shift15ll(unsigned short i)
{
  return __builtin_popcountll(i>>15);
}

/* { dg-final { scan-tree-dump-times "popcount" 0 "cddce1" } } */

Index: gcc/fold-const.c
===
--- gcc/fold-const.c(revision 257227)
+++ gcc/fold-const.c(working copy)
@@ -14580,6 +14580,75 @@
   return string + offset;
 }
 
+/* Given a tree T, compute which bits in T may be nonzero.  */
+
+wide_int
+tree_nonzero_bits (const_tree t)
+{
+  switch (TREE_CODE (t))
+{
+case INTEGER_CST:
+  return wi::to_wide (t);
+case SSA_NAME:
+  return get_nonzero_bits (t);
+case NON_LVALUE_EXPR:
+case SAVE_EXPR:
+  return tree_nonzero_bits (TREE_OPERAND (t, 0));
+case BIT_AND_EXPR:
+  return wi::bit_and (tree_nonzero_bits (TREE_OPERAND (t, 0)),
+ tree_nonzero_bits (TREE_OPERAND (t, 1)));
+case BIT_IOR_EXPR:
+case BIT_XOR_EXPR:
+  return wi::bit_or (tree_nonzero_bits (TREE_OPERAND (t, 0)),
+tree_nonzero_bits (TREE_OPERAND (t, 1)));
+case COND_EXPR:
+  return wi::bit_or (tree_nonzero_bits (TREE_OPERAND (t, 1)),
+tree_nonzero_bits (TREE_OPERAND (t, 2)));
+case NOP_EXPR:
+case CONVERT_EXPR:
+  return wi

Re: [PATCH][GCC][ARM] Silence more re-definition warnings, make test case failure case more explicit.

2018-02-09 Thread Kyrill Tkachov


Hi Tamar,

On 07/02/18 18:35, Tamar Christina wrote:

Hi All,

The previous testcase would fail on a system where the initial mode is thumb 
and later
switches to an arm mode. This would again cause some warnings to be emitted.

This patch visits all builtins defined with builtin_define_with_int_value and 
undefines
them if they could possibly change by changing the architecture.

The testcase has also been updated to make it fail more easily on such cases.

Bootstrapped and regtested on arm-none-Linux-gnueabihf and no issues.

Ok for trunk?



Ok.
Thanks,
Kyrill


Thanks,
Tamar


gcc/
2018-02-07  Tamar Christina  

PR target/82641
* config/arm/arm-c.c (arm_cpu_builtins): Un-define __ARM_FEATURE_LDREX,
__ARM_ARCH_PROFILE, __ARM_ARCH_ISA_THUMB, __ARM_FP and __ARM_NEON_FP.

gcc/testsuite
2018-02-07  Tamar Christina  

PR target/82641
* gcc.target/arm/pragma_arch_switch_2.c: Use armv6 and armv5t.

--

Re: [SFN+LVU+IEPM v4 9/9] [IEPM] Introduce inline entry point markers

2018-02-09 Thread Alan Modra

On Fri, Feb 09, 2018 at 08:34:08AM -0200, Alexandre Oliva wrote:
>   * config/rs6000/rs6000.md (blockage): Set length to zero.

Thanks!  This fixed the ppc64le libdecnumber error for me.

-- 
Alan Modra
Australia Development Lab, IBM

[Committed] S/390: Fix PR84295

2018-02-09 Thread Andreas Krebbel

Bootstrapped and regression tested with --with-arch=z14 and
--with-arch=z900.  Glibc build is clean with that patch using
-mindirect-branch=thunk and -mfunction-return=thunk.

gcc/ChangeLog:

2018-02-09  Andreas Krebbel  

PR target/PR84295
* config/s390/s390.c (s390_set_current_function): Invoke
s390_indirect_branch_settings also if fndecl didn't change.

gcc/testsuite/ChangeLog:

2018-02-09  Andreas Krebbel  

PR target/PR84295
* gcc.target/s390/pr84295.c: New test.
---
 gcc/config/s390/s390.c  |  5 -
 gcc/testsuite/gcc.target/s390/pr84295.c | 14 ++
 2 files changed, 18 insertions(+), 1 deletion(-)
 create mode 100644 gcc/testsuite/gcc.target/s390/pr84295.c

diff --git a/gcc/config/s390/s390.c b/gcc/config/s390/s390.c
index 62a60e2..298fdd1 100644
--- a/gcc/config/s390/s390.c
+++ b/gcc/config/s390/s390.c
@@ -16135,7 +16135,10 @@ s390_set_current_function (tree fndecl)
  several times in the course of compiling a function, and we don't want to
  slow things down too much or call target_reinit when it isn't safe.  */
   if (fndecl == s390_previous_fndecl)
-return;
+{
+  s390_indirect_branch_settings (fndecl);
+  return;
+}
 
   tree old_tree;
   if (s390_previous_fndecl == NULL_TREE)
diff --git a/gcc/testsuite/gcc.target/s390/pr84295.c 
b/gcc/testsuite/gcc.target/s390/pr84295.c
new file mode 100644
index 000..4da43c3
--- /dev/null
+++ b/gcc/testsuite/gcc.target/s390/pr84295.c
@@ -0,0 +1,14 @@
+/* { dg-do compile } */
+/* { dg-options "-O3 -march=z900 -fgnu89-inline --save-temps 
-mfunction-return-reg=thunk -mindirect-branch-table" } */
+
+extern void foo (void);
+extern __inline  void foo (void) {}
+void foo (void) {}
+
+/* { dg-final { scan-assembler-times "jg\t__s390_indirect_jump" 1 } } */
+/* { dg-final { scan-assembler "ex\t" } } */
+
+/* { dg-final { scan-assembler-not "section\t.s390_indirect_jump" } } */
+/* { dg-final { scan-assembler-not "section\t.s390_indirect_call" } } */
+/* { dg-final { scan-assembler "section\t.s390_return_reg" } } */
+/* { dg-final { scan-assembler-not "section\t.s390_return_mem" } } */
-- 
2.9.1

Re: PATCH to fix bogus warning with -Wstringop-truncation -g (PR tree-optimization/84228)

2018-02-09 Thread Jakub Jelinek

On Fri, Feb 09, 2018 at 11:40:29AM +0100, Richard Biener wrote:
> > I.e., having to track all pointers to d between the call to
> > strncpy and the assignment of the nul and make sure none of
> > them ends up used in a string function.  It didn't seem
> > the additional complexity would have been worth the effort
> > (or the likely false negatives).
> 
> Well, I'd just walk immediate uses of the VDEF of the
> strncpy call, not of the pointer argument.  There will be exactly _one_ 
> possible
> store (gimple_vdef () is non-NULL) that you need to verify (with using
> the current matching
> logic).  But it'll skip non-store statements for you.

Well, it should also punt on the immediate uses of the VDEF that
have NULL gimple_vdef and the alias oracle says that might alias with that,
i.e. warn about say
  strncpy (p, ...);
  foo (p);
  p[whatever] = '\0';
where foo is pure, because it might read the unterminated string, but don't
warn about
  strncpy (p, ...);
  x = *q;
  p[whatever] = '\0';
if q[0] can't alias with p.

Or just warn if there are any immediate uses of the strncpy VDEF that have
gimple_vdef NULL, or non-NULL and aren't the zero store you are looking for.

Jakub

Re: [patch] Do not generate libcalls to __unord[sd]f2 on PowerPC SPE

2018-02-09 Thread Richard Biener

On Fri, Feb 9, 2018 at 11:28 AM, Eric Botcazou  wrote:
> Hi,
>
> the change
>
> 2016-05-02  Marc Glisse  
>
> * match.pd (X u< X, X u> X): New transformations.
>
> has an annoying effect on targets implementing (most) unordered comparisons in
> hardware but not the UNORDERED operator itself, e.g. PowerPC SPE, because it
> makes the compiler generate unnecessary libcalls to __unord[sd]f2.
>
> The attached patch contains a trick to undo the effect of the change at the
> RTL level for such targets.  It was tested on PowerPC SPE with a certified
> version of VxWorks (which provides a minimal certified libgcc) where it fixes
> the link failure of ACATS c45242b with optimization enabled.
>
> OK for the mainline?

Ok.

Richard.

>
> 2018-02-09  Eric Botcazou  
>
> * optabs.c (prepare_cmp_insn): Try harder to emit a direct comparison
> instead of a libcall for UNORDERED.
>
> --
> Eric Botcazou

Re: [PATCH/RFC] tree-if-conv.c: fix two ICEs seen with -fno-tree-forwprop (PR tree-optimization/84178)

2018-02-09 Thread Richard Biener

On Thu, Feb 8, 2018 at 11:23 PM, David Malcolm  wrote:
> PR tree-optimization/84178 reports a couple of source files that ICE inside
> ifcvt when compiled with -03 -fno-tree-forwprop (trunk and gcc 7).
>
> Both cases involve problems with ifcvt's per-BB gimplified predicates.
>
> Testcase 1 fails this assertion within release_bb_predicate during cleanup:
>
> 283   if (flag_checking)
> 284 for (gimple_stmt_iterator i = gsi_start (stmts);
> 285  !gsi_end_p (i); gsi_next (&i))
> 286   gcc_assert (! gimple_use_ops (gsi_stmt (i)));
>
> The testcase contains a division in the loop, which leads to
> if_convertible_loop_p returning false (due to gimple_could_trap_p being true
> for the division).  This happens *after* the per-BB gimplified predicates
> have been created in predicate_bbs (loop).
> Hence tree_if_conversion bails out to "cleanup", but the gimplified predicates
> exist and make use of SSA names; for example this conjunction for two BB
> conditions:
>
>   _4 = h4.1_112 != 0;
>   _175 = (signed char) _117;
>   _176 = _175 >= 0;
>   _174 = _4 & _176;
>
> is using SSA names.

But then this shouldn't cause any stmt operands to be created - who is calling
update_stmt () on a stmt using the SSA names?  Maybe something calls
force_gimple_operand_gsi to add to the sequence?

> This assertion was added in r236498 (aka 
> c3deca2519d97c55876869c57cf11ae1e5c6cf8b):
>
> 2016-05-20  Richard Biener  
>
> * tree-if-conv.c (add_bb_predicate_gimplified_stmts): Use
> gimple_seq_add_seq_without_update.
> (release_bb_predicate): Assert we have no operands to free.
> (if_convertible_loop_p_1): Calculate post dominators later.
> Do not free BB predicates here.
> (combine_blocks): Do not recompute BB predicates.
> (version_loop_for_if_conversion): Save BB predicates around
> loop versioning.
>
> * gcc.dg/tree-ssa/ifc-cd.c: Adjust.
>
> The following patch fixes this by removing the assertion, and reinstating the
> cleanup of the operands.

But that was supposed to be not necessary...  I'll note that simply restoring
the old behavior is not ideal either - luckily we now have gimple_seq_discard ()
which should do a better job here (and actually does what the function comment
says!).

>
> Testcase 2 segfaults inside update_ssa when called from
> version_loop_for_if_conversion when a loop is versioned.
>
> During loop_version, some blocks are duplicated, and this can duplicate
> SSA names, leading to the duplicated SSA names being added to
> tree-into-ssa.c's old_ssa_names.
>
> For example, _117 is an SSA name defined in one of these to-be-duplicated
> blocks, and hence is added to old_ssa_names when duplicated.
>
> One of the BBs has this gimplified predicate (again, these stmts are *not*
> yet in a BB):
>   _173 = h4.1_112 != 0;
>   _171 = (signed char) _117;
>   _172 = _171 >= 0;
>   _170 = ~_172;
>   _169 = _170 & _173;
>
> Note the reference to SSA name _117 in the above.
>
> When update_ssa runs later on in version_loop_for_if_conversion,
> SSA name _117 is in the old_ssa_names bitmap, and thus has
> prepare_use_sites_for called on it:
>
> 2771  /* If an old name is in NAMES_TO_RELEASE, we cannot remove it from
> 2772 OLD_SSA_NAMES, but we have to ignore its definition site.  */
> 2773  EXECUTE_IF_SET_IN_BITMAP (old_ssa_names, 0, i, sbi)
> 2774{
> 2775  if (names_to_release == NULL || !bitmap_bit_p 
> (names_to_release, i))
> 2776prepare_def_site_for (ssa_name (i), insert_phi_p);
> 2777  prepare_use_sites_for (ssa_name (i), insert_phi_p);
> 2778}
>
> prepare_use_sites_for iterates over the immediate users, which includes
> the:
>   _171 = (signed char) _117;
> in the gimplified predicate.
>
> This tried to call "mark_block_for_update" on the BB of this def_stmt,
> leading to a read-through-NULL segfault, since this statement isn't in a
> BB yet.
>
> With the caveat that this is at the limit of my understanding of the
> innards of gimple, I'm wondering how this ever works: we have gimplified
> predicates that aren't in a BB yet, which typically refer to
> SSA names in the CFG proper, and we're attempting non-trivial manipulations
> of that CFG that can e.g. duplicate those SSA names.
>
> The following patch fixes the 2nd ICE by inserting the gimplified predicates
> earlier: immediately before the possible version_loop_for_if_conversion,
> rather than within combine_blocks.  That way they're in their BBs before
> we attempt any further manipulation of the CFG.

Hum, but that will alter both copies of the loops, no?  I think the fix is
to instead delay the update_ssa call to _after_ combine_blocks ()
(and remember if it is necessary just in case we didn't version).

Richard.

> This fixes the ICE, though it introduces a regression of
>   gcc.target/i386/avx2-vec-mask-bit-not.c
> which no longer vectorizes for some reason (I haven't investigated
>

Re: PATCH to fix bogus warning with -Wstringop-truncation -g (PR tree-optimization/84228)

2018-02-09 Thread Richard Biener

On Thu, Feb 8, 2018 at 7:12 PM, Martin Sebor  wrote:
> On 02/08/2018 07:39 AM, Richard Biener wrote:
>>
>> On Thu, Feb 8, 2018 at 6:35 AM, Jeff Law  wrote:
>>>
>>> On 02/06/2018 05:57 AM, Jakub Jelinek wrote:

 On Tue, Feb 06, 2018 at 01:46:21PM +0100, Marek Polacek wrote:
>
> --- gcc/testsuite/c-c++-common/Wstringop-truncation-3.c
> +++ gcc/testsuite/c-c++-common/Wstringop-truncation-3.c
> @@ -0,0 +1,20 @@
> +/* PR tree-optimization/84228 */
> +/* { dg-do compile } */
> +/* { dg-options "-Wstringop-truncation -O2 -g" } */
> +
> +char *strncpy (char *, const char *, __SIZE_TYPE__);
> +struct S
> +{
> +  char arr[64];
> +};
> +
> +int
> +foo (struct S *p1, const char *a)
> +{
> +  if (a)
> +goto err;
> +  strncpy (p1->arr, a, sizeof p1->arr); /* { dg-bogus "specified
> bound" } */


 Just curious, what debug stmt is in between those?
 Wouldn't it be better to force them a couple of debug stmts?
 Say
   int b = 5, c = 6, d = 7;
 at the start of the function and
   b = 8; c = 9; d = 10;
 in between strncpy and the '\0' store?

> +  p1->arr[3] = '\0';
> +err:
> +  return 0;
> +}
> diff --git gcc/tree-ssa-strlen.c gcc/tree-ssa-strlen.c
> index c3cf432a921..f0f6535017b 100644
> --- gcc/tree-ssa-strlen.c
> +++ gcc/tree-ssa-strlen.c
> @@ -1849,7 +1849,7 @@ maybe_diag_stxncpy_trunc (gimple_stmt_iterator
> gsi, tree src, tree cnt)
>
>/* Look for dst[i] = '\0'; after the stxncpy() call and if found
>   avoid the truncation warning.  */
> -  gsi_next (&gsi);
> +  gsi_next_nondebug (&gsi);
>gimple *next_stmt = gsi_stmt (gsi);
>
>if (!gsi_end_p (gsi) && is_gimple_assign (next_stmt))


 Ok for trunk, though generally looking at just next stmt is very
 fragile, might be
 better to look at the strncpy's vuse immediate uses if they are within
 the
 same basic block and either don't alias with it, or are the store it is
 looking for, or something similar.
>>>
>>> Martin and I wandered down this approach a bit and ultimately decided
>>> against it.  While yes it could avoid a false positive by looking at the
>>> immediate uses, but I'm not sure avoiding the false positive in those
>>> cases is actually good!
>>>
>>> THe larger the separation between the strcpy and the truncation the more
>>> likely it is that the code is wrong or at least poorly written and
>>> deserves a developer looksie.
>>
>>
>> But it's the next _GIMPLE_ stmt it looks at.  Make it
>>
>> char d[3];
>>
>>   void f (const char *s, int x)
>>   {
>> char d[x];
>> __builtin_strncpy (d, s, sizeof d);
>> d[x-1] = 0;
>>   }
>>
>> and I bet it will again warn since x-1 is a separate GIMPLE stmt.
>
>
> It doesn't but only because it doesn't know how to handle VLAs.
>
>> The patch is of course ok but still...  simply looking at all
>> immediate uses of the VDEF of the strncpy call, stopping
>> at the single stmt with the "next" VDEF should be better
>> (we don't have a next_vdef () helper).
>
>
> IIRC, one of the problems I ran into was how to handle code
> like this:
>
>   void f (const char *s)
>   {
> char d[8];
> char *p = strncpy (d, s, sizeof d);
>
> foo (p);
>
> d[7] = 0;
>   }
>
> I.e., having to track all pointers to d between the call to
> strncpy and the assignment of the nul and make sure none of
> them ends up used in a string function.  It didn't seem
> the additional complexity would have been worth the effort
> (or the likely false negatives).

Well, I'd just walk immediate uses of the VDEF of the
strncpy call, not of the pointer argument.  There will be exactly _one_ possible
store (gimple_vdef () is non-NULL) that you need to verify (with using
the current matching
logic).  But it'll skip non-store statements for you.

Richard.

> Martin
>

Re: [SFN+LVU+IEPM v4 9/9] [IEPM] Introduce inline entry point markers

2018-02-09 Thread Alexandre Oliva

On Feb  9, 2018, Jeff Law  wrote:

> On 02/08/2018 08:53 PM, Alan Modra wrote:
>> On Fri, Feb 09, 2018 at 01:21:27AM -0200, Alexandre Oliva wrote:
>>> Here's what I checked in, right after the LVU patch.
>>> 
>>> [IEPM] Introduce inline entry point markers
>> 
>> One of these two patches breaks ppc64le bootstrap with the assembler
>> complaining "Error: view number mismatch" when compiling
>> libdecnumber.
>> 
> I've just passed along a similar failure (.i, .s and command line
> options) to Alex for ppc64 (be) building glibc.

This fixes at least the testcase Jeff provided me with.  I'm going ahead
and checking it in as obvious.  I suppose we might need more of these,
on this and other ports, if they have been sloppy about zero-length
pseudo insns :-(

Would you guys please let me know whether you still see a problem, if
you get a chance to respin?  I was just about to crash in bed when I saw
your email.

When I get back up, I'll build the latest binutils release on ppc64,
ppc64el and aarch64, and then bootstrap gcc with it.  I should have done
that when I broadened my testing of the SFN+LVU+IEPM patchset to those 
platforms, but I didn't realize I was failing to test them with an
assembler with view support, doh!  Sorry about that.

for  gcc/ChangeLog

* config/rs6000/rs6000.md (blockage): Set length to zero.

diff --git a/gcc/config/rs6000/rs6000.md b/gcc/config/rs6000/rs6000.md
index 33f0d959f5d0..8aa4e0e7c71e 100644
--- a/gcc/config/rs6000/rs6000.md
+++ b/gcc/config/rs6000/rs6000.md
@@ -11063,7 +11063,8 @@
 (define_insn "blockage"
   [(unspec_volatile [(const_int 0)] UNSPECV_BLOCK)]
   ""
-  "")
+  ""
+  [(set_attr "length" "0")])

 (define_expand "probe_stack_address"
   [(use (match_operand 0 "address_operand"))]

-- 
Alexandre Oliva, freedom fighterhttp://FSFLA.org/~lxoliva/
You must be the change you wish to see in the world. -- Gandhi
Be Free! -- http://FSFLA.org/   FSF Latin America board member
Free Software Evangelist|Red Hat Brasil GNU Toolchain Engineer

[patch] Do not generate libcalls to __unord[sd]f2 on PowerPC SPE

2018-02-09 Thread Eric Botcazou

Hi,

the change

2016-05-02  Marc Glisse  

* match.pd (X u< X, X u> X): New transformations.

has an annoying effect on targets implementing (most) unordered comparisons in 
hardware but not the UNORDERED operator itself, e.g. PowerPC SPE, because it 
makes the compiler generate unnecessary libcalls to __unord[sd]f2.

The attached patch contains a trick to undo the effect of the change at the 
RTL level for such targets.  It was tested on PowerPC SPE with a certified 
version of VxWorks (which provides a minimal certified libgcc) where it fixes 
the link failure of ACATS c45242b with optimization enabled.

OK for the mainline?


2018-02-09  Eric Botcazou  

* optabs.c (prepare_cmp_insn): Try harder to emit a direct comparison
instead of a libcall for UNORDERED.

-- 
Eric BotcazouIndex: optabs.c
===
--- optabs.c	(revision 257404)
+++ optabs.c	(working copy)
@@ -3935,7 +3935,20 @@ prepare_cmp_insn (rtx x, rtx y, enum rtx
   if (methods != OPTAB_LIB_WIDEN)
 goto fail;
 
-  if (!SCALAR_FLOAT_MODE_P (mode))
+  if (SCALAR_FLOAT_MODE_P (mode))
+{
+  /* Small trick if UNORDERED isn't implemented by the hardware.  */
+  if (comparison == UNORDERED && rtx_equal_p (x, y))
+	{
+	  prepare_cmp_insn (x, y, UNLT, NULL_RTX, unsignedp, OPTAB_WIDEN,
+			ptest, pmode);
+	  if (*ptest)
+	return;
+	}
+
+  prepare_float_lib_cmp (x, y, comparison, ptest, pmode);
+}
+  else
 {
   rtx result;
   machine_mode ret_mode;
@@ -3982,8 +3995,6 @@ prepare_cmp_insn (rtx x, rtx y, enum rtx
   prepare_cmp_insn (x, y, comparison, NULL_RTX, unsignedp, methods,
 			ptest, pmode);
 }
-  else
-prepare_float_lib_cmp (x, y, comparison, ptest, pmode);
 
   return;

Fix ICE with scalar_storage_order attribute at -O

2018-02-09 Thread Eric Botcazou

This is a regression present on the mainline in the form of an assertion 
failure in optimize_bitfield_assignment_op with a scalar_storage_order 
attribute specifying the reverse endianness wrt the native endianness.

I put this assertion because I failed to cover the path at the time but things 
have changed since then and the attached testcase triggers it now (and also 
checks that the generated code is correct in this case).

Tested on x86_64-suse-linux, applied on the mainline.


2018-02-09  Eric Botcazou  

* expr.c (optimize_bitfield_assignment_op): Remove obsolete assertion.


2018-02-09  Eric Botcazou  

* gnat.dg/sso8.adb: New test.
* gnat.dg/sso8_pkg.ads: New helper.

-- 
Eric BotcazouIndex: expr.c
===
--- expr.c	(revision 257404)
+++ expr.c	(working copy)
@@ -4726,8 +4726,6 @@ optimize_bitfield_assignment_op (poly_ui
 }
   else if (!REG_P (str_rtx) && GET_CODE (str_rtx) != SUBREG)
 return false;
-  else
-gcc_assert (!reverse);
 
   /* If the bit field covers the whole REG/MEM, store_field
  will likely generate better code.  */
with Interfaces;
with System;
with Unchecked_Conversion;

package SSO8_Pkg is

  Val8 : Interfaces.Unsigned_8;

  type Two_Bit_Int is range 0 .. 3;
  for Two_Bit_Int'size use 2;

  type Arr is array (1 .. 5) of Boolean;
  for Arr'scalar_storage_order use System.High_Order_First;
  pragma Pack (Arr);

  type Rec is record
Boolean_Data : Boolean;
Array_Data   : Arr;
Two_Bit_Data : Two_Bit_Int;
  end record;
  for Rec use record
Boolean_Data at 0 range 0 .. 0;
Array_Data   at 0 range 1 .. 5;
Two_Bit_Data at 0 range 6 .. 7;
  end record;
  for Rec'size use 8;
  for Rec'bit_order use System.High_Order_First;
  for Rec'scalar_storage_order use System.High_Order_First;

  function Conv is new Unchecked_Conversion (Rec, Interfaces.Unsigned_8);

end SSO8_Pkg;
-- { dg-do run }
-- { dg-options "-O" }

with Interfaces; use Interfaces;
with SSO8_Pkg; use SSO8_Pkg;

procedure SSO8 is
  Data : Rec;
begin
  Data.Array_Data (2) := True;
  Val8 := Conv (Data);
  if Val8 /= 32 then
raise Program_Error;
  end if;
end;

Re: RFA: Fix PR 68028: LTO error when compiling PowerPC binaries with single precision floating point

2018-02-09 Thread Nick Clifton

Hi Segher,


> I thought you were going to do a patch like the following, to make the
> e500 cores less special (they are not):

Sorry - my bad.  I defer to your patch then.  Whatever it takes to get
the BZ fixed and off the books... :-)

Cheers
  Nick

[PATCH] Adjust PR84278 testcase

2018-02-09 Thread Richard Biener


Got quoting of parens wrong.

Tested on x86_64-unknown-linux-gnu, applied.

Richard.

2018-02-09  Richard Biener  

PR tree-optimization/84278
* gcc.target/i386/pr84278.c: Adjust regex.

Index: gcc/testsuite/gcc.target/i386/pr84278.c
===
--- gcc/testsuite/gcc.target/i386/pr84278.c (revision 257491)
+++ gcc/testsuite/gcc.target/i386/pr84278.c (working copy)
@@ -15,4 +15,4 @@ void foo(void)
 }
 }
 
-/* { dg-final { scan-assembler-not "\(%.sp\)" } } */
+/* { dg-final { scan-assembler-not "\\\(%.sp\\\)" } } */

Re: [PATCH] Avoid BSWAP with floating point modes on rs6000 (PR target/84226)

2018-02-09 Thread Richard Biener

On Fri, 9 Feb 2018, Jakub Jelinek wrote:

> Hi!
> 
> BSWAP is documented as:
> 
> @item (bswap:@var{m} @var{x})
> Represents the value @var{x} with the order of bytes reversed, carried out
> in mode @var{m}, which must be a fixed-point machine mode.
> The mode of @var{x} must be @var{m} or @code{VOIDmode}.
> 
> The fixed-point is something used widely in rtl.texi and is very confusing
> now that we have FIXED_POINT_TYPE floats, I assume it talks about integral
> modes or, because it is also used with vector modes, about INTEGRAL_MODE_P
> modes.  My understanding is that bswap on a vector integral mode is meant to
> be bswap of each element individually.
> 
> The rs6000 backend uses bswap not just on scalar integral modes and
> vector integral modes, but also on V4SF and V2DF, but ICEs in simplify-rtx.c
> where we don't expect bswap to be used on SF/DFmode (vector bswap is handled
> as bswap of each element).
> 
> The following patch adjusts the rs6000 backend to use well defined bswaps
> on corresponding integral modes instead (and also does what we've done in
> i386 backend years ago, avoid subreg on the lhs because it breaks combine
> attempts to optimize it).
> 
> Or do we want to change documentation and simplify-rtx.c and whatever else
> in the middle-end to also make floating point bswaps well defined?

No, I think the current restriction is sound.

> How
> exactly, as bswaps of the underlying bits, i.e. for simplify-rtx.c as
> subreg of the constant to a corresponding integral mode, bswap in it and
> subreg back?  IMHO changing the rs6000 backend is easier and defining what
> exactly means a floating point bswap may be hard to understand.

Agreed.

> Bootstrapped/regtested on powerpc64{,le}-linux, on powerpc64-linux including
> -m64/-m32, ok for trunk?
> 
> 2018-02-09  Jakub Jelinek  
> 
>   PR target/84226
>   * config/rs6000/vsx.md (p9_xxbrq_v16qi): Change input operand
>   constraint from =wa to wa.  Avoid a subreg on the output operand,
>   instead use a pseudo and subreg it in a move.
>   (p9_xxbrd_): Changed to ...
>   (p9_xxbrd_v2di): ... this insn, without VSX_D iterator.
>   (p9_xxbrd_v2df): New expander.
>   (p9_xxbrw_): Changed to ...
>   (p9_xxbrw_v4si): ... this insn, without VSX_W iterator.
>   (p9_xxbrw_v4sf): New expander.
> 
>   * gcc.target/powerpc/pr84226.c: New test.
> 
> --- gcc/config/rs6000/vsx.md.jj   2018-01-22 23:57:21.299779544 +0100
> +++ gcc/config/rs6000/vsx.md  2018-02-08 17:21:13.197642776 +0100
> @@ -5311,35 +5311,60 @@ (define_insn "p9_xxbrq_v1ti"
>  
>  (define_expand "p9_xxbrq_v16qi"
>[(use (match_operand:V16QI 0 "vsx_register_operand" "=wa"))
> -   (use (match_operand:V16QI 1 "vsx_register_operand" "=wa"))]
> +   (use (match_operand:V16QI 1 "vsx_register_operand" "wa"))]
>"TARGET_P9_VECTOR"
>  {
> -  rtx op0 = gen_lowpart (V1TImode, operands[0]);
> +  rtx op0 = gen_reg_rtx (V1TImode);
>rtx op1 = gen_lowpart (V1TImode, operands[1]);
>emit_insn (gen_p9_xxbrq_v1ti (op0, op1));
> +  emit_move_insn (operands[0], gen_lowpart (V16QImode, op0));
>DONE;
>  })
>  
>  ;; Swap all bytes in each 64-bit element
> -(define_insn "p9_xxbrd_"
> -  [(set (match_operand:VSX_D 0 "vsx_register_operand" "=wa")
> - (bswap:VSX_D (match_operand:VSX_D 1 "vsx_register_operand" "wa")))]
> +(define_insn "p9_xxbrd_v2di"
> +  [(set (match_operand:V2DI 0 "vsx_register_operand" "=wa")
> + (bswap:V2DI (match_operand:V2DI 1 "vsx_register_operand" "wa")))]
>"TARGET_P9_VECTOR"
>"xxbrd %x0,%x1"
>[(set_attr "type" "vecperm")])
>  
> +(define_expand "p9_xxbrd_v2df"
> +  [(use (match_operand:V2DF 0 "vsx_register_operand" "=wa"))
> +   (use (match_operand:V2DF 1 "vsx_register_operand" "wa"))]
> +  "TARGET_P9_VECTOR"
> +{
> +  rtx op0 = gen_reg_rtx (V2DImode);
> +  rtx op1 = gen_lowpart (V2DImode, operands[1]);
> +  emit_insn (gen_p9_xxbrd_v2di (op0, op1));
> +  emit_move_insn (operands[0], gen_lowpart (V2DFmode, op0));
> +  DONE;
> +})
> +
>  ;; Swap all bytes in each 32-bit element
> -(define_insn "p9_xxbrw_"
> -  [(set (match_operand:VSX_W 0 "vsx_register_operand" "=wa")
> - (bswap:VSX_W (match_operand:VSX_W 1 "vsx_register_operand" "wa")))]
> +(define_insn "p9_xxbrw_v4si"
> +  [(set (match_operand:V4SI 0 "vsx_register_operand" "=wa")
> + (bswap:V4SI (match_operand:V4SI 1 "vsx_register_operand" "wa")))]
>"TARGET_P9_VECTOR"
>"xxbrw %x0,%x1"
>[(set_attr "type" "vecperm")])
>  
> +(define_expand "p9_xxbrw_v4sf"
> +  [(use (match_operand:V4SF 0 "vsx_register_operand" "=wa"))
> +   (use (match_operand:V4SF 1 "vsx_register_operand" "wa"))]
> +  "TARGET_P9_VECTOR"
> +{
> +  rtx op0 = gen_reg_rtx (V4SImode);
> +  rtx op1 = gen_lowpart (V4SImode, operands[1]);
> +  emit_insn (gen_p9_xxbrw_v4si (op0, op1));
> +  emit_move_insn (operands[0], gen_lowpart (V4SFmode, op0));
> +  DONE;
> +})
> +
>  ;; Swap all bytes in each element of vector
>  (define_expand "revb_"
> -  [(set (match_operand:VEC_REVB 0 "vsx_reg

Fix small regression in -fdump-ada-spec

2018-02-09 Thread Eric Botcazou

We cannot generate the 'constant' keyword for components in Ada, it's illegal.

Tested on x86_64-suse-linux, applied on the mainline.


2018-02-09  Eric Botcazou  

c-family/
* c-ada-spec.c (dump_ada_declaration): Do not generate the 'constant'
keyword for components.

-- 
Eric BotcazouIndex: c-family/c-ada-spec.c
===
--- c-family/c-ada-spec.c	(revision 257404)
+++ c-family/c-ada-spec.c	(working copy)
@@ -3131,7 +3131,7 @@ dump_ada_declaration (pretty_printer *bu
 	{
 	  pp_string (buffer, "aliased ");
 
-	  if (TREE_READONLY (t))
+	  if (TREE_READONLY (t) && TREE_CODE (t) != FIELD_DECL)
 		pp_string (buffer, "constant ");
 
 	  if (TYPE_NAME (TREE_TYPE (t)))
@@ -3147,7 +3147,7 @@ dump_ada_declaration (pretty_printer *bu
 		  || TREE_CODE (TREE_TYPE (t)) != INTEGER_TYPE))
 		pp_string (buffer, "aliased ");
 
-	  if (TREE_READONLY (t))
+	  if (TREE_READONLY (t) && TREE_CODE (t) != FIELD_DECL)
 		pp_string (buffer, "constant ");
 
 	  dump_generic_ada_node

79 matches

Mail list logo