Re: [PING] [PATCH] Fix asm X constraint (PR inline-asm/59155)

2016-06-20 Thread Bernd Edlinger
On 06/21/16 00:06, Jeff Law wrote:
> On 06/09/2016 10:45 AM, Jakub Jelinek wrote:
>> On Thu, Jun 09, 2016 at 06:43:04PM +0200, Jakub Jelinek wrote:
>>> Yes, I'm all in favor in disabling X constraint for inline asm.
>>> Especially if people actually try to print it as well, rather than
>>> make it
>>> unused.  That is a sure path to ICEs.
>>
>> Though, on the other side, even our documentation mentions
>> asm volatile ("mtfsf 255,%1" : "=X"(sum): "f"(fpenv));
>> So perhaps we need to error just in case such an argument is printed?
> Are you thinking to scan the output string for % for the appropriate
> ?  That shouldn't be too hard.  But that's not sufficient to address
> the problem Bernd is trying to tackle AFAICT.

Correct.

And furthermore, the use case with matching X input & output that Marc
wanted to use, seems to be a valid one.  Because "+X" allows more
registers than "+g" or "+r", it should create less register pressure.
And although it is probably unpredictable if a register of the
class "ALL_REGISTERS" will work for an assembler instruction, it is
still interesting to print it in an assembler comment.


Bernd.


[PATCH,rs6000] Add support for HAVE_AS_POWER9

2016-06-20 Thread Kelvin Nilsen


A "#define HAVE_AS_POWER9" or "#undef HAVE_AS_POWER9" preprocessor
directive is emitted into the $GCC_BUILD/gcc/auto-host.h file at
configuration time, depending on whether the available assembler
supports the Power9 instruction set.  This patch arranges to disable
Power9-specific compiler features if HAVE_AS_POWER9 is not defined.

The patch includes code to modify the behavior of the compiler along with
directives to adjust the treatment of certain dejagnu tests.  The
Power9-specific tests are disabled on AIX because of known incompatibilities.

This patch has been bootstrapped and regression tested on
powerpc64le-unknown-linux-gnu with both a configuration that has a
Power9 assembler and one that does not.  In both cases, there were no
regressions.  Is this ok for the trunk?  Is this patch ok for gcc-6
after a few days of burn-in on the trunk?

Thanks.
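
For readers unfamiliar with the idiom: when a configure-time macro is
absent, a feature macro that would normally test an option flag is
redefined to the constant 0, so that code depending on it folds away.
The following is only a minimal sketch of that pattern.  The names
HAVE_AS_FEATURE, TARGET_FEATURE_VECTOR and feature_flags are hypothetical
stand-ins, not the real rs6000 definitions.

```c
/* Hypothetical option mask; a real compiler would fill this in from the
   command-line option machinery.  */
unsigned int feature_flags = 0x1;

/* Normally the feature macro tests a runtime flag.  */
#define TARGET_FEATURE_VECTOR ((feature_flags & 0x1) != 0)

/* HAVE_AS_FEATURE stands in for a configure-generated macro.  It is
   deliberately left undefined here, as if the assembler lacked support,
   so the feature macro is forced to the constant 0.  */
#ifndef HAVE_AS_FEATURE
#undef  TARGET_FEATURE_VECTOR
#define TARGET_FEATURE_VECTOR 0
#endif

/* Return nonzero only if the feature may be used.  */
int
feature_vector_enabled (void)
{
  return TARGET_FEATURE_VECTOR;
}
```

Even though the flag bit is set in feature_flags, the function reports
the feature as unavailable because the assembler macro is missing.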

gcc/ChangeLog:

2016-06-20  Kelvin Nilsen  

* config/rs6000/rs6000.h: Add conditional preprocessing directives
to disable Power9-specific compiler features if HAVE_AS_POWER9 is
not defined.

gcc/testsuite/ChangeLog:

2016-06-20  Kelvin Nilsen  

* gcc.target/powerpc/darn-0.c: Add dejagnu directives to disable
test if effective-target is not powerpc_p9vector_ok, or if a -mcpu
override other than -mcpu=power9 command-line option is specified,
or if the target operating system is aix.
* gcc.target/powerpc/darn-1.c: Likewise.
* gcc.target/powerpc/darn-2.c: Likewise.
* gcc.target/powerpc/vslv-0.c: Add dejagnu directives to disable
test if effective-target is not powerpc_p9vector_ok or if the
target operating system is not defined.
* gcc.target/powerpc/vslv-1.c: Likewise.
* gcc.target/powerpc/vsrv-0.c: Likewise.
* gcc.target/powerpc/vsrv-1.c: Likewise.

Index: gcc/config/rs6000/rs6000.h
===
--- gcc/config/rs6000/rs6000.h  (revision 237530)
+++ gcc/config/rs6000/rs6000.h  (working copy)
@@ -302,6 +302,26 @@ extern const char *host_detect_local_cpu (int argc
 #define TARGET_P8_VECTOR 0
 #endif
 
+/* Define the ISA 3.0 flags as 0 if the target assembler does not support
+   Power9 instructions.  Allow -mpower9-fusion, since it does not add new
+   instructions.  Allow -misel, since it predates ISA 3.0 and does
+   not require any Power9 features.  */
+
+#ifndef HAVE_AS_POWER9
+#undef  TARGET_FLOAT128_HW
+#undef  TARGET_MODULO
+#undef  TARGET_P9_VECTOR
+#undef  TARGET_P9_MINMAX
+#undef  TARGET_P9_DFORM_SCALAR
+#undef  TARGET_P9_DFORM_VECTOR
+#define TARGET_FLOAT128_HW 0
+#define TARGET_MODULO 0
+#define TARGET_P9_VECTOR 0
+#define TARGET_P9_MINMAX 0
+#define TARGET_P9_DFORM_SCALAR 0
+#define TARGET_P9_DFORM_VECTOR 0
+#endif
+
 /* Define TARGET_LWSYNC_INSTRUCTION if the assembler knows about lwsync.  If
not, generate the lwsync code as an integer constant.  */
 #ifdef HAVE_AS_LWSYNC
Index: gcc/testsuite/gcc.target/powerpc/darn-0.c
===
--- gcc/testsuite/gcc.target/powerpc/darn-0.c   (revision 237530)
+++ gcc/testsuite/gcc.target/powerpc/darn-0.c   (working copy)
@@ -1,4 +1,7 @@
 /* { dg-do compile { target { powerpc*-*-* } } } */
+/* { dg-skip-if "do not override -mcpu" { powerpc*-*-* } { "-mcpu=*" } { "-mcpu=power9" } } */
+/* { dg-require-effective-target powerpc_p9vector_ok } */
+/* { dg-skip-if "" { powerpc*-*-aix* } } */
 /* { dg-options "-mcpu=power9" } */
 
 /* This test should succeed on both 32- and 64-bit configurations.  */
Index: gcc/testsuite/gcc.target/powerpc/darn-1.c
===
--- gcc/testsuite/gcc.target/powerpc/darn-1.c   (revision 237530)
+++ gcc/testsuite/gcc.target/powerpc/darn-1.c   (working copy)
@@ -1,6 +1,9 @@
 /* { dg-do compile { target { powerpc*-*-* } } } */
+/* { dg-skip-if "do not override -mcpu" { powerpc*-*-* } { "-mcpu=*" } { "-mcpu=power9" } } */
+/* { dg-require-effective-target powerpc_p9vector_ok } */
+/* { dg-require-effective-target lp64 } */
+/* { dg-skip-if "" { powerpc*-*-aix* } } */
 /* { dg-options "-mcpu=power9" } */
-/* { dg-require-effective-target lp64 } */
 
 #include 
 
Index: gcc/testsuite/gcc.target/powerpc/darn-2.c
===
--- gcc/testsuite/gcc.target/powerpc/darn-2.c   (revision 237530)
+++ gcc/testsuite/gcc.target/powerpc/darn-2.c   (working copy)
@@ -1,6 +1,9 @@
 /* { dg-do compile { target { powerpc*-*-* } } } */
+/* { dg-skip-if "do not override -mcpu" { powerpc*-*-* } { "-mcpu=*" } { "-mcpu=power9" } } */
+/* { dg-require-effective-target powerpc_p9vector_ok } */
+/* { dg-require-effective-target lp64 } */
+/* { dg-skip-if "" { powerpc*-*-aix* } } */
 /* { dg-options "-mcpu=power9" } */
-/* { dg-require-effective-target lp64 } */
 
 #include 
 
Index: 

Fix ICE on conditional expression between DFP and non-DFP float (PR c/71601)

2016-06-20 Thread Joseph Myers
A conditional expression between DFP and non-DFP floating-point
produces an ICE.  This patch fixes this by making
build_conditional_expr return early when c_common_type produces an
error.

Bootstrapped with no regressions on x86_64-pc-linux-gnu.  Applied to 
mainline.
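
The shape of the fix is the usual "bail out on the error sentinel"
pattern: once the common-type computation reports an error, the caller
returns the sentinel immediately instead of passing it to later steps.
A minimal sketch of that control flow, using hypothetical enum values in
place of GCC's tree nodes (none of these names come from c-typeck.c):

```c
/* Hypothetical stand-ins for type nodes; KIND_ERROR plays the role of
   the error sentinel.  */
enum kind { KIND_ERROR, KIND_BINARY_FLOAT, KIND_DECIMAL_FLOAT };

/* Return the common kind of A and B, or KIND_ERROR when decimal and
   binary floating-point operands are mixed.  */
static enum kind
common_kind (enum kind a, enum kind b)
{
  return a == b ? a : KIND_ERROR;
}

/* Return the result kind of "cond ? a : b".  */
enum kind
conditional_result_kind (enum kind a, enum kind b)
{
  enum kind result = common_kind (a, b);

  /* Early return: nothing below may inspect an erroneous result.  */
  if (result == KIND_ERROR)
    return KIND_ERROR;

  /* Later steps (warnings, conversions) would run here on a valid kind.  */
  return result;
}
```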

gcc/c:
2016-06-20  Joseph Myers  

PR c/71601
* c-typeck.c (build_conditional_expr): Return error_mark_node if
c_common_type returns error_mark_node.

gcc/testsuite:
2016-06-20  Joseph Myers  

PR c/71601
* gcc.dg/dfp/usual-arith-conv-bad-3.c: New test.

Index: gcc/c/c-typeck.c
===
--- gcc/c/c-typeck.c(revision 237589)
+++ gcc/c/c-typeck.c(working copy)
@@ -4846,6 +4846,8 @@ build_conditional_expr (location_t colon_loc, tree
   || code2 == COMPLEX_TYPE))
 {
   result_type = c_common_type (type1, type2);
+  if (result_type == error_mark_node)
+   return error_mark_node;
   do_warn_double_promotion (result_type, type1, type2,
"implicit conversion from %qT to %qT to "
"match other result of conditional",
Index: gcc/testsuite/gcc.dg/dfp/usual-arith-conv-bad-3.c
===
--- gcc/testsuite/gcc.dg/dfp/usual-arith-conv-bad-3.c   (nonexistent)
+++ gcc/testsuite/gcc.dg/dfp/usual-arith-conv-bad-3.c   (working copy)
@@ -0,0 +1,13 @@
+/* Test error for conditional expression between DFP and other
+   floating operand.  */
+/* { dg-do compile } */
+
+_Decimal32 a;
+float b;
+int i;
+
+void
+f (void)
+{
+  (void) (i ? a : b); /* { dg-error "mix operands" } */
+}

-- 
Joseph S. Myers
jos...@codesourcery.com


Re: [PATCH, vec-tails 01/10] New compiler options

2016-06-20 Thread Jeff Law

On 06/17/2016 04:41 AM, Ilya Enkovich wrote:



1. You've got 3 modes for epilogue vectorization.  Is this an artifact of
not really having good heuristics yet for which mode to apply to a
particular loop at this time?

2. Similarly for cost models.


All three modes are profitable in different situations.  Profitable mode depends
on a loop structure and target capabilities.  Ultimate goal is to have all three
modes enabled by default.  I can't state current heuristics are good enough
for all cases and targets and therefore don't enable epilogues vectorization
by default for now.  This is to be measured, analyzed and tuned in
time for GCC 7.1.




I add cost model simply to have an ability to force epilogue vectorization for
stability testing (force some mode of epilogue vectorization and check nothing
fails) and performance testing/tuning (try to find cases where we may benefit
from epilogue vectorization but don't due to bad cost model).  Also I don't
want to force epilogue vectorization for all loops for which vectorization is
forced using unlimited cost model because that may hurt performance for
simd loops.

Thanks.  That overview helps a lot.

We've done something similar to what you're doing with cost models for
testing in the scheduler and other places in the past.  The costing
models seem more geared towards us as developers rather than users.  You
might consider keeping those as local changes and not documenting them.


Understood completely on the modes.




Currently I have numbers collected on various suites for KNL machine.  Masking
mode (-ftree-vectorize-epilogues=mask) shows not bad results (dynamic
cost model,
-Ofast -flto -funroll-loops).  I don't see significant losses and there are few
significant gains.  For combine and nomask modes the result is not good enough
yet - there are several significant performance losses.  My guess is that
current threshold for combine is way too high and for nomask variant we better
choose the smallest vector size for epilogues instead of the next available
(use zmm for body and xmm for epilogue instead of zmm for body and ymm for
epilogue).

ICC shows better results in these modes which makes me believe we can tune them
as well.  Overall nomask mode shows worse results comparing to options with
masking which is quite expected for KNL.

Unfortunately some big gains demonstrated by ICC are not reproducible
using GCC because we originally can't vectorize required hot loops.  E.g. on
200.sixtrack GCC has nothing and ICC has ~40% for all three modes.

I hadn't pondered that case.  Certainly if GCC isn't vectorizing as
much, we're not going to have as many opportunities for optimizing the
vec-tails.


Given the results with ICC, we're probably best off keeping all 3 modes 
and working to get them tuned correctly.





I don't have the whole statistics for Haswell but synthetic tests show the
situation is really different from KNL.  Even for the 'perfect' iterations count
number (VF * 2 - 1) scalar version of epilogue shows the same result as a masked
one.  It means ratio of vector code performance vs. scalar code performance is
not as high as for KNL (KNL is more vector oriented and has weaker
scalar performance,
double vector size also matters here) and masking cost is higher for Haswell.
We still focus on AVX-512 targets more because of their rich masking
capabilities and wider vector.

Understood.

Jeff
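
To make the "perfect" iteration count (VF * 2 - 1) mentioned above
concrete, here is a small sketch of the arithmetic behind the scalar and
masked epilogue variants.  It only counts iterations and builds a lane
mask; the function names are made up for illustration and it assumes
0 < vf < 32.  It is not the vectorizer's cost model.

```c
/* Full-width vector iterations executed by the main loop body.  */
unsigned int
body_iters (unsigned int n, unsigned int vf)
{
  return n / vf;
}

/* Scalar epilogue: one scalar iteration per leftover element.  */
unsigned int
scalar_epilogue_iters (unsigned int n, unsigned int vf)
{
  return n % vf;
}

/* Masked epilogue: a single masked vector iteration if anything is left.  */
unsigned int
masked_epilogue_iters (unsigned int n, unsigned int vf)
{
  return (n % vf) != 0 ? 1u : 0u;
}

/* Lane mask for the masked epilogue: the low (n % vf) bits are set.  */
unsigned int
tail_mask (unsigned int n, unsigned int vf)
{
  return (1u << (n % vf)) - 1u;
}
```

With vf = 16 and n = 31, the body runs once, the scalar epilogue needs
15 iterations and the masked epilogue needs one iteration with 15 active
lanes.  Whether that one masked iteration is cheaper depends on the
target's masking cost, which is the point made about KNL versus Haswell.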


Re: [PATCH] PR52665 do not let .ident confuse assembler scan tests

2016-06-20 Thread Jeff Law

On 06/18/2016 01:31 PM, Bernhard Reutner-Fischer wrote:

A branch with a name matching scan-assembler pattern triggers
inappropriate FAIL.

E.g. branch fixups-testsuite and
- gcc.target/i386/pr65871-?.c (scan-assembler-not "test")
- gcc.target/i386/pr41442.c (scan-assembler-times "test|cmp" 2)
etc.

This is a recurring problem as can be seen by some -fno-ident additions
by commits from e.g. Michael Meissner over the years: builtins-58.c,
powerpc/pr46728-?.c

The patch below adds -fno-ident if a testcase contains one of
scan-assembler, scan-assembler-not or scan-assembler-times.

Regression tested on x86_64-unknown-linux on a fixups-testsuite branch
where it fixes several false FAILs without regressions.

gcc/testsuite/ChangeLog

2016-06-18  Bernhard Reutner-Fischer  

PR testsuite/52665
* lib/gcc-dg.exp (gcc-dg-test-1): Iterate over _required_options.
* lib/target-supports.exp (scan-assembler_required_options,
scan-assembler-not_required_options,
scan-assembler-times_required_options): Add -fno-ident.
* lib/scanasm.exp (scan-assembler-times): Fix error message.
* c-c++-common/ident-0a.c: New test.
* c-c++-common/ident-0b.c: New test.
* c-c++-common/ident-1a.c: New test.
* c-c++-common/ident-1b.c: New test.
* c-c++-common/ident-2a.c: New test.
* c-c++-common/ident-2b.c: New test.

Ok for trunk?

PS: proc force_conventional_output_for would be a bit of a misnomer after
this; not sure if it should be renamed to maybe set_required_options_for or
the like?

OK.

Changing force_conventional_output_for to set_required_options_for is
pre-approved as well.


jeff
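
The failure mode is easy to reproduce outside the testsuite: a pattern
such as "test" also matches the branch name embedded in the .ident line.
The sketch below uses a plain substring count as a simplification of the
regexp matching done by scan-assembler, and the assembler text is
invented for illustration.

```c
#include <stddef.h>
#include <string.h>

/* Count non-overlapping occurrences of PATTERN in TEXT.  This is a plain
   substring match, a simplification of scan-assembler's regexps.  */
unsigned int
count_matches (const char *text, const char *pattern)
{
  unsigned int count = 0;
  size_t len = strlen (pattern);

  if (len == 0)
    return 0;
  for (const char *p = strstr (text, pattern); p != NULL;
       p = strstr (p + len, pattern))
    count++;
  return count;
}

/* Hypothetical compiler output built from a branch named
   fixups-testsuite; the .ident string carries the branch name.  */
const char asm_with_ident[] =
  "\txorl\t%eax, %eax\n"
  "\tret\n"
  "\t.ident\t\"GCC: (fixups-testsuite) 7.0.0\"\n";

/* The same output as it would look with -fno-ident.  */
const char asm_without_ident[] =
  "\txorl\t%eax, %eax\n"
  "\tret\n";
```

A scan-assembler-not "test" check fails on the first string only because
of the .ident line, and passes on the second.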



Re: [PATCH] Give up instead of ICE on invalid stringops attributes (PR tree-optimization/71588)

2016-06-20 Thread Jeff Law

On 06/20/2016 12:35 PM, Jakub Jelinek wrote:

Hi!

If users use attributes like const or pure incorrectly on stringops
builtins, the tree-ssa-strlen.c pass can ICE, because it expects it can e.g.
replace a strcpy (which should not be const or pure) with memcpy (which also
shouldn't be const/pure) etc.
The patch just pretends the calls aren't builtins for the purpose of
tree-ssa-strlen.c pass if they have unexpected const/pure-ness.

Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk/6.2?

2016-06-20  Jakub Jelinek  

PR tree-optimization/71588
* tree-ssa-strlen.c (valid_builtin_call): New function.
(adjust_last_stmt, handle_builtin_memset, strlen_optimize_stmt): Use
it.

* gcc.dg/pr71558.c: New test.

OK.
jeff






Re: [PING] [PATCH] Fix asm X constraint (PR inline-asm/59155)

2016-06-20 Thread Jeff Law

On 06/09/2016 10:45 AM, Jakub Jelinek wrote:

On Thu, Jun 09, 2016 at 06:43:04PM +0200, Jakub Jelinek wrote:

Yes, I'm all in favor in disabling X constraint for inline asm.
Especially if people actually try to print it as well, rather than make it
unused.  That is a sure path to ICEs.


Though, on the other side, even our documentation mentions
asm volatile ("mtfsf 255,%1" : "=X"(sum): "f"(fpenv));
So perhaps we need to error just in case such an argument is printed?

Are you thinking to scan the output string for % for the appropriate
?  That shouldn't be too hard.  But that's not sufficient to address
the problem Bernd is trying to tackle AFAICT.



Jeff


Re: RFC: pass to warn on questionable uses of alloca().

2016-06-20 Thread Jeff Law

On 06/18/2016 05:55 PM, Martin Sebor wrote:



I think detecting potentially problematic uses of alloca would
be useful, especially when done in an intelligent way like in
your patch (as opposed to simply diagnosing every call to
the function regardless of the value of its argument).  At
the same time, it seems that an even more reliable solution
than pointing out potentially unsafe calls to the function
and relying on users to modify their code to use malloc for
large/unbounded allocations would be to let GCC do it for
them automatically (i.e., in response to some other option,
emit a call to malloc instead, and insert a call to free when
appropriate).

As Joseph pointed out, this may not be valid in certain contexts.
Though it is an interesting idea.




I found the "warning: unbounded use of alloca" misleading when
a call to the function was, in fact, bounded but to a limit
that's greater than alloca-max-size as in the program below:

  void f (void*);

  void g (int n)
  {
    void *p;
    if (n < 4096)
      p = __builtin_alloca (n);
    else
      p = __builtin_malloc (n);
    f (p);
  }
  t.C: In function ‘g’:
  t.C:7:7: warning: unbounded use of alloca [-Walloca]
   p = __builtin_alloca (n);

I would suggest to rephrase the diagnostic to mention the limit,
e.g.,

  warning: calling alloca with an argument in excess of '4000'
  bytes

Agreed.



As a separate enhancement, since in the (idiomatic) example
above the malloc memory is allowed to leak, issuing a distinct
warning for it would help detect this class of bugs that's
likely to be common as users replace unbounded uses of alloca
with malloc in response to the new option and forget to free
the memory.  Diagnosing freeing the alloca memory would be
a nice touch as well.

Aldy and I have been kicking around additional warnings in this space.
It's a good idea; I'll encourage him to add it.
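
For reference, a leak-free form of the alloca-or-malloc idiom has to
remember which allocator was used and free only the heap case.  The
sketch below assumes a compiler that provides __builtin_alloca (GCC and
Clang do); STACK_LIMIT and sum_scratch are names invented for this
example.

```c
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical threshold below which the stack is used; 4096 mirrors the
   bound in the example quoted above.  */
#define STACK_LIMIT 4096

/* Fill an N-byte scratch buffer with 1s and return the sum of its bytes.
   Returns -1 if heap allocation fails.  */
long
sum_scratch (size_t n)
{
  int on_heap = n >= STACK_LIMIT;
  unsigned char *p;
  long sum = 0;

  if (on_heap)
    {
      p = malloc (n);
      if (p == NULL)
        return -1;
    }
  else
    p = __builtin_alloca (n);

  memset (p, 1, n);
  for (size_t i = 0; i < n; i++)
    sum += p[i];

  /* Only heap memory may be freed; the alloca block is released
     automatically when the function returns.  */
  if (on_heap)
    free (p);
  return sum;
}
```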


I also think that VLA diagnostics would be better controlled
by a separate option, and emit a different diagnostic (one
that mentions VLA rather than alloca).

Probably wise -- I wouldn't necessarily expect everyone to understand
the relationship between the VLA language construct and how it's
implemented in GCC in terms of stack allocations.




Although again, and
for VLAs even more so than for alloca, providing an option
to have GCC use dynamic allocation, would be an even more
robust solution than issuing warnings.  IIRC, this was the
early implementation of VLAs in GCC so there is a precedent
for it.  (Though this seems complementary to the warnings.)
In addition, I'm of the opinion that potentially unbounded
VLA allocation should be checked at runtime and made to trap on
size overflow in C and throw an exception in C++ (e.g., for
int a [A][B] when A * B * sizeof (int) exceeds SIZE_MAX / 2
or some runtime-configurable limit).  My C++ patch for bug
69517 does just that (it needs to be resubmitted with the
runtime configuration limit added).

Right.

Jeff
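
The runtime check discussed above amounts to testing whether
A * B * sizeof (element) exceeds a limit without letting the
multiplication itself overflow.  A minimal sketch of such a test follows.
The function name and the choice to return a status rather than trap are
assumptions made for illustration, and this is not the code from the
patch for bug 69517.

```c
#include <stddef.h>

/* Return 1 if A * B * ELEM_SIZE is at most LIMIT, 0 otherwise.  The
   check uses division so that no intermediate product can overflow.  */
int
vla_size_ok (size_t a, size_t b, size_t elem_size, size_t limit)
{
  size_t max_elems;

  /* A zero-sized object always fits.  */
  if (a == 0 || b == 0 || elem_size == 0)
    return 1;

  /* Largest element count whose total size stays within LIMIT.  */
  max_elems = limit / elem_size;

  /* a * b <= max_elems  is equivalent to  b <= max_elems / a.  */
  if (b > max_elems / a)
    return 0;
  return 1;
}
```

A compiler-inserted check would trap (C) or throw (C++) where this
sketch returns 0; a caller could pass SIZE_MAX / 2 from <stdint.h> as
the limit to match the bound suggested in the quoted text.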


Re: RFC: pass to warn on questionable uses of alloca().

2016-06-20 Thread Jeff Law

On 06/20/2016 08:56 AM, Joseph Myers wrote:

On Sat, 18 Jun 2016, Martin Sebor wrote:


the function regardless of the value of its argument).  At
the same time, it seems that an even more reliable solution
than pointing out potentially unsafe calls to the function
and relying on users to modify their code to use malloc for
large/unbounded allocations would be to let GCC do it for
them automatically (i.e., in response to some other option,
emit a call to malloc instead, and insert a call to free when
appropriate).


Note that such an option would not be usable for the original motivating
case of glibc, because in code that's meant to be async-signal-safe,
alloca and VLAs can be used, but malloc cannot.

I've actually considered the other direction more viable.  Define the
right set of constraints and let the compiler optimize from malloc/free
to alloca.


For the uses of alloca in glibc that have to be async-signal-safe, we 
should just leave those alone no matter what we may or may not be able 
to prove.


jeff



Re: [PATCH] Don't run -fself-test with -E (PR rtl-optimization/71591)

2016-06-20 Thread Jeff Law

On 06/20/2016 12:38 PM, Jakub Jelinek wrote:

Hi!

As mentioned in the PR, with -E the front ends (C family and Fortran;
others don't preprocess) ask the middle-end not to initialize the
backends, so running e.g. RTL tests leads to ICEs, because e.g. pc_rtx
and many other things just aren't initialized.

2016-06-20  Jakub Jelinek  

PR rtl-optimization/71591
* toplev.c (toplev::run_self_tests): If no_backend, complain and
don't run any tests.

* gcc.dg/cpp/pr71591.c: New test.

OK.
jeff



Re: [PATCH] Fix warn uninit ICE with _Complex exprs (PR middle-end/71581)

2016-06-20 Thread Jeff Law

On 06/20/2016 12:45 PM, Jakub Jelinek wrote:

Hi!

On the following testcase we ICE during warn_uninit.
Normally, has_undefined_value_p returns false for anonymous SSA_NAMEs,
so NULL expr/var aren't a problem.  If t has a COMPLEX_EXPR as def-stmt,
where the first operand is some scalar var's (D) SSA_NAME and the second
operand is 0, which results from gimplification of conversion of scalar
uninitialized var to complex type, t is anonymous SSA_NAME and expr and var
are both NULL.

This patch attempts to deal with that by trying to recognize the case and
using the other SSA_NAME's underlying var as expr/var, or by punting (in
the unlikely case this wouldn't help).

Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk/6.2?

2016-06-20  Jakub Jelinek  

PR middle-end/71581
* tree-ssa-uninit.c (warn_uninit): If EXPR and VAR are NULL,
see if T isn't anonymous SSA_NAME with COMPLEX_EXPR created
for conversion of scalar user var to complex type and use the
underlying SSA_NAME_VAR in that case.  If EXPR is still NULL,
punt.

* gcc.dg/pr71581.c: New test.

OK.
jeff



Re: [PATCH] x86-64: Load external function address via GOT slot

2016-06-20 Thread H.J. Lu
On Mon, Jun 20, 2016 at 12:46 PM, Richard Sandiford
 wrote:
> Uros Bizjak  writes:
>> On Mon, Jun 20, 2016 at 9:19 PM, H.J. Lu  wrote:
>>> On Mon, Jun 20, 2016 at 12:13 PM, Uros Bizjak  wrote:
>>>> On Mon, Jun 20, 2016 at 7:05 PM, H.J. Lu  wrote:
>>>>> Hi,
>>>>>
>>>>> This patch implements the alternate code sequence recommended in
>>>>>
>>>>> https://groups.google.com/forum/#!topic/x86-64-abi/de5_KnLHxtI
>>>>>
>>>>> to load external function address via GOT slot with
>>>>>
>>>>> movq func@GOTPCREL(%rip), %rax
>>>>>
>>>>> so that linker won't create an PLT entry for extern function
>>>>> address.
>>>>>
>>>>> Tested on x86-64.  OK for trunk?
>>>>
>>>>> +  else if (ix86_force_load_from_GOT_p (op1))
>>>>> +{
>>>>> +  /* Load the external function address via the GOT slot to
>>>>> +avoid PLT.  */
>>>>> +  op1 = gen_rtx_UNSPEC (Pmode, gen_rtvec (1, op1),
>>>>> +   (TARGET_64BIT
>>>>> +? UNSPEC_GOTPCREL
>>>>> +: UNSPEC_GOT));
>>>>> +  op1 = gen_rtx_CONST (Pmode, op1);
>>>>> +  op1 = gen_const_mem (Pmode, op1);
>>>>> +  /* This symbol must be referenced via a load from the Global
>>>>> +Offset Table.  */
>>>>> +  set_mem_alias_set (op1, ix86_GOT_alias_set ());
>>>>> +  op1 = convert_to_mode (mode, op1, 1);
>>>>> +  op1 = force_reg (mode, op1);
>>>>> +  emit_insn (gen_rtx_SET (op0, op1));
>>>>> +  /* Generate a CLOBBER so that there will be no REG_EQUAL note
>>>>> +on the last insn to prevent cse and fwprop from replacing
>>>>> +a GOT load with a constant.  */
>>>>> +  rtx tmp = gen_reg_rtx (Pmode);
>>>>> +  emit_clobber (tmp);
>>>>> +  return;
>>>>
>>>> Jeff, is this the recommended way to prevent CSE, as far as RTL
>>>> infrastructure is concerned? I didn't find any example of this
>>>> approach with other targets.
>>>>
>>>
>>> FWIW, the similar approach is used in ix86_expand_vector_move_misalign,
>>> ix86_expand_convert_uns_didf_sse and ix86_expand_vector_init_general
>>> as well as other targets:
>>>
>>> frv/frv.c:  emit_clobber (op0);
>>> frv/frv.c:  emit_clobber (op1);
>>> m32c/m32c.c:  /*  emit_clobber (gen_rtx_REG (HImode, R0L_REGNO)); */
>>> s390/s390.c:  emit_clobber (addr);
>>> s390/s390.md:  emit_clobber (reg0);
>>> s390/s390.md:  emit_clobber (reg1);
>>> s390/s390.md:  emit_clobber (reg0);
>>> s390/s390.md:  emit_clobber (reg0);
>>> s390/s390.md:  emit_clobber (reg1);
>>
>> These usages mark the whole register as being "clobbered"
>> (=undefined), before only a part of register is written, e.g.:
>>
>>   emit_clobber (int_xmm);
>>   emit_move_insn (gen_lowpart (DImode, int_xmm), input);
>>
>> They aren't used to prevent unwanted CSE.
>
> Since it's being called in the move expander, I thought the normal
> way of preventing the constant being rematerialised would be to reject
> it in the move define_insn predicates.
>
> FWIW, I agree that using a clobber for this is going to be fragile.
>

Here is an alternative to the clobber.


-- 
H.J.
--
diff --git a/gcc/config/i386/i386.md b/gcc/config/i386/i386.md
index a68983c..7df 100644
--- a/gcc/config/i386/i386.md
+++ b/gcc/config/i386/i386.md
@@ -2347,7 +2347,7 @@
 (define_insn "*movsi_internal"
   [(set (match_operand:SI 0 "nonimmediate_operand"
  "=r,m ,*y,*y,?rm,?*y,*v,*v,*v,m ,?r ,?r,?*Yi,*k  ,*rm")
- (match_operand:SI 1 "general_operand"
+ (match_operand:SI 1 "ix86_general_operand"
  "g ,re,C ,*y,*y ,rm ,C ,*v,m ,*v,*Yj,*v,r   ,*krm,*k"))]
   "!(MEM_P (operands[0]) && MEM_P (operands[1]))"
 {
@@ -2564,7 +2564,7 @@
 (define_insn "*movqi_internal"
   [(set (match_operand:QI 0 "nonimmediate_operand"
  "=q,q ,q ,r,r ,?r,m ,k,k,r ,m,k")
- (match_operand:QI 1 "general_operand"
+ (match_operand:QI 1 "ix86_general_operand"
  "q ,qn,qm,q,rn,qm,qn,r ,k,k,k,m"))]
   "!(MEM_P (operands[0]) && MEM_P (operands[1]))"
 {
diff --git a/gcc/config/i386/predicates.md b/gcc/config/i386/predicates.md
index 06a0002..a471deb 100644
--- a/gcc/config/i386/predicates.md
+++ b/gcc/config/i386/predicates.md
@@ -367,6 +367,12 @@
 }
 })

+;; Return true if OP is general operand representable on ix86
+(define_predicate "ix86_general_operand"
+  (and (match_operand 0 "general_operand")
+   (ior (not (match_code "symbol_ref"))
+        (match_test "!ix86_force_load_from_GOT_p (op)"))))
+
 ;; Return true if size of VALUE can be stored in a sign
 ;; extended immediate field.
 (define_predicate "x86_64_immediate_size_operand"
@@ -1036,6 +1042,9 @@
   struct ix86_address parts;
   int ok;

+  if (ix86_force_load_from_GOT_p (op))
+return false;
+
   if (!CONST_INT_P (op)
   && mode != VOIDmode
   && GET_MODE (op) != mode)


Re: [PATCH 6/6] loop-iv.c: make cond_list a vec

2016-06-20 Thread Richard Sandiford
tbsaunde+...@tbsaunde.org writes:
> diff --git a/gcc/loop-iv.c b/gcc/loop-iv.c
> index 57fb8c1..21c3180 100644
> --- a/gcc/loop-iv.c
> +++ b/gcc/loop-iv.c
> @@ -1860,7 +1860,6 @@ simplify_using_initial_values (struct loop *loop, enum 
> rtx_code op, rtx *expr)
>  {
>bool expression_valid;
>rtx head, tail, last_valid_expr;
> -  rtx_expr_list *cond_list;
>rtx_insn *insn;
>rtx neutral, aggr;
>regset altered, this_altered;
> @@ -1936,7 +1935,7 @@ simplify_using_initial_values (struct loop *loop, enum 
> rtx_code op, rtx *expr)
>  
>expression_valid = true;
>last_valid_expr = *expr;
> -  cond_list = NULL;
> +  auto_vec<rtx> cond_list;
>while (1)
>  {
>insn = BB_END (e->src);

How about using "auto_vec<rtx, N>" for some small N, since we expect
cond_list to be used fairly often?

Wish I knew whether there was supposed to be a space before "<"...

> @@ -1988,39 +1988,30 @@ simplify_using_initial_values (struct loop *loop, 
> enum rtx_code op, rtx *expr)
>  
> if (suitable_set_for_replacement (insn, &dest, &src))
>   {
> -   rtx_expr_list **pnote, **pnote_next;
> -
> replace_in_expr (expr, dest, src);
> if (CONSTANT_P (*expr))
>   goto out;
>  
> -   for (pnote = &cond_list; *pnote; pnote = pnote_next)
> +   unsigned int len = cond_list.length ();
> +   for (unsigned int i = len - 1; i < len; i--)
>   {
> -   rtx_expr_list *note = *pnote;
> -   rtx old_cond = XEXP (note, 0);
> +   rtx old_cond = cond_list[i];
>  
> -   pnote_next = (rtx_expr_list **) &XEXP (note, 1);
> -   replace_in_expr (&XEXP (note, 0), dest, src);
> +   replace_in_expr (&cond_list[i], dest, src);
>  
> /* We can no longer use a condition that has been simplified
>to a constant, and simplify_using_condition will abort if
>we try.  */
> -   if (CONSTANT_P (XEXP (note, 0)))
> - {
> -   *pnote = *pnote_next;
> -   pnote_next = pnote;
> -   free_EXPR_LIST_node (note);
> - }
> +   if (CONSTANT_P (cond_list[i]))
> + cond_list.ordered_remove (i);

Do we really need ordered removes here and below?  Obviously it turns
the original O(1) operation into O(n), and it wasn't obvious at first
glance that the order of the conditions was relevant.

Thanks,
Richard
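
To illustrate the cost difference being asked about, here is a small
sketch of ordered versus unordered removal on a plain int array, together
with the backwards walk from the patch (an unsigned index that wraps past
zero to end the loop).  These helpers are written for this example only;
they are not GCC's vec implementation.

```c
#include <stddef.h>

/* Remove element IX and keep the remaining elements in their original
   order.  The tail is shifted down one slot, so this is O(n).  */
void
ordered_remove (int *v, size_t *len, size_t ix)
{
  for (size_t i = ix; i + 1 < *len; i++)
    v[i] = v[i + 1];
  (*len)--;
}

/* Remove element IX by overwriting it with the last element.  This is
   O(1) but does not preserve order.  */
void
unordered_remove (int *v, size_t *len, size_t ix)
{
  v[ix] = v[*len - 1];
  (*len)--;
}

/* Drop every zero entry from V and return the new length.  Walking from
   the back means the element moved into slot I has already been
   examined, so nothing is skipped.  */
size_t
drop_zeros (int *v, size_t len)
{
  size_t n = len;

  /* I wraps to a huge value after 0, which terminates the loop.  */
  for (size_t i = len - 1; i < len; i--)
    if (v[i] == 0)
      unordered_remove (v, &n, i);
  return n;
}
```

If the order of the surviving conditions does not matter, the unordered
form keeps each removal constant-time; if it does matter, ordered_remove
is the safe choice at O(n) per removal.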


Committed, CRIS: fix target/71571, delay-slot nop in PIC MI thunk

2016-06-20 Thread Hans-Peter Nilsson
Committed to trunk.  Apparently the -fno-inline is key to
keeping the test-case small.
Thanks go to the reporter, David B. Robins.

gcc:
PR target/71571
* config/cris/cris.c (cris_asm_output_mi_thunk): Add missing "ba"
delay-slot "nop" for PIC with CRIS v32.  Also add missing leading
space for PIC with non-v32 and the common non-PIC "jump".

gcc/testsuite:
PR target/71571
* g++.dg/torture/pr71571.C: New test.

Index: gcc/config/cris/cris.c
===
--- gcc/config/cris/cris.c  (revision 235415)
+++ gcc/config/cris/cris.c  (working copy)
@@ -2772,18 +2772,18 @@ cris_asm_output_mi_thunk (FILE *stream,
{
  fprintf (stream, "\tba ");
  assemble_name (stream, name);
- fprintf (stream, "%s\n", CRIS_PLT_PCOFFSET_SUFFIX);
+ fprintf (stream, "%s\n\tnop\n", CRIS_PLT_PCOFFSET_SUFFIX);
}
   else
{
- fprintf (stream, "add.d ");
+ fprintf (stream, "\tadd.d ");
  assemble_name (stream, name);
  fprintf (stream, "%s,$pc\n", CRIS_PLT_PCOFFSET_SUFFIX);
}
 }
   else
 {
-  fprintf (stream, "jump ");
+  fprintf (stream, "\tjump ");
   assemble_name (stream, XSTR (XEXP (DECL_RTL (funcdecl), 0), 0));
   fprintf (stream, "\n");
 
--- /dev/null   Tue Oct 29 15:57:07 2002
+++ pr71571.C   Mon Jun 20 21:50:12 2016
@@ -0,0 +1,43 @@
+// { dg-do run }
+// { dg-options "-fno-inline" { target { ! fpic } } }
+// { dg-options "-fpic -fno-inline" { target fpic } }
+
+class XBase
+{
+public:
+ virtual void FuncA() = 0;
+};
+
+class Y
+{
+protected:
+ virtual void FuncB() {}
+};
+
+class X1 : public Y, public XBase
+{
+public:
+ void FuncA() {}
+};
+
+class X2 : public XBase
+{
+public:
+ X2(XBase &xb) : m_xb(xb) { }
+ void FuncA()
+ {
+  m_xb.FuncA();
+ }
+
+private:
+ XBase &m_xb;
+};
+
+
+int main()
+{
+ X1 x1;
+ X2 x2(x1);
+ XBase *pxb = &x2;
+ pxb->FuncA();
+}

brgds, H-P


Re: [PATCH 5/6] make pattern_regs a vec

2016-06-20 Thread Richard Sandiford
tbsaunde+...@tbsaunde.org writes:
> @@ -265,18 +261,16 @@ store_ops_ok (const_rtx x, int *regs_set)
>  /* Returns a list of registers mentioned in X.
> FIXME: A regset would be prettier and less expensive.  */
>
> -static rtx_expr_list *
> -extract_mentioned_regs (rtx x)
> +static void
> +extract_mentioned_regs (rtx x, vec<rtx> *mentioned_regs)
>  {
> -  rtx_expr_list *mentioned_regs = NULL;
>subrtx_var_iterator::array_type array;
>FOR_EACH_SUBRTX_VAR (iter, array, x, NONCONST)
>  {
>rtx x = *iter;
>if (REG_P (x))
> - mentioned_regs = alloc_EXPR_LIST (0, x, mentioned_regs);
> + mentioned_regs->safe_push (x);
>  }
> -  return mentioned_regs;
>  }

The function comment needs to be updated.

Thanks,
Richard



Re: [PATCH] x86-64: Load external function address via GOT slot

2016-06-20 Thread Richard Sandiford
Uros Bizjak  writes:
> On Mon, Jun 20, 2016 at 9:19 PM, H.J. Lu  wrote:
>> On Mon, Jun 20, 2016 at 12:13 PM, Uros Bizjak  wrote:
>>> On Mon, Jun 20, 2016 at 7:05 PM, H.J. Lu  wrote:
>>>> Hi,
>>>>
>>>> This patch implements the alternate code sequence recommended in
>>>>
>>>> https://groups.google.com/forum/#!topic/x86-64-abi/de5_KnLHxtI
>>>>
>>>> to load external function address via GOT slot with
>>>>
>>>> movq func@GOTPCREL(%rip), %rax
>>>>
>>>> so that linker won't create an PLT entry for extern function
>>>> address.
>>>>
>>>> Tested on x86-64.  OK for trunk?
>>>
>>>> +  else if (ix86_force_load_from_GOT_p (op1))
>>>> +{
>>>> +  /* Load the external function address via the GOT slot to
>>>> +avoid PLT.  */
>>>> +  op1 = gen_rtx_UNSPEC (Pmode, gen_rtvec (1, op1),
>>>> +   (TARGET_64BIT
>>>> +? UNSPEC_GOTPCREL
>>>> +: UNSPEC_GOT));
>>>> +  op1 = gen_rtx_CONST (Pmode, op1);
>>>> +  op1 = gen_const_mem (Pmode, op1);
>>>> +  /* This symbol must be referenced via a load from the Global
>>>> +Offset Table.  */
>>>> +  set_mem_alias_set (op1, ix86_GOT_alias_set ());
>>>> +  op1 = convert_to_mode (mode, op1, 1);
>>>> +  op1 = force_reg (mode, op1);
>>>> +  emit_insn (gen_rtx_SET (op0, op1));
>>>> +  /* Generate a CLOBBER so that there will be no REG_EQUAL note
>>>> +on the last insn to prevent cse and fwprop from replacing
>>>> +a GOT load with a constant.  */
>>>> +  rtx tmp = gen_reg_rtx (Pmode);
>>>> +  emit_clobber (tmp);
>>>> +  return;
>>>
>>> Jeff, is this the recommended way to prevent CSE, as far as RTL
>>> infrastructure is concerned? I didn't find any example of this
>>> approach with other targets.
>>>
>>
>> FWIW, the similar approach is used in ix86_expand_vector_move_misalign,
>> ix86_expand_convert_uns_didf_sse and ix86_expand_vector_init_general
>> as well as other targets:
>>
>> frv/frv.c:  emit_clobber (op0);
>> frv/frv.c:  emit_clobber (op1);
>> m32c/m32c.c:  /*  emit_clobber (gen_rtx_REG (HImode, R0L_REGNO)); */
>> s390/s390.c:  emit_clobber (addr);
>> s390/s390.md:  emit_clobber (reg0);
>> s390/s390.md:  emit_clobber (reg1);
>> s390/s390.md:  emit_clobber (reg0);
>> s390/s390.md:  emit_clobber (reg0);
>> s390/s390.md:  emit_clobber (reg1);
>
> These usages mark the whole register as being "clobbered"
> (=undefined), before only a part of register is written, e.g.:
>
>   emit_clobber (int_xmm);
>   emit_move_insn (gen_lowpart (DImode, int_xmm), input);
>
> They aren't used to prevent unwanted CSE.

Since it's being called in the move expander, I thought the normal
way of preventing the constant being rematerialised would be to reject
it in the move define_insn predicates.

FWIW, I agree that using a clobber for this is going to be fragile.

Thanks,
Richard


[PATCH, i386, AVX-512ER] vrsqrt28ps auto generation

2016-06-20 Thread Ilya Verbin
Hi!

This patch emits the vrsqrt28ps instruction in ix86_emit_swsqrtsf for the recip
case and vrcp28ps(vrsqrt28ps(a)) for !recip.
Regtested using various benchmarks on an AVX-512ER machine.  OK for trunk?
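
For readers who want the arithmetic behind the two paths spelled out, here is a minimal scalar sketch.  It is not the code from the patch: the names model_rsqrt28, model_rcp28, model_swsqrtsf and rel_err are hypothetical, and the "estimates" are plain library math rounded to float, standing in for the hardware instructions.  It only shows that the recip case needs one estimate, while the !recip case recovers sqrt(a) as 1 / (1 / sqrt(a)).

```cpp
#include <cassert>
#include <cmath>

// Hypothetical stand-in for the vrsqrt28ps estimate: 1/sqrt(a), computed in
// double and rounded to float.  Assumption: the real instruction is accurate
// enough that no refinement step is needed, as the early return in the patch
// implies.
inline float model_rsqrt28(float a) {
    assert(a > 0.0f);  // this sketch only models positive, finite inputs
    return static_cast<float>(1.0 / std::sqrt(static_cast<double>(a)));
}

// Hypothetical stand-in for the vrcp28ps estimate: 1/x.
inline float model_rcp28(float x) {
    assert(x != 0.0f);
    return static_cast<float>(1.0 / static_cast<double>(x));
}

// recip == true : res = rsqrt28(a)
// recip == false: res = rcp28(rsqrt28(a)), i.e. sqrt(a) = 1 / (1 / sqrt(a))
inline float model_swsqrtsf(float a, bool recip) {
    const float x0 = model_rsqrt28(a);
    return recip ? x0 : model_rcp28(x0);
}

// Relative error, computed the same way the new tests compare ref and exp.
inline float rel_err(float ref, float got) {
    return std::fabs((ref - got) / ref);
}
```

Calling model_swsqrtsf (a, true) models the 1.0 / sqrtf (a[i]) loops in the -3/-4 tests, and model_swsqrtsf (a, false) models the sqrtf (a[i]) loops in the -5/-6 tests.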


gcc/
* config/i386/i386.c (ix86_emit_swsqrtsf): Emit vrsqrt28ps.
* config/i386/sse.md (define_expand "rsqrtv16sf2"): New.
gcc/testsuite/
* gcc.target/i386/avx512er-vrsqrt28ps-3.c: New test.
* gcc.target/i386/avx512er-vrsqrt28ps-4.c: New test.
* gcc.target/i386/avx512er-vrsqrt28ps-5.c: New test.
* gcc.target/i386/avx512er-vrsqrt28ps-6.c: New test.


diff --git a/gcc/config/i386/i386.c b/gcc/config/i386/i386.c
index 8e0bf26..edd3d23 100644
--- a/gcc/config/i386/i386.c
+++ b/gcc/config/i386/i386.c
@@ -48722,6 +48722,24 @@ void ix86_emit_swsqrtsf (rtx res, rtx a, machine_mode 
mode, bool recip)
   e2 = gen_reg_rtx (mode);
   e3 = gen_reg_rtx (mode);
 
+  if (TARGET_AVX512ER && mode == V16SFmode)
+{
+  if (recip)
+   /* res = rsqrt28(a) estimate */
+   emit_insn (gen_rtx_SET (res, gen_rtx_UNSPEC (mode, gen_rtvec (1, a),
+UNSPEC_RSQRT28)));
+  else
+   {
+ /* x0 = rsqrt28(a) estimate */
+ emit_insn (gen_rtx_SET (x0, gen_rtx_UNSPEC (mode, gen_rtvec (1, a),
+ UNSPEC_RSQRT28)));
+ /* res = rcp28(x0) estimate */
+ emit_insn (gen_rtx_SET (res, gen_rtx_UNSPEC (mode, gen_rtvec (1, x0),
+  UNSPEC_RCP28)));
+   }
+  return;
+}
+
   real_from_integer (&r, VOIDmode, -3, SIGNED);
   mthree = const_double_from_real_value (r, SFmode);
 
diff --git a/gcc/config/i386/sse.md b/gcc/config/i386/sse.md
index 6056ddc..c1ea04f 100644
--- a/gcc/config/i386/sse.md
+++ b/gcc/config/i386/sse.md
@@ -1559,6 +1559,17 @@
   DONE;
 })
 
+(define_expand "rsqrtv16sf2"
+  [(set (match_operand:V16SF 0 "register_operand")
+   (unspec:V16SF
+ [(match_operand:V16SF 1 "vector_operand")]
+ UNSPEC_RSQRT28))]
+  "TARGET_SSE_MATH && TARGET_AVX512ER"
+{
+  ix86_emit_swsqrtsf (operands[0], operands[1], V16SFmode, true);
+  DONE;
+})
+
 (define_insn "<sse>_rsqrt<mode>2"
   [(set (match_operand:VF1_128_256 0 "register_operand" "=x")
(unspec:VF1_128_256
diff --git a/gcc/testsuite/gcc.target/i386/avx512er-vrsqrt28ps-3.c 
b/gcc/testsuite/gcc.target/i386/avx512er-vrsqrt28ps-3.c
new file mode 100644
index 000..1ba8172
--- /dev/null
+++ b/gcc/testsuite/gcc.target/i386/avx512er-vrsqrt28ps-3.c
@@ -0,0 +1,47 @@
+/* { dg-do run } */
+/* { dg-require-effective-target avx512er } */
+/* { dg-options "-O2 -ffast-math -ftree-vectorize -mavx512er" } */
+
+#include <math.h>
+#include "avx512er-check.h"
+
+#define MAX 1000
+#define EPS 0.1
+
+__attribute__ ((noinline, optimize (1)))
+void static
+compute_rsqrt_ref (float *a, float *r)
+{
+  for (int i = 0; i < MAX; i++)
+r[i] = 1.0 / sqrtf (a[i]);
+}
+
+__attribute__ ((noinline))
+void static
+compute_rsqrt_exp (float *a, float *r)
+{
+  for (int i = 0; i < MAX; i++)
+r[i] = 1.0 / sqrtf (a[i]);
+}
+
+void static
+avx512er_test (void)
+{
+  float in[MAX];
+  float ref[MAX];
+  float exp[MAX];
+
+  for (int i = 0; i < MAX; i++)
+in[i] = 8765.987 - 8.6756 * i;
+
+  compute_rsqrt_ref (in, ref);
+  compute_rsqrt_exp (in, exp);
+
+  for (int i = 0; i < MAX; i++)
+{
+  float rel_err = (ref[i] - exp[i]) / ref[i];
+  rel_err = rel_err > 0.0 ? rel_err : -rel_err;
+  if (rel_err > EPS)
+   abort ();
+}
+}
diff --git a/gcc/testsuite/gcc.target/i386/avx512er-vrsqrt28ps-4.c 
b/gcc/testsuite/gcc.target/i386/avx512er-vrsqrt28ps-4.c
new file mode 100644
index 000..2f5f73f
--- /dev/null
+++ b/gcc/testsuite/gcc.target/i386/avx512er-vrsqrt28ps-4.c
@@ -0,0 +1,7 @@
+/* { dg-do compile } */
+/* { dg-options "-O2 -ffast-math -ftree-vectorize -mavx512er" } */
+
+#include "avx512er-vrsqrt28ps-3.c"
+
+/* { dg-final { scan-assembler-times "vrsqrt28ps\[^\n\r\]*zmm\[0-9\]+(?:\n|\[ 
\\t\]+#)" 1 } } */
+/* { dg-final { scan-assembler-not "vrcp28ps\[^\n\r\]*zmm\[0-9\]+(?:\n|\[ 
\\t\]+#)" } } */
diff --git a/gcc/testsuite/gcc.target/i386/avx512er-vrsqrt28ps-5.c 
b/gcc/testsuite/gcc.target/i386/avx512er-vrsqrt28ps-5.c
new file mode 100644
index 000..e067a81
--- /dev/null
+++ b/gcc/testsuite/gcc.target/i386/avx512er-vrsqrt28ps-5.c
@@ -0,0 +1,47 @@
+/* { dg-do run } */
+/* { dg-require-effective-target avx512er } */
+/* { dg-options "-O2 -ffast-math -ftree-vectorize -mavx512er" } */
+
+#include <math.h>
+#include "avx512er-check.h"
+
+#define MAX 1000
+#define EPS 0.1
+
+__attribute__ ((noinline, optimize (1)))
+void static
+compute_sqrt_ref (float *a, float *r)
+{
+  for (int i = 0; i < MAX; i++)
+r[i] = sqrtf (a[i]);
+}
+
+__attribute__ ((noinline))
+void static
+compute_sqrt_exp (float *a, float *r)
+{
+  for (int i = 0; i < MAX; i++)
+r[i] = sqrtf (a[i]);
+}
+
+void static
+avx512er_test (void)
+{
+  float in[MAX];
+  float ref[MAX];

Re: [PATCH 0/7] remove targets obsoleted in gcc 6

2016-06-20 Thread Jeff Law

On 06/19/2016 11:47 PM, tbsaunde+...@tbsaunde.org wrote:

From: Trevor Saunders 

Hi,

later than I hoped, but here's the series to remove the targets obsoleted
during gcc 6.

I built and regtested the series as one patch on x86_64-linux-gnu without
regressions, ok?

Trev


Trevor Saunders (7):
  remove support for the interix target
  remove support for targeting openbsd 2 or 3
  remove knetbsd support
  remove h8300-rtems support
  remove m32-rtems support
  remove avr-rtems support
  remove mep-* support

OK for the whole series.

hpux10.* as a host & target would seem ripe for obsoleting as well. 
Though you might check with John first.


Another "special" system we might consider obsoleting would be lynxos. 
I think those have been dead a decade or longer.  I'm not even sure who 
to ask about them anymore.


Jeff


Re: [PATCH, i386, AVX-512ER] vrsqrt28ps auto generation

2016-06-20 Thread Uros Bizjak
On Mon, Jun 20, 2016 at 9:35 PM, Ilya Verbin  wrote:
> Hi!
>
> This patch emits the vrsqrt28ps instruction in ix86_emit_swsqrtsf for the
> recip case and vrcp28ps(vrsqrt28ps(a)) for !recip.
> Regtested using various benchmarks on an AVX-512ER machine.  OK for trunk?
>
>
> gcc/
> * config/i386/i386.c (ix86_emit_swsqrtsf): Emit vrsqrt28ps.
> * config/i386/sse.md (define_expand "rsqrtv16sf2"): New.
> gcc/testsuite/
> * gcc.target/i386/avx512er-vrsqrt28ps-3.c: New test.
> * gcc.target/i386/avx512er-vrsqrt28ps-4.c: New test.
> * gcc.target/i386/avx512er-vrsqrt28ps-5.c: New test.
> * gcc.target/i386/avx512er-vrsqrt28ps-6.c: New test.

OK.

Thanks,
Uros.

>
> diff --git a/gcc/config/i386/i386.c b/gcc/config/i386/i386.c
> index 8e0bf26..edd3d23 100644
> --- a/gcc/config/i386/i386.c
> +++ b/gcc/config/i386/i386.c
> @@ -48722,6 +48722,24 @@ void ix86_emit_swsqrtsf (rtx res, rtx a, 
> machine_mode mode, bool recip)
>e2 = gen_reg_rtx (mode);
>e3 = gen_reg_rtx (mode);
>
> +  if (TARGET_AVX512ER && mode == V16SFmode)
> +{
> +  if (recip)
> +   /* res = rsqrt28(a) estimate */
> +   emit_insn (gen_rtx_SET (res, gen_rtx_UNSPEC (mode, gen_rtvec (1, a),
> +UNSPEC_RSQRT28)));
> +  else
> +   {
> + /* x0 = rsqrt28(a) estimate */
> + emit_insn (gen_rtx_SET (x0, gen_rtx_UNSPEC (mode, gen_rtvec (1, a),
> + UNSPEC_RSQRT28)));
> + /* res = rcp28(x0) estimate */
> + emit_insn (gen_rtx_SET (res, gen_rtx_UNSPEC (mode, gen_rtvec (1, 
> x0),
> +  UNSPEC_RCP28)));
> +   }
> +  return;
> +}
> +
>real_from_integer (&r, VOIDmode, -3, SIGNED);
>mthree = const_double_from_real_value (r, SFmode);
>
> diff --git a/gcc/config/i386/sse.md b/gcc/config/i386/sse.md
> index 6056ddc..c1ea04f 100644
> --- a/gcc/config/i386/sse.md
> +++ b/gcc/config/i386/sse.md
> @@ -1559,6 +1559,17 @@
>DONE;
>  })
>
> +(define_expand "rsqrtv16sf2"
> +  [(set (match_operand:V16SF 0 "register_operand")
> +   (unspec:V16SF
> + [(match_operand:V16SF 1 "vector_operand")]
> + UNSPEC_RSQRT28))]
> +  "TARGET_SSE_MATH && TARGET_AVX512ER"
> +{
> +  ix86_emit_swsqrtsf (operands[0], operands[1], V16SFmode, true);
> +  DONE;
> +})
> +
>  (define_insn "<sse>_rsqrt<mode>2"
>[(set (match_operand:VF1_128_256 0 "register_operand" "=x")
> (unspec:VF1_128_256
> diff --git a/gcc/testsuite/gcc.target/i386/avx512er-vrsqrt28ps-3.c 
> b/gcc/testsuite/gcc.target/i386/avx512er-vrsqrt28ps-3.c
> new file mode 100644
> index 000..1ba8172
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/i386/avx512er-vrsqrt28ps-3.c
> @@ -0,0 +1,47 @@
> +/* { dg-do run } */
> +/* { dg-require-effective-target avx512er } */
> +/* { dg-options "-O2 -ffast-math -ftree-vectorize -mavx512er" } */
> +
> +#include <math.h>
> +#include "avx512er-check.h"
> +
> +#define MAX 1000
> +#define EPS 0.1
> +
> +__attribute__ ((noinline, optimize (1)))
> +void static
> +compute_rsqrt_ref (float *a, float *r)
> +{
> +  for (int i = 0; i < MAX; i++)
> +r[i] = 1.0 / sqrtf (a[i]);
> +}
> +
> +__attribute__ ((noinline))
> +void static
> +compute_rsqrt_exp (float *a, float *r)
> +{
> +  for (int i = 0; i < MAX; i++)
> +r[i] = 1.0 / sqrtf (a[i]);
> +}
> +
> +void static
> +avx512er_test (void)
> +{
> +  float in[MAX];
> +  float ref[MAX];
> +  float exp[MAX];
> +
> +  for (int i = 0; i < MAX; i++)
> +in[i] = 8765.987 - 8.6756 * i;
> +
> +  compute_rsqrt_ref (in, ref);
> +  compute_rsqrt_exp (in, exp);
> +
> +  for (int i = 0; i < MAX; i++)
> +{
> +  float rel_err = (ref[i] - exp[i]) / ref[i];
> +  rel_err = rel_err > 0.0 ? rel_err : -rel_err;
> +  if (rel_err > EPS)
> +   abort ();
> +}
> +}
> diff --git a/gcc/testsuite/gcc.target/i386/avx512er-vrsqrt28ps-4.c 
> b/gcc/testsuite/gcc.target/i386/avx512er-vrsqrt28ps-4.c
> new file mode 100644
> index 000..2f5f73f
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/i386/avx512er-vrsqrt28ps-4.c
> @@ -0,0 +1,7 @@
> +/* { dg-do compile } */
> +/* { dg-options "-O2 -ffast-math -ftree-vectorize -mavx512er" } */
> +
> +#include "avx512er-vrsqrt28ps-3.c"
> +
> +/* { dg-final { scan-assembler-times 
> "vrsqrt28ps\[^\n\r\]*zmm\[0-9\]+(?:\n|\[ \\t\]+#)" 1 } } */
> +/* { dg-final { scan-assembler-not "vrcp28ps\[^\n\r\]*zmm\[0-9\]+(?:\n|\[ 
> \\t\]+#)" } } */
> diff --git a/gcc/testsuite/gcc.target/i386/avx512er-vrsqrt28ps-5.c 
> b/gcc/testsuite/gcc.target/i386/avx512er-vrsqrt28ps-5.c
> new file mode 100644
> index 000..e067a81
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/i386/avx512er-vrsqrt28ps-5.c
> @@ -0,0 +1,47 @@
> +/* { dg-do run } */
> +/* { dg-require-effective-target avx512er } */
> +/* { dg-options "-O2 -ffast-math -ftree-vectorize -mavx512er" } */
> +
> +#include <math.h>
> +#include "avx512er-check.h"
> +
> +#define MAX 1000
> +#define 

Re: [PATCH] x86-64: Load external function address via GOT slot

2016-06-20 Thread Uros Bizjak
On Mon, Jun 20, 2016 at 9:19 PM, H.J. Lu  wrote:
> On Mon, Jun 20, 2016 at 12:13 PM, Uros Bizjak  wrote:
>> On Mon, Jun 20, 2016 at 7:05 PM, H.J. Lu  wrote:
>>> Hi,
>>>
>>> This patch implements the alternate code sequence recommended in
>>>
>>> https://groups.google.com/forum/#!topic/x86-64-abi/de5_KnLHxtI
>>>
>>> to load external function address via GOT slot with
>>>
>>> movq func@GOTPCREL(%rip), %rax
>>>
>>> so that the linker won't create a PLT entry for the extern function
>>> address.
>>>
>>> Tested on x86-64.  OK for trunk?
>>
>>> +  else if (ix86_force_load_from_GOT_p (op1))
>>> +{
>>> +  /* Load the external function address via the GOT slot to
>>> +avoid PLT.  */
>>> +  op1 = gen_rtx_UNSPEC (Pmode, gen_rtvec (1, op1),
>>> +   (TARGET_64BIT
>>> +? UNSPEC_GOTPCREL
>>> +: UNSPEC_GOT));
>>> +  op1 = gen_rtx_CONST (Pmode, op1);
>>> +  op1 = gen_const_mem (Pmode, op1);
>>> +  /* This symbol must be referenced via a load from the Global
>>> +Offset Table.  */
>>> +  set_mem_alias_set (op1, ix86_GOT_alias_set ());
>>> +  op1 = convert_to_mode (mode, op1, 1);
>>> +  op1 = force_reg (mode, op1);
>>> +  emit_insn (gen_rtx_SET (op0, op1));
>>> +  /* Generate a CLOBBER so that there will be no REG_EQUAL note
>>> +on the last insn to prevent cse and fwprop from replacing
>>> +a GOT load with a constant.  */
>>> +  rtx tmp = gen_reg_rtx (Pmode);
>>> +  emit_clobber (tmp);
>>> +  return;
>>
>> Jeff, is this the recommended way to prevent CSE, as far as RTL
>> infrastructure is concerned? I didn't find any example of this
>> approach with other targets.
>>
>
> FWIW, a similar approach is used in ix86_expand_vector_move_misalign,
> ix86_expand_convert_uns_didf_sse and ix86_expand_vector_init_general
> as well as other targets:
>
> frv/frv.c:  emit_clobber (op0);
> frv/frv.c:  emit_clobber (op1);
> im32c/m32c.c:  /*  emit_clobber (gen_rtx_REG (HImode, R0L_REGNO)); */
> s390/s390.c:  emit_clobber (addr);
> s390/s390.md:  emit_clobber (reg0);
> s390/s390.md:  emit_clobber (reg1);
> s390/s390.md:  emit_clobber (reg0);
> s390/s390.md:  emit_clobber (reg0);
> s390/s390.md:  emit_clobber (reg1);

These usages mark the whole register as being "clobbered"
(=undefined), before only a part of register is written, e.g.:

  emit_clobber (int_xmm);
  emit_move_insn (gen_lowpart (DImode, int_xmm), input);

They aren't used to prevent unwanted CSE.

Uros.


Re: [PATCH] config-list.mk AIX update

2016-06-20 Thread Jeff Law

On 06/20/2016 12:13 PM, David Edelsohn wrote:

This patch removes obsolete AIX 4.3, 5.1 and 5.2 configurations and
adds AIX 7.1 configuration.  GCC does not yet differentiate AIX 7.2,
so I did not include it.

Okay?

Thanks, David

* config-list.mk: Remove rs6000-ibm-aix4.3, rs6000-ibm-aix5.1,
rs6000-ibm-aix5.2.
Rename rs6000-ibm-aix6.0 as rs6000-ibm-aix6.1.  Add rs6000-ibm-aix7.1.
I'd think this falls under aix & ppc maintainership, so you can/should 
self-approve ;-)


jeff



Re: [PATCH] x86-64: Load external function address via GOT slot

2016-06-20 Thread H.J. Lu
On Mon, Jun 20, 2016 at 12:13 PM, Uros Bizjak  wrote:
> On Mon, Jun 20, 2016 at 7:05 PM, H.J. Lu  wrote:
>> Hi,
>>
>> This patch implements the alternate code sequence recommended in
>>
>> https://groups.google.com/forum/#!topic/x86-64-abi/de5_KnLHxtI
>>
>> to load external function address via GOT slot with
>>
>> movq func@GOTPCREL(%rip), %rax
>>
>> so that the linker won't create a PLT entry for the extern function
>> address.
>>
>> Tested on x86-64.  OK for trunk?
>
>> +  else if (ix86_force_load_from_GOT_p (op1))
>> +{
>> +  /* Load the external function address via the GOT slot to
>> +avoid PLT.  */
>> +  op1 = gen_rtx_UNSPEC (Pmode, gen_rtvec (1, op1),
>> +   (TARGET_64BIT
>> +? UNSPEC_GOTPCREL
>> +: UNSPEC_GOT));
>> +  op1 = gen_rtx_CONST (Pmode, op1);
>> +  op1 = gen_const_mem (Pmode, op1);
>> +  /* This symbol must be referenced via a load from the Global
>> +Offset Table.  */
>> +  set_mem_alias_set (op1, ix86_GOT_alias_set ());
>> +  op1 = convert_to_mode (mode, op1, 1);
>> +  op1 = force_reg (mode, op1);
>> +  emit_insn (gen_rtx_SET (op0, op1));
>> +  /* Generate a CLOBBER so that there will be no REG_EQUAL note
>> +on the last insn to prevent cse and fwprop from replacing
>> +a GOT load with a constant.  */
>> +  rtx tmp = gen_reg_rtx (Pmode);
>> +  emit_clobber (tmp);
>> +  return;
>
> Jeff, is this the recommended way to prevent CSE, as far as RTL
> infrastructure is concerned? I didn't find any example of this
> approach with other targets.
>

FWIW, a similar approach is used in ix86_expand_vector_move_misalign,
ix86_expand_convert_uns_didf_sse and ix86_expand_vector_init_general
as well as other targets:

frv/frv.c:  emit_clobber (op0);
frv/frv.c:  emit_clobber (op1);
im32c/m32c.c:  /*  emit_clobber (gen_rtx_REG (HImode, R0L_REGNO)); */
s390/s390.c:  emit_clobber (addr);
s390/s390.md:  emit_clobber (reg0);
s390/s390.md:  emit_clobber (reg1);
s390/s390.md:  emit_clobber (reg0);
s390/s390.md:  emit_clobber (reg0);
s390/s390.md:  emit_clobber (reg1);


-- 
H.J.


Re: [PATCH] c++/60760 - arithmetic on null pointers should not be allowed in constant expressions

2016-06-20 Thread Martin Sebor

+  if (TREE_CODE (whole) == INDIRECT_REF
+  && integer_zerop (TREE_OPERAND (whole, 0))
+  && !ctx->quiet)
+error ("dereferencing a null pointer in %qE", orig_whole);



+  if (TREE_CODE (t) == INTEGER_CST
+  && TREE_CODE (TREE_TYPE (t)) == POINTER_TYPE
+  && !integer_zerop (t))
+{
+  if (!ctx->quiet)
+error ("arithmetic involving a null pointer in %qE", t);
+}


These places should all set *non_constant_p, and the second should
return t after doing so.  OK with that change.


Some additional testing exposed a couple of bugs in the patch.
First, adding or subtracting a constant zero to or from a null
pointer is valid (it was rejected with the previous patch).
Second, the code in cxx_eval_constant_expression that tried
to detect invalid conversions didn't handle qualifiers
correctly in all cases, causing conversions that add constness
to be rejected.  Finally, while fixing these issues I decided
that dereferencing and indirecting through null pointers would
be more appropriately handled in the functions that deal with
those expressions rather than in cxx_eval_constant_expression.
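
To make the accepted and rejected cases concrete, here is a small sketch.  It is not taken from the new test file (constexpr-nullptr-2.C); the variable names and the helper all_null are hypothetical.  The lines that compile are the ones described above as valid; the commented-out lines are the kind of expression the patch is meant to diagnose, with the diagnostic text copied from the patch.

```cpp
#include <cassert>

// Valid: adding or subtracting a constant zero leaves a null pointer null.
constexpr int *null_int = nullptr;
constexpr int *plus_zero = null_int + 0;
constexpr int *minus_zero = null_int - 0;

// Valid: a conversion that only adds constness.
constexpr const int *as_const = null_int;

static_assert (plus_zero == nullptr, "null + 0 stays null");
static_assert (minus_zero == nullptr, "null - 0 stays null");
static_assert (as_const == nullptr, "adding const keeps the value");

// Expected to be rejected in a constant expression, so left commented out:
//   constexpr int *bad = null_int + 1;  // arithmetic involving a null pointer
//   constexpr int &ref = *null_int;     // dereferencing a null pointer

// Hypothetical runtime helper: the same facts, checked outside constexpr.
inline bool all_null () {
    assert (plus_zero == minus_zero);
    return plus_zero == nullptr && as_const == nullptr;
}
```

The static_asserts make the compiler itself the test: if a front end wrongly rejects null + 0 or the const-adding conversion, the translation unit fails to compile.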

Since the updates aren't completely trivial, I'm posting the new
version of the patch for another review before committing it.

Thanks
Martin

PR c++/60760 - arithmetic on null pointers should not be allowed in constant
  expressions
PR c++/71091 - constexpr reference bound to a null pointer dereference
   accepted

gcc/cp/ChangeLog:
2016-06-20  Martin Sebor  

	PR c++/60760
	PR c++/71091
	* constexpr.c (cxx_eval_binary_expression): Reject invalid expressions
	involving null pointers.
	(cxx_eval_component_reference): Reject null pointer dereferences.
	(cxx_eval_indirect_ref): Reject indirecting through null pointers.
	(cxx_eval_constant_expression): Reject invalid expressions involving
	null pointers.

gcc/testsuite/ChangeLog:
2016-06-20  Martin Sebor  

	PR c++/60760
	PR c++/71091
	* g++.dg/cpp0x/constexpr-nullptr-2.C: New test.
	* g++.dg/cpp1y/constexpr-sfinae.C: Correct.
	* g++.dg/ubsan/pr63956.C: Correct.

Index: gcc/cp/constexpr.c
===
--- gcc/cp/constexpr.c	(revision 237582)
+++ gcc/cp/constexpr.c	(working copy)
@@ -1811,6 +1811,14 @@ cxx_eval_binary_expression (const constexpr_ctx *c
 		   || null_member_pointer_value_p (rhs)))
 	r = constant_boolean_node (!is_code_eq, type);
 }
+  if (code == POINTER_PLUS_EXPR && !*non_constant_p
+  && tree_int_cst_equal (lhs, null_pointer_node)
+  && !tree_int_cst_equal (rhs, integer_zero_node))
+{
+  if (!ctx->quiet)
+error ("arithmetic involving a null pointer in %qE", lhs);
+  return t;
+}
 
   if (r == NULL_TREE)
 r = fold_binary_loc (loc, code, type, lhs, rhs);
@@ -2151,6 +2159,11 @@ cxx_eval_component_reference (const constexpr_ctx
   tree whole = cxx_eval_constant_expression (ctx, orig_whole,
 	 lval,
 	 non_constant_p, overflow_p);
+  if (TREE_CODE (whole) == INDIRECT_REF
+  && integer_zerop (TREE_OPERAND (whole, 0))
+  && !ctx->quiet)
+error ("dereferencing a null pointer in %qE", orig_whole);
+
   if (TREE_CODE (whole) == PTRMEM_CST)
 whole = cplus_expand_constant (whole);
   if (whole == orig_whole)
@@ -2911,6 +2924,14 @@ cxx_eval_indirect_ref (const constexpr_ctx *ctx, t
   if (*non_constant_p)
 	return t;
 
+  if (integer_zerop (op0))
+	{
+	  if (!ctx->quiet)
+	error ("dereferencing a null pointer");
+	  *non_constant_p = true;
+	  return t;
+	}
+
   r = cxx_fold_indirect_ref (EXPR_LOCATION (t), TREE_TYPE (t), op0,
   &empty_base);
   if (r == NULL_TREE)
@@ -3559,10 +3580,20 @@ cxx_eval_constant_expression (const constexpr_ctx
 	  if (!flag_permissive || ctx->quiet)
 	*overflow_p = true;
 	}
+
+  if (TREE_CODE (t) == INTEGER_CST
+  && TREE_CODE (TREE_TYPE (t)) == POINTER_TYPE
+  && !integer_zerop (t))
+{
+  if (!ctx->quiet)
+error ("arithmetic involving a null pointer in %qE", t);
+}
+
   return t;
 }
 
-  switch (TREE_CODE (t))
+  tree_code tcode = TREE_CODE (t);
+  switch (tcode)
 {
 case RESULT_DECL:
   if (lval)
@@ -3973,7 +4004,6 @@ cxx_eval_constant_expression (const constexpr_ctx
 case NOP_EXPR:
 case UNARY_PLUS_EXPR:
   {
-	enum tree_code tcode = TREE_CODE (t);
 	tree oldop = TREE_OPERAND (t, 0);
 
 	tree op = cxx_eval_constant_expression (ctx, oldop,
@@ -3999,15 +4029,43 @@ cxx_eval_constant_expression (const constexpr_ctx
 		return t;
 	  }
 	  }
-	if (POINTER_TYPE_P (type)
-	&& TREE_CODE (op) == INTEGER_CST
-	&& !integer_zerop (op))
-	  {
-	if (!ctx->quiet)
-	  error_at (EXPR_LOC_OR_LOC (t, input_location),
-			"reinterpret_cast from integer to pointer");
-	*non_constant_p = true;
-	return t;
+	if (POINTER_TYPE_P (type) && TREE_CODE (op) == INTEGER_CST)
+  {
+	if (integer_zerop (op))
+	  

Re: [PATCH] x86-64: Load external function address via GOT slot

2016-06-20 Thread Uros Bizjak
On Mon, Jun 20, 2016 at 7:05 PM, H.J. Lu  wrote:
> Hi,
>
> This patch implements the alternate code sequence recommended in
>
> https://groups.google.com/forum/#!topic/x86-64-abi/de5_KnLHxtI
>
> to load external function address via GOT slot with
>
> movq func@GOTPCREL(%rip), %rax
>
> so that the linker won't create a PLT entry for the extern function
> address.
>
> Tested on x86-64.  OK for trunk?

> +  else if (ix86_force_load_from_GOT_p (op1))
> +{
> +  /* Load the external function address via the GOT slot to
> +avoid PLT.  */
> +  op1 = gen_rtx_UNSPEC (Pmode, gen_rtvec (1, op1),
> +   (TARGET_64BIT
> +? UNSPEC_GOTPCREL
> +: UNSPEC_GOT));
> +  op1 = gen_rtx_CONST (Pmode, op1);
> +  op1 = gen_const_mem (Pmode, op1);
> +  /* This symbol must be referenced via a load from the Global
> +Offset Table.  */
> +  set_mem_alias_set (op1, ix86_GOT_alias_set ());
> +  op1 = convert_to_mode (mode, op1, 1);
> +  op1 = force_reg (mode, op1);
> +  emit_insn (gen_rtx_SET (op0, op1));
> +  /* Generate a CLOBBER so that there will be no REG_EQUAL note
> +on the last insn to prevent cse and fwprop from replacing
> +a GOT load with a constant.  */
> +  rtx tmp = gen_reg_rtx (Pmode);
> +  emit_clobber (tmp);
> +  return;

Jeff, is this the recommended way to prevent CSE, as far as RTL
infrastructure is concerned? I didn't find any example of this
approach with other targets.

Uros.


Re: [PATCH] Fix ix86_fp_cmp_code_to_pcmp_immediate (PR target/71559)

2016-06-20 Thread Jakub Jelinek
On Mon, Jun 20, 2016 at 09:04:26PM +0200, Uros Bizjak wrote:
> OK for mainline and release branches after a week or so without
> problems in mainline.

Ok, thanks.

> (I tried to review usage of all those bits, LGTM, but mistakes can happen...)

I really hope the testcase double-checks all that:
ix86_fp_cmp_code_to_pcmp_immediate is run with it on all the listed codes
except LTGT, and the expected results were written independently from the
tables and are also checked with SSE2 and AVX implementations.
When I limited the testcase to the functions with only
EQ/NE/GT/LE/GE/LT, it didn't ICE, but it aborted under the AVX512F
emulator.
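
As a rough illustration of what those independently written expectations encode, each comparison code can be modelled by the set of outcomes (greater, less, equal, unordered) for which it holds.  The sketch below is not part of the patch: the enum FpCmp and the function holds are hypothetical names, and the LTGT row is an assumption derived from the code's name, since the testcase does not exercise it.

```cpp
#include <cassert>
#include <cmath>

// Hypothetical names mirroring the comparison codes handled by the patch.
enum class FpCmp {
    EQ, NE, GT, LE, GE, LT,
    UNLE, UNLT, UNGE, UNGT, UNEQ, LTGT, ORDERED, UNORDERED
};

// Scalar reference semantics: classify the pair as exactly one of
// greater / less / equal / unordered, then say which classes each code accepts.
inline bool holds (FpCmp code, float a, float b) {
    const bool uo = std::isnan (a) || std::isnan (b);
    const bool gt = !uo && a > b;
    const bool lt = !uo && a < b;
    const bool eq = !uo && a == b;
    assert (gt + lt + eq + uo == 1);  // the four classes are exclusive
    switch (code) {
    case FpCmp::EQ:        return eq;
    case FpCmp::NE:        return gt || lt || uo;
    case FpCmp::GT:        return gt;
    case FpCmp::LE:        return lt || eq;
    case FpCmp::GE:        return gt || eq;
    case FpCmp::LT:        return lt;
    case FpCmp::UNLE:      return lt || eq || uo;
    case FpCmp::UNLT:      return lt || uo;
    case FpCmp::UNGE:      return gt || eq || uo;
    case FpCmp::UNGT:      return gt || uo;
    case FpCmp::UNEQ:      return eq || uo;
    case FpCmp::LTGT:      return gt || lt;  // assumption: not covered by the testcase
    case FpCmp::ORDERED:   return !uo;
    case FpCmp::UNORDERED: return uo;
    }
    return false;
}
```

Each TEST (name, GT, LT, EQ, UO) row in the testcase corresponds to one case here: for example TEST (unge, 1, 0, 1, 1) matches gt || eq || uo.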

Jakub


Re: [PATCH, i386, AVX-512ER] vrcp28ps auto generation

2016-06-20 Thread Uros Bizjak
On Mon, Jun 20, 2016 at 7:09 PM, Ilya Verbin  wrote:
> Hi!
>
> This patch emits vrcp28ps and vmulps instructions for ix86_emit_swdivsf.
> The relative error is < 2^-23, so no additional iteration is necessary.
> Regtested using various benchmarks on an AVX-512ER machine.  OK for trunk?
>
>
> gcc/
> * config/i386/i386.c (ix86_emit_swdivsf): Emit vrcp28ps.
> gcc/testsuite/
> * gcc.target/i386/avx512er-vrcp28ps-3.c: New test.
> * gcc.target/i386/avx512er-vrcp28ps-4.c: New test.

OK.

Thanks,
Uros.

>
> diff --git a/gcc/config/i386/i386.c b/gcc/config/i386/i386.c
> index 56a5b9c..8e0bf26 100644
> --- a/gcc/config/i386/i386.c
> +++ b/gcc/config/i386/i386.c
> @@ -48674,8 +48674,19 @@ void ix86_emit_swdivsf (rtx res, rtx a, rtx b, 
> machine_mode mode)
>
>/* x0 = rcp(b) estimate */
>if (mode == V16SFmode || mode == V8DFmode)
> -emit_insn (gen_rtx_SET (x0, gen_rtx_UNSPEC (mode, gen_rtvec (1, b),
> -   UNSPEC_RCP14)));
> +{
> +  if (TARGET_AVX512ER)
> +   {
> + emit_insn (gen_rtx_SET (x0, gen_rtx_UNSPEC (mode, gen_rtvec (1, b),
> + UNSPEC_RCP28)));
> + /* res = a * x0 */
> + emit_insn (gen_rtx_SET (res, gen_rtx_MULT (mode, a, x0)));
> + return;
> +   }
> +  else
> +   emit_insn (gen_rtx_SET (x0, gen_rtx_UNSPEC (mode, gen_rtvec (1, b),
> +   UNSPEC_RCP14)));
> +}
>else
>  emit_insn (gen_rtx_SET (x0, gen_rtx_UNSPEC (mode, gen_rtvec (1, b),
> UNSPEC_RCP)));
> diff --git a/gcc/testsuite/gcc.target/i386/avx512er-vrcp28ps-3.c 
> b/gcc/testsuite/gcc.target/i386/avx512er-vrcp28ps-3.c
> new file mode 100644
> index 000..e08bea4
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/i386/avx512er-vrcp28ps-3.c
> @@ -0,0 +1,50 @@
> +/* { dg-do run } */
> +/* { dg-require-effective-target avx512er } */
> +/* { dg-options "-O2 -ffast-math -ftree-vectorize -mavx512er" } */
> +
> +#include "avx512er-check.h"
> +
> +#define MAX 1000
> +#define EPS 0.1
> +
> +__attribute__ ((noinline, optimize (0)))
> +void static
> +compute_rcp_ref (float *a, float *b, float *r)
> +{
> +  for (int i = 0; i < MAX; i++)
> +r[i] = a[i] / b[i];
> +}
> +
> +__attribute__ ((noinline))
> +void static
> +compute_rcp_exp (float *a, float *b, float *r)
> +{
> +  for (int i = 0; i < MAX; i++)
> +r[i] = a[i] / b[i];
> +}
> +
> +void static
> +avx512er_test (void)
> +{
> +  float a[MAX];
> +  float b[MAX];
> +  float ref[MAX];
> +  float exp[MAX];
> +
> +  for (int i = 0; i < MAX; i++)
> +{
> +  a[i] = 179.345 - 6.5645 * i;
> +  b[i] = 8765.987 - 8.6756 * i;
> +}
> +
> +  compute_rcp_ref (a, b, ref);
> +  compute_rcp_exp (a, b, exp);
> +
> +  for (int i = 0; i < MAX; i++)
> +{
> +  float rel_err = (ref[i] - exp[i]) / ref[i];
> +  rel_err = rel_err > 0.0 ? rel_err : -rel_err;
> +  if (rel_err > EPS)
> +   abort ();
> +}
> +}
> diff --git a/gcc/testsuite/gcc.target/i386/avx512er-vrcp28ps-4.c 
> b/gcc/testsuite/gcc.target/i386/avx512er-vrcp28ps-4.c
> new file mode 100644
> index 000..2c76d96
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/i386/avx512er-vrcp28ps-4.c
> @@ -0,0 +1,6 @@
> +/* { dg-do compile } */
> +/* { dg-options "-O2 -ffast-math -ftree-vectorize -mavx512er" } */
> +
> +#include "avx512er-vrcp28ps-3.c"
> +
> +/* { dg-final { scan-assembler-times "vrcp28ps\[^\n\r\]*zmm\[0-9\]+(?:\n|\[ 
> \\t\]+#)" 1 } } */
>
>
>   -- Ilya


Re: [PATCH] Fix ix86_fp_cmp_code_to_pcmp_immediate (PR target/71559)

2016-06-20 Thread Uros Bizjak
On Mon, Jun 20, 2016 at 8:31 PM, Jakub Jelinek  wrote:
> Hi!
>
> As discussed in the PR, this function is missing a lot of comparison codes
> that can validly appear there, and gives wrong values for the others
> except for NE.
> This patch makes those values match what %D3 emits for the AVX vcmp*p{s,d},
> there is some controversy on whether UN{GT,GE,LT,LE,EQ} and/or LTGT should
> raise exceptions or not, but that should be handled later on also together
> with the scalar code (where we never raise exceptions), SSE, AVX and this.
>
> Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk/6.2?
>
> 2016-06-20  Jakub Jelinek  
>
> PR target/71559
> * config/i386/i386.c (ix86_fp_cmp_code_to_pcmp_immediate): Fix up
> returned values and add UN*/LTGT/*ORDERED cases with values matching
> D operand modifier on vcmp for AVX.
>
> * gcc.target/i386/sse2-pr71559.c: New test.
> * gcc.target/i386/avx-pr71559.c: New test.
> * gcc.target/i386/avx512f-pr71559.c: New test.

OK for mainline and release branches after a week or so without
problems in mainline.

(I tried to review usage of all those bits, LGTM, but mistakes can happen...)

Thanks,
Uros.

> --- gcc/config/i386/i386.c.jj   2016-06-20 10:36:29.489994876 +0200
> +++ gcc/config/i386/i386.c  2016-06-20 12:07:37.311006144 +0200
> @@ -23622,17 +23622,33 @@ ix86_fp_cmp_code_to_pcmp_immediate (enum
>switch (code)
>  {
>  case EQ:
> -  return 0x08;
> +  return 0x00;
>  case NE:
>return 0x04;
>  case GT:
> -  return 0x16;
> +  return 0x0e;
>  case LE:
> -  return 0x1a;
> +  return 0x02;
>  case GE:
> -  return 0x15;
> +  return 0x0d;
>  case LT:
> -  return 0x19;
> +  return 0x01;
> +case UNLE:
> +  return 0x0a;
> +case UNLT:
> +  return 0x09;
> +case UNGE:
> +  return 0x05;
> +case UNGT:
> +  return 0x06;
> +case UNEQ:
> +  return 0x18;
> +case LTGT:
> +  return 0x0c;
> +case ORDERED:
> +  return 0x07;
> +case UNORDERED:
> +  return 0x03;
>  default:
>gcc_unreachable ();
>  }
> --- gcc/testsuite/gcc.target/i386/sse2-pr71559.c.jj 2016-06-20 
> 12:10:27.621795187 +0200
> +++ gcc/testsuite/gcc.target/i386/sse2-pr71559.c2016-06-20 
> 12:14:44.821457893 +0200
> @@ -0,0 +1,73 @@
> +/* PR target/71559 */
> +/* { dg-do run { target sse2 } } */
> +/* { dg-options "-O2 -ftree-vectorize -msse2" } */
> +
> +#ifndef PR71559_TEST
> +#include "sse2-check.h"
> +#define PR71559_TEST sse2_test
> +#endif
> +
> +#define N 16
> +float a[N] = { 5.0f, -3.0f, 1.0f, __builtin_nanf (""), 9.0f, 7.0f, -3.0f, 
> -9.0f,
> +   -3.0f, -5.0f, -9.0f, __builtin_nanf (""), 0.5f, -0.5f, 0.0f, 
> 0.0f };
> +float b[N] = { -5.0f, 3.0f, 1.0f, 7.0f, 8.0f, 8.0f, -3.0f, __builtin_nanf 
> (""),
> +   -4.0f, -4.0f, -9.0f, __builtin_nanf (""), 0.0f, 0.0f, 0.0f, 
> __builtin_nanf ("") };
> +int c[N], d[N];
> +
> +#define FN(name, op) \
> +void   \
> +name (void)\
> +{  \
> +  int i;   \
> +  for (i = 0; i < N; i++)  \
> +c[i] = (op || d[i] > 37) ? 5 : 32; \
> +}
> +FN (eq, a[i] == b[i])
> +FN (ne, a[i] != b[i])
> +FN (gt, a[i] > b[i])
> +FN (ge, a[i] >= b[i])
> +FN (lt, a[i] < b[i])
> +FN (le, a[i] <= b[i])
> +FN (unle, !__builtin_isgreater (a[i], b[i]))
> +FN (unlt, !__builtin_isgreaterequal (a[i], b[i]))
> +FN (unge, !__builtin_isless (a[i], b[i]))
> +FN (ungt, !__builtin_islessequal (a[i], b[i]))
> +FN (uneq, !__builtin_islessgreater (a[i], b[i]))
> +FN (ordered, !__builtin_isunordered (a[i], b[i]))
> +FN (unordered, __builtin_isunordered (a[i], b[i]))
> +
> +#define TEST(name, GT, LT, EQ, UO) \
> +  name (); \
> +  for (i = 0; i < N; i++)  \
> +{  \
> +  int v;   \
> +  switch (i % 4)   \
> +   {   \
> +   case 0: v = GT ? 5 : 32; break; \
> +   case 1: v = LT ? 5 : 32; break; \
> +   case 2: v = EQ ? 5 : 32; break; \
> +   case 3: v = UO ? 5 : 32; break; \
> +   }   \
> +  if (c[i] != v)   \
> +   __builtin_abort (); \
> +}
> +
> +void
> +PR71559_TEST (void)
> +{
> +  int i;
> +  asm volatile ("" : : "g" (a), "g" (b), "g" (c), "g" (d) : "memory");
> +  TEST (eq, 0, 0, 1, 0)
> +  TEST (ne, 1, 1, 0, 1)
> +  TEST (gt, 1, 0, 0, 0)
> +  TEST (ge, 1, 0, 1, 0)
> +  TEST (lt, 0, 1, 0, 0)
> +  TEST (le, 0, 1, 1, 0)
> +  TEST (unle, 0, 1, 1, 1)
> +  TEST (unlt, 0, 1, 0, 1)
> +  TEST (unge, 1, 0, 1, 1)
> +  TEST (ungt, 1, 0, 0, 1)
> +  TEST (uneq, 0, 0, 1, 1)
> +  TEST (ordered, 1, 1, 1, 0)
> +  TEST (unordered, 0, 0, 0, 1)
> +}
> --- 

Re: [PATCH] PR target/71549: Convert V1TImode register to TImode in debug insn

2016-06-20 Thread H.J. Lu
On Mon, Jun 20, 2016 at 10:31 AM, Ilya Enkovich  wrote:
> On 20 Jun 09:45, H.J. Lu wrote:
>> On Mon, Jun 20, 2016 at 7:30 AM, Ilya Enkovich  
>> wrote:
>> > 2016-06-20 16:39 GMT+03:00 Uros Bizjak :
>> >> On Mon, Jun 20, 2016 at 1:55 PM, H.J. Lu  wrote:
>> >>> TImode register referenced in debug insn can be converted to V1TImode
>> >>> by scalar to vector optimization.  We need to convert a debug insn if
>> >>> it has a variable in a TImode register.
>> >>>
>> >>> Tested on x86-64.  OK for trunk?
>> >>
>> >> Ilya, does this approach look good to you? Also, does DImode STV
>> >> conversion need similar handling of debug insns?
>> >
>> > DImode conversion doesn't change register mode (i.e. never calls
>> > PUT_MODE for registers).  That keeps debug instructions valid.
>> >
>> > Overall I don't like the idea of having debug insns in candidates
>> > set and in chains.  Looks like it is possible to have a chain
>> > consisting of a debug insn only which is weird (otherwise I don't
>> > see why we may return false in timode_scalar_chain::convert_insn).
>>
>> Yes, it can happen:
>>
>> (insn 11 8 12 2 (parallel [
>> (set (reg/v:TI 91 [  ])
>> (plus:TI (reg/v:TI 92 [ a ])
>> (reg/v:TI 96 [ b ])))
>> (clobber (reg:CC 17 flags))
>> ]) y.i:5 210 {*addti3_doubleword}
>>  (expr_list:REG_UNUSED (reg:CC 17 flags)
>> (nil)))
>> (debug_insn 12 11 13 2 (var_location:TI w (reg/v:TI 91 [  ])) y.i:5 
>> -1
>>  (nil))
>>
>>
>> > What about other possible register uses?  If debug insns are added
>> > to candidates then NONDEBUG_INSN_P check for uses in
>> > timode_check_non_convertible_regs becomes invalid, right?
>>
>> Debug insn has no impact on STV decision.  We just need to convert
>> register referenced in debug insn from V1TImode to TImode in
>> timode_scalar_chain::convert_insn.
>>
>> > If we have (or want) to fix some register uses then it's probably
>> > would be better to visit register uses when we convert its mode
>> > and make required fix-ups.  It seems better to me to not involve
>> > debug insns in analysis phase.
>>
>> Here is the updated patch to add debug insn, which references the
>> TImode register which will be converted to V1TImode to queue.
>> I am testing it now.
>>
>
> You still count and dump debug insns as optimized ones.  Also we
> try to use virtual functions to cover differences in DI and TI
> optimizations and introducing additional TARGET_64BIT in common
> STV code is undesirable.
>
> Also your conversion now depends on instructions processing order.
> You will fail to process debug insn before non-debug ones. Required
> order is not guaranteed because processing depends on instruction
> UIDs only.
>
> I propose to modify transformation phase only like in the patch
> (untested) below.  I rely on your code which assumes the only
> possible usage in debug insn is VAR_LOCATION.
>
> Thanks,
> Ilya
> --
> diff --git a/gcc/config/i386/i386.c b/gcc/config/i386/i386.c
> index c5e5e12..ec955f0 100644
> --- a/gcc/config/i386/i386.c
> +++ b/gcc/config/i386/i386.c
> @@ -3139,6 +3139,7 @@ class timode_scalar_chain : public scalar_chain
>
>   private:
>void mark_dual_mode_def (df_ref def);
> +  void fix_debug_reg_uses (rtx reg);
>void convert_insn (rtx_insn *insn);
>/* We don't convert registers to difference size.  */
>void convert_registers () {}
> @@ -3790,6 +3791,34 @@ dimode_scalar_chain::convert_insn (rtx_insn *insn)
>df_insn_rescan (insn);
>  }
>
> +/* Fix uses of converted REG in debug insns.  */
> +
> +void
> +timode_scalar_chain::fix_debug_reg_uses (rtx reg)
> +{
> +  df_ref ref;
> +  for (ref = DF_REG_USE_CHAIN (REGNO (reg)); ref; ref = DF_REF_NEXT_REG (ref))
> +{
> +  rtx_insn *insn = DF_REF_INSN (ref);
> +
> +  if (DEBUG_INSN_P (insn))
> +   {
> + /* It must be a debug insn with a TImode variable in register.  */
> + rtx val = PATTERN (insn);
> + gcc_assert (GET_MODE (val) == TImode
> + && GET_CODE (val) == VAR_LOCATION);
> + rtx loc = PAT_VAR_LOCATION_LOC (val);
> + gcc_assert (REG_P (loc)
> + && GET_MODE (loc) == V1TImode);
> + /* Convert V1TImode register, which has been updated by a SET
> + insn before, to SUBREG TImode.  */
> + PAT_VAR_LOCATION_LOC (val) = gen_rtx_SUBREG (TImode, loc, 0);
> + df_insn_rescan (insn);
> + return;
> +   }
> +}
> +}
> +
>  /* Convert INSN from TImode to V1T1mode.  */
>
>  void
> @@ -3806,8 +3835,10 @@ timode_scalar_chain::convert_insn (rtx_insn *insn)
> rtx tmp = find_reg_equal_equiv_note (insn);
> if (tmp)
>   PUT_MODE (XEXP (tmp, 0), V1TImode);
> +   PUT_MODE (dst, V1TImode);
> +   fix_debug_reg_uses (dst);
>}
> -  /* FALLTHRU */
> +  break;
>  case MEM:
>PUT_MODE 

[PATCH] Fix warn uninit ICE with _Complex exprs (PR middle-end/71581)

2016-06-20 Thread Jakub Jelinek
Hi!

On the following testcase we ICE during warn_uninit.
Normally, has_undefined_value_p returns false for anonymous SSA_NAMEs,
so NULL expr/var aren't a problem.  If t has a COMPLEX_EXPR as def-stmt,
where the first operand is some scalar var's (D) SSA_NAME and the second
operand is 0, which results from gimplification of conversion of scalar
uninitialized var to complex type, t is anonymous SSA_NAME and expr and var
are both NULL.

This patch attempts to deal with that: it tries to recognize the case and
uses the other SSA_NAME's underlying var as expr/var in that case, or punts
(in the unlikely case this wouldn't help).
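
(Illustration only, not part of the patch or the testcase.)  The conversion
described above can be seen in plain C: converting a real scalar to a complex
type gives a value whose real part is the scalar and whose imaginary part is
zero, which is the shape the message describes as a COMPLEX_EXPR with a zero
second operand.  A minimal sketch, assuming the GNU __real__/__imag__
extension; to_complex and check_to_complex are made-up names, and the
argument is initialized so that the sketch itself is well defined:

```c
#include <assert.h>

/* Hypothetical helper: the implicit float -> _Complex float conversion in
   the return statement is the operation under discussion.  */
static _Complex float
to_complex (float x)
{
  return x;
}

/* Self-check: the real part is preserved, the imaginary part is zero.
   __real__ and __imag__ are GNU extensions.  */
static void
check_to_complex (void)
{
  _Complex float c = to_complex (1.5f);
  assert (__real__ c == 1.5f);
  assert (__imag__ c == 0.0f);
}
```

In the testcase the scalar is uninitialized, so the warning has to name the
scalar variable behind the real part; the anonymous complex temporary has no
name of its own.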

Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk/6.2?

2016-06-20  Jakub Jelinek  

PR middle-end/71581
* tree-ssa-uninit.c (warn_uninit): If EXPR and VAR are NULL,
see if T isn't anonymous SSA_NAME with COMPLEX_EXPR created
for conversion of scalar user var to complex type and use the
underlying SSA_NAME_VAR in that case.  If EXPR is still NULL,
punt.

* gcc.dg/pr71581.c: New test.

--- gcc/tree-ssa-uninit.c.jj 2016-05-06 15:09:09.0 +0200
+++ gcc/tree-ssa-uninit.c   2016-06-20 16:13:31.324052992 +0200
@@ -131,6 +131,29 @@ warn_uninit (enum opt_code wc, tree t, t
   if (!has_undefined_value_p (t))
 return;
 
+  /* Anonymous SSA_NAMEs shouldn't be uninitialized, but ssa_undefined_value_p
+ can return true if the def stmt of anonymous SSA_NAME is COMPLEX_EXPR
+ created for conversion from scalar to complex.  Use the underlying var of
+ the COMPLEX_EXPRs real part in that case.  See PR71581.  */
+  if (expr == NULL_TREE
+  && var == NULL_TREE
+  && SSA_NAME_VAR (t) == NULL_TREE
+  && is_gimple_assign (SSA_NAME_DEF_STMT (t))
+  && gimple_assign_rhs_code (SSA_NAME_DEF_STMT (t)) == COMPLEX_EXPR)
+{
+  tree v = gimple_assign_rhs1 (SSA_NAME_DEF_STMT (t));
+  if (TREE_CODE (v) == SSA_NAME
+ && has_undefined_value_p (v)
+ && zerop (gimple_assign_rhs2 (SSA_NAME_DEF_STMT (t))))
+   {
+ expr = SSA_NAME_VAR (v);
+ var = expr;
+   }
+}
+
+  if (expr == NULL_TREE)
+return;
+
   /* TREE_NO_WARNING either means we already warned, or the front end
  wishes to suppress the warning.  */
   if ((context
--- gcc/testsuite/gcc.dg/pr71581.c.jj   2016-06-20 16:39:18.825817407 +0200
+++ gcc/testsuite/gcc.dg/pr71581.c  2016-06-20 16:28:49.0 +0200
@@ -0,0 +1,24 @@
+/* PR middle-end/71581 */
+/* { dg-do compile } */
+/* { dg-options "-Wuninitialized" } */
+
+_Complex float
+f1 (void)
+{
+  float x;
+  return x; /* { dg-warning "is used uninitialized in this function" } */
+}
+
+_Complex double
+f2 (void)
+{
+  double x;
+  return x; /* { dg-warning "is used uninitialized in this function" } */
+}
+
+_Complex int
+f3 (void)
+{
+  int x;
+  return x; /* { dg-warning "is used uninitialized in this function" } */
+}

Jakub


[PATCH] Don't run -fself-test with -E (PR rtl-optimization/71591)

2016-06-20 Thread Jakub Jelinek
Hi!

As mentioned in the PR, with -E the front ends (C family and Fortran;
the others don't preprocess) ask the middle-end not to initialize the
backends, so running e.g. the RTL self-tests leads to ICEs because
pc_rtx and many other things just aren't initialized.
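
(Illustration only.)  The shape of the fix is an early-out guard: if the
state the tests depend on was never set up, report it and run nothing.  A
minimal stand-alone model of that pattern; run_self_tests_model and its
parameter are hypothetical names, not GCC functions:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical model of the guard.  NO_BACKEND stands for "the backend
   was never initialized" (the -E case described above).  Returns the number
   of tests run, or -1 if the run was refused.  */
static int
run_self_tests_model (bool no_backend)
{
  if (no_backend)
    {
      fprintf (stderr, "self-tests incompatible with -E\n");
      return -1;
    }
  /* Tests that rely on backend state would run here.  */
  return 1;
}
```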

2016-06-20  Jakub Jelinek  

PR rtl-optimization/71591
* toplev.c (toplev::run_self_tests): If no_backend, complain and
don't run any tests.

* gcc.dg/cpp/pr71591.c: New test.

--- gcc/toplev.c.jj 2016-06-13 20:45:11.0 +0200
+++ gcc/toplev.c 2016-06-20 14:54:07.931667136 +0200
@@ -2047,6 +2047,11 @@ toplev::start_timevars ()
 void
 toplev::run_self_tests ()
 {
+  if (no_backend)
+{
+  error_at (UNKNOWN_LOCATION, "self-tests incompatible with -E");
+  return;
+}
 #if CHECKING_P
   /* Reset some state.  */
   input_location = UNKNOWN_LOCATION;
--- gcc/testsuite/gcc.dg/cpp/pr71591.c.jj   2016-06-20 14:57:19.713187492 +0200
+++ gcc/testsuite/gcc.dg/cpp/pr71591.c  2016-06-20 14:58:46.014071662 +0200
@@ -0,0 +1,5 @@
+/* PR rtl-optimization/71591 */
+/* { dg-do preprocess } */
+/* { dg-options "-fself-test" } */
+
+/* { dg-message "self-tests incompatible with -E" "" { target *-*-* } 0 } */

Jakub


[PATCH] Give up instead of ICE on invalid stringops attributes (PR tree-optimization/71588)

2016-06-20 Thread Jakub Jelinek
Hi!

If users use attributes like const or pure incorrectly on stringops
builtins, the tree-ssa-strlen.c pass can ICE, because it expects it can e.g.
replace a strcpy (which should not be const or pure) with memcpy (which also
shouldn't be const/pure) etc.
The patch just pretends the calls aren't builtins for the purpose of
tree-ssa-strlen.c pass if they have unexpected const/pure-ness.
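
(Illustration only.)  The check in the patch below boils down to two
questions about a call statement: does it have a virtual definition (it may
write memory) and does it have a virtual use (it may read memory)?  A
simplified model of the two expectations, with made-up names (call_model,
looks_pure, looks_nonpure) that do not exist in GCC:

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model of a call statement.  */
struct call_model
{
  bool has_vdef;   /* statement may write memory */
  bool has_vuse;   /* statement may read memory */
};

/* Builtins such as strlen are expected to be pure: they read memory but
   do not write it.  */
static bool
looks_pure (struct call_model c)
{
  return !c.has_vdef && c.has_vuse;
}

/* Builtins such as strcpy are expected to be neither const nor pure:
   they both read and write memory.  */
static bool
looks_nonpure (struct call_model c)
{
  return c.has_vdef && c.has_vuse;
}
```

A strcpy that was wrongly declared pure would show up as a call with a
virtual use but no virtual definition, so looks_nonpure would reject it and
the pass would leave the call alone.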

Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk/6.2?

2016-06-20  Jakub Jelinek  

PR tree-optimization/71588
* tree-ssa-strlen.c (valid_builtin_call): New function.
(adjust_last_stmt, handle_builtin_memset, strlen_optimize_stmt): Use
it.

* gcc.dg/pr71558.c: New test.

--- gcc/tree-ssa-strlen.c.jj 2016-06-08 14:51:25.0 +0200
+++ gcc/tree-ssa-strlen.c   2016-06-20 13:30:23.576556803 +0200
@@ -860,6 +860,66 @@ find_equal_ptrs (tree ptr, int idx)
 }
 }
 
+/* Return true if STMT is a call to a builtin function with the right
+   arguments and attributes that should be considered for optimization
+   by this pass.  */
+
+static bool
+valid_builtin_call (gimple *stmt)
+{
+  if (!gimple_call_builtin_p (stmt, BUILT_IN_NORMAL))
+return false;
+
+  tree callee = gimple_call_fndecl (stmt);
+  switch (DECL_FUNCTION_CODE (callee))
+{
+case BUILT_IN_MEMCMP:
+case BUILT_IN_MEMCMP_EQ:
+case BUILT_IN_STRCHR:
+case BUILT_IN_STRCHR_CHKP:
+case BUILT_IN_STRLEN:
+case BUILT_IN_STRLEN_CHKP:
+  /* The above functions should be pure.  Punt if they aren't.  */
+  if (gimple_vdef (stmt) || gimple_vuse (stmt) == NULL_TREE)
+   return false;
+  break;
+
+case BUILT_IN_CALLOC:
+case BUILT_IN_MALLOC:
+case BUILT_IN_MEMCPY:
+case BUILT_IN_MEMCPY_CHK:
+case BUILT_IN_MEMCPY_CHKP:
+case BUILT_IN_MEMCPY_CHK_CHKP:
+case BUILT_IN_MEMPCPY:
+case BUILT_IN_MEMPCPY_CHK:
+case BUILT_IN_MEMPCPY_CHKP:
+case BUILT_IN_MEMPCPY_CHK_CHKP:
+case BUILT_IN_MEMSET:
+case BUILT_IN_STPCPY:
+case BUILT_IN_STPCPY_CHK:
+case BUILT_IN_STPCPY_CHKP:
+case BUILT_IN_STPCPY_CHK_CHKP:
+case BUILT_IN_STRCAT:
+case BUILT_IN_STRCAT_CHK:
+case BUILT_IN_STRCAT_CHKP:
+case BUILT_IN_STRCAT_CHK_CHKP:
+case BUILT_IN_STRCPY:
+case BUILT_IN_STRCPY_CHK:
+case BUILT_IN_STRCPY_CHKP:
+case BUILT_IN_STRCPY_CHK_CHKP:
+  /* The above functions should be neither const nor pure.  Punt if they
+aren't.  */
+  if (gimple_vdef (stmt) == NULL_TREE || gimple_vuse (stmt) == NULL_TREE)
+   return false;
+  break;
+
+default:
+  break;
+}
+
+  return true;
+}
+
 /* If the last .MEM setter statement before STMT is
memcpy (x, y, strlen (y) + 1), the only .MEM use of it is STMT
and STMT is known to overwrite x[strlen (x)], adjust the last memcpy to
@@ -935,7 +995,7 @@ adjust_last_stmt (strinfo *si, gimple *s
   return;
 }
 
-  if (!gimple_call_builtin_p (last.stmt, BUILT_IN_NORMAL))
+  if (!valid_builtin_call (last.stmt))
 return;
 
   callee = gimple_call_fndecl (last.stmt);
@@ -1811,7 +1871,7 @@ handle_builtin_memset (gimple_stmt_itera
   if (!stmt1 || !is_gimple_call (stmt1))
 return true;
   tree callee1 = gimple_call_fndecl (stmt1);
-  if (!gimple_call_builtin_p (stmt1, BUILT_IN_NORMAL))
+  if (!valid_builtin_call (stmt1))
 return true;
   enum built_in_function code1 = DECL_FUNCTION_CODE (callee1);
   tree size = gimple_call_arg (stmt2, 2);
@@ -2140,7 +2200,7 @@ strlen_optimize_stmt (gimple_stmt_iterat
   if (is_gimple_call (stmt))
 {
   tree callee = gimple_call_fndecl (stmt);
-  if (gimple_call_builtin_p (stmt, BUILT_IN_NORMAL))
+  if (valid_builtin_call (stmt))
switch (DECL_FUNCTION_CODE (callee))
  {
  case BUILT_IN_STRLEN:
--- gcc/testsuite/gcc.dg/pr71558.c.jj   2016-06-20 13:52:15.491591442 +0200
+++ gcc/testsuite/gcc.dg/pr71558.c  2016-06-20 13:51:59.0 +0200
@@ -0,0 +1,17 @@
+/* PR tree-optimization/71588 */
+
+/* strcpy must not be pure, but make sure we don't ICE even when
+   it is declared incorrectly.  */
+char *strcpy (char *, const char *) __attribute__ ((__pure__));
+__SIZE_TYPE__ strlen (const char *);
+void *malloc (__SIZE_TYPE__);
+
+char a[20];
+
+char *
+foo (void)
+{
+  __SIZE_TYPE__ b = strlen (a);
+  char *c = malloc (b);
+  return strcpy (c, a);
+}

Jakub


[PATCH] Fix ix86_fp_cmp_code_to_pcmp_immediate (PR target/71559)

2016-06-20 Thread Jakub Jelinek
Hi!

As discussed in the PR, this function is missing a lot of comparison codes
that can validly appear there, and gives wrong values for the others
except for NE.
This patch makes those values match what %D3 emits for the AVX vcmp*p{s,d}.
There is some controversy on whether UN{GT,GE,LT,LE,EQ} and/or LTGT should
raise exceptions or not, but that should be handled later, together with
the scalar code (where we never raise exceptions), SSE, AVX and this function.
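
(Illustration only.)  The new immediates can be sanity-checked against the
C comparison macros with a small evaluator for the vcmp predicate encoding.
This is a sketch under my reading of the AVX predicate table, in which bit 4
of the immediate only changes whether QNaN operands signal, so it is masked
off here; pcmp_eval is a made-up name:

```c
#include <assert.h>
#include <math.h>
#include <stdbool.h>

/* Truth value of vcmp predicate IMM applied to A and B.  Exception
   behaviour is ignored; only the comparison result is modelled.  */
static bool
pcmp_eval (unsigned imm, double a, double b)
{
  bool un = isunordered (a, b);
  bool lt = isless (a, b);
  bool gt = isgreater (a, b);
  bool eq = !un && !lt && !gt;

  switch (imm & 0x0f)
    {
    case 0x0: return eq;              /* EQ, ordered */
    case 0x1: return lt;              /* LT */
    case 0x2: return lt || eq;        /* LE */
    case 0x3: return un;              /* UNORD */
    case 0x4: return !eq;             /* NEQ, unordered */
    case 0x5: return !lt;             /* NLT */
    case 0x6: return !(lt || eq);     /* NLE */
    case 0x7: return !un;             /* ORD */
    case 0x8: return eq || un;        /* EQ, unordered */
    case 0x9: return un || lt;        /* NGE */
    case 0xa: return un || lt || eq;  /* NGT */
    case 0xb: return false;           /* FALSE */
    case 0xc: return lt || gt;        /* NEQ, ordered */
    case 0xd: return gt || eq;        /* GE */
    case 0xe: return gt;              /* GT */
    default:  return true;            /* TRUE */
    }
}
```

With this model, 0x0a behaves like !isgreater (UNLE), 0x18 like
!islessgreater (UNEQ) and 0x0c like islessgreater (LTGT), which is what the
test below expects from the vectorized loops.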

Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk/6.2?

2016-06-20  Jakub Jelinek  

PR target/71559
* config/i386/i386.c (ix86_fp_cmp_code_to_pcmp_immediate): Fix up
returned values and add UN*/LTGT/*ORDERED cases with values matching
D operand modifier on vcmp for AVX.

* gcc.target/i386/sse2-pr71559.c: New test.
* gcc.target/i386/avx-pr71559.c: New test.
* gcc.target/i386/avx512f-pr71559.c: New test.

--- gcc/config/i386/i386.c.jj   2016-06-20 10:36:29.489994876 +0200
+++ gcc/config/i386/i386.c  2016-06-20 12:07:37.311006144 +0200
@@ -23622,17 +23622,33 @@ ix86_fp_cmp_code_to_pcmp_immediate (enum
   switch (code)
 {
 case EQ:
-  return 0x08;
+  return 0x00;
 case NE:
   return 0x04;
 case GT:
-  return 0x16;
+  return 0x0e;
 case LE:
-  return 0x1a;
+  return 0x02;
 case GE:
-  return 0x15;
+  return 0x0d;
 case LT:
-  return 0x19;
+  return 0x01;
+case UNLE:
+  return 0x0a;
+case UNLT:
+  return 0x09;
+case UNGE:
+  return 0x05;
+case UNGT:
+  return 0x06;
+case UNEQ:
+  return 0x18;
+case LTGT:
+  return 0x0c;
+case ORDERED:
+  return 0x07;
+case UNORDERED:
+  return 0x03;
 default:
   gcc_unreachable ();
 }
--- gcc/testsuite/gcc.target/i386/sse2-pr71559.c.jj 2016-06-20 12:10:27.621795187 +0200
+++ gcc/testsuite/gcc.target/i386/sse2-pr71559.c 2016-06-20 12:14:44.821457893 +0200
@@ -0,0 +1,73 @@
+/* PR target/71559 */
+/* { dg-do run { target sse2 } } */
+/* { dg-options "-O2 -ftree-vectorize -msse2" } */
+
+#ifndef PR71559_TEST
+#include "sse2-check.h"
+#define PR71559_TEST sse2_test
+#endif
+
+#define N 16
+float a[N] = { 5.0f, -3.0f, 1.0f, __builtin_nanf (""), 9.0f, 7.0f, -3.0f, -9.0f,
+   -3.0f, -5.0f, -9.0f, __builtin_nanf (""), 0.5f, -0.5f, 0.0f, 0.0f };
+float b[N] = { -5.0f, 3.0f, 1.0f, 7.0f, 8.0f, 8.0f, -3.0f, __builtin_nanf (""),
+   -4.0f, -4.0f, -9.0f, __builtin_nanf (""), 0.0f, 0.0f, 0.0f, __builtin_nanf ("") };
+int c[N], d[N];
+
+#define FN(name, op) \
+void   \
+name (void)\
+{  \
+  int i;   \
+  for (i = 0; i < N; i++)  \
+c[i] = (op || d[i] > 37) ? 5 : 32; \
+}
+FN (eq, a[i] == b[i])
+FN (ne, a[i] != b[i])
+FN (gt, a[i] > b[i])
+FN (ge, a[i] >= b[i])
+FN (lt, a[i] < b[i])
+FN (le, a[i] <= b[i])
+FN (unle, !__builtin_isgreater (a[i], b[i]))
+FN (unlt, !__builtin_isgreaterequal (a[i], b[i]))
+FN (unge, !__builtin_isless (a[i], b[i]))
+FN (ungt, !__builtin_islessequal (a[i], b[i]))
+FN (uneq, !__builtin_islessgreater (a[i], b[i]))
+FN (ordered, !__builtin_isunordered (a[i], b[i]))
+FN (unordered, __builtin_isunordered (a[i], b[i]))
+
+#define TEST(name, GT, LT, EQ, UO) \
+  name (); \
+  for (i = 0; i < N; i++)  \
+{  \
+  int v;   \
+  switch (i % 4)   \
+   {   \
+   case 0: v = GT ? 5 : 32; break; \
+   case 1: v = LT ? 5 : 32; break; \
+   case 2: v = EQ ? 5 : 32; break; \
+   case 3: v = UO ? 5 : 32; break; \
+   }   \
+  if (c[i] != v)   \
+   __builtin_abort (); \
+}
+
+void
+PR71559_TEST (void)
+{
+  int i;
+  asm volatile ("" : : "g" (a), "g" (b), "g" (c), "g" (d) : "memory");
+  TEST (eq, 0, 0, 1, 0)
+  TEST (ne, 1, 1, 0, 1)
+  TEST (gt, 1, 0, 0, 0)
+  TEST (ge, 1, 0, 1, 0)
+  TEST (lt, 0, 1, 0, 0)
+  TEST (le, 0, 1, 1, 0)
+  TEST (unle, 0, 1, 1, 1)
+  TEST (unlt, 0, 1, 0, 1)
+  TEST (unge, 1, 0, 1, 1)
+  TEST (ungt, 1, 0, 0, 1)
+  TEST (uneq, 0, 0, 1, 1)
+  TEST (ordered, 1, 1, 1, 0)
+  TEST (unordered, 0, 0, 0, 1)
+}
--- gcc/testsuite/gcc.target/i386/avx-pr71559.c.jj  2016-06-20 12:10:44.028582301 +0200
+++ gcc/testsuite/gcc.target/i386/avx-pr71559.c 2016-06-20 12:14:32.627616114 +0200
@@ -0,0 +1,8 @@
+/* PR target/71559 */
+/* { dg-do run { target avx } } */
+/* { dg-options "-O2 -ftree-vectorize -mavx" } */
+
+#include "avx-check.h"
+#define PR71559_TEST avx_test
+
+#include "sse2-pr71559.c"
--- gcc/testsuite/gcc.target/i386/avx512f-pr71559.c.jj  2016-06-20 12:11:32.812949299 +0200
+++ gcc/testsuite/gcc.target/i386/avx512f-pr71559.c 2016-06-20 

Re: [PATCH] config-list.mk AIX update

2016-06-20 Thread Jan-Benedict Glaw
Hi David,

On Mon, 2016-06-20 14:13:33 -0400, David Edelsohn  wrote:
> This patch removes obsolete AIX 4.3, 5.1 and 5.2 configurations and
> adds AIX 7.1 configuration.  GCC does not yet differentiate AIX 7.2,
> so I did not include it.
> 
> Okay?

Of course, you're AIX maintainer after all. ;-)

  Just a nitpick: Right now, the old AIX configurations aren't marked
as --enable-obsolete targets in gcc/config.gcc, so I suggest adding
them there as well.

MfG, JBG

-- 
  Jan-Benedict Glaw  jbg...@lug-owl.de  +49-172-7608481
 Signature of:If it doesn't work, force it.
 the second  :   If it breaks, it needed replacing anyway.



[PATCH] config-list.mk AIX update

2016-06-20 Thread David Edelsohn
This patch removes obsolete AIX 4.3, 5.1 and 5.2 configurations and
adds AIX 7.1 configuration.  GCC does not yet differentiate AIX 7.2,
so I did not include it.

Okay?

Thanks, David

* config-list.mk: Remove rs6000-ibm-aix4.3, rs6000-ibm-aix5.1,
rs6000-ibm-aix5.2.
Rename rs6000-ibm-aix6.0 as rs6000-ibm-aix6.1.  Add rs6000-ibm-aix7.1.

Index: config-list.mk
===
--- config-list.mk  (revision 237600)
+++ config-list.mk  (working copy)
@@ -81,8 +81,8 @@
   powerpc-linux_paired powerpc64-linux_altivec \
   powerpc-wrs-vxworks powerpc-wrs-vxworksae powerpc-wrs-vxworksmils \
   powerpc-lynxos powerpcle-elf \
-  powerpcle-eabisim powerpcle-eabi rs6000-ibm-aix4.3 rs6000-ibm-aix5.1.0 \
-  rs6000-ibm-aix5.2.0 rs6000-ibm-aix5.3.0 rs6000-ibm-aix6.0 \
+  powerpcle-eabisim powerpcle-eabi \
+  rs6000-ibm-aix5.3.0 rs6000-ibm-aix6.1 rs6000-ibm-aix7.1 \
   rl78-elf rx-elf s390-linux-gnu s390x-linux-gnu s390x-ibm-tpf sh-elf \
   shle-linux sh-netbsdelf sh-superh-elf \
   sh-rtems sh-wrs-vxworks sparc-elf \


Re: [Patch, testsuite] Mark some more tests as UNSUPPORTED for avr

2016-06-20 Thread Mike Stump
On Jun 20, 2016, at 2:13 AM, Senthil Kumar Selvaraj  wrote:
> 
> This patch fixes some bogus failures for the avr target by requiring
> int32plus or ptr32plus support.
> 
> Ok for trunk?

Ok.

If you feel comfortable making this sort of "obvious" change, you can post
and check them in without asking for a formal review.  You can always ask
for a review, if you'd like one.

> 2016-06-20  Senthil Kumar Selvaraj  
> 
>   * c-c++-common/pr68657-1.c: Require ptr32plus support.
>   * c-c++-common/pr68657-2.c: Likewise.
>   * c-c++-common/pr68657-3.c: Likewise.
>   * gcc.dg/torture/pr69714.c: Require int32plus support.
>   * gcc.dg/torture/pr70025.c: Likewise.
>   * gcc.dg/torture/pr70083.c: Likewise.
>   * gcc.dg/torture/pr70542.c: Likewise.
>   * gcc.dg/torture/pr70935.c: Require ptr32plus support.



Re: [PATCH] PR target/71549: Convert V1TImode register to TImode in debug insn

2016-06-20 Thread Ilya Enkovich
On 20 Jun 09:45, H.J. Lu wrote:
> On Mon, Jun 20, 2016 at 7:30 AM, Ilya Enkovich  wrote:
> > 2016-06-20 16:39 GMT+03:00 Uros Bizjak :
> >> On Mon, Jun 20, 2016 at 1:55 PM, H.J. Lu  wrote:
> >>> TImode register referenced in debug insn can be converted to V1TImode
> >>> by scalar to vector optimization.  We need to convert a debug insn if
> >>> it has a variable in a TImode register.
> >>>
> >>> Tested on x86-64.  OK for trunk?
> >>
> >> Ilya, does this approach look good to you? Also, does DImode STV
> >> conversion need similar handling of debug insns?
> >
> > DImode conversion doesn't change register mode (i.e. never calls
> > PUT_MODE for registers).  That keeps debug instructions valid.
> >
> > Overall I don't like the idea of having debug insns in candidates
> > set and in chains.  Looks like it is possible to have a chain
> > consisting of a debug insn only which is weird (otherwise I don't
> > see why we may return false in timode_scalar_chain::convert_insn).
> 
> Yes, it can happen:
> 
> (insn 11 8 12 2 (parallel [
> (set (reg/v:TI 91 [  ])
> (plus:TI (reg/v:TI 92 [ a ])
> (reg/v:TI 96 [ b ])))
> (clobber (reg:CC 17 flags))
> ]) y.i:5 210 {*addti3_doubleword}
>  (expr_list:REG_UNUSED (reg:CC 17 flags)
> (nil)))
> (debug_insn 12 11 13 2 (var_location:TI w (reg/v:TI 91 [  ])) y.i:5 -1
>  (nil))
> 
> 
> > What about other possible register uses?  If debug insns are added
> > to candidates then NONDEBUG_INSN_P check for uses in
> > timode_check_non_convertible_regs becomes invalid, right?
> 
> Debug insn has no impact on STV decision.  We just need to convert
> register referenced in debug insn from V1TImode to TImode in
> timode_scalar_chain::convert_insn.
> 
> > If we have (or want) to fix some register uses then it's probably
> > would be better to visit register uses when we convert its mode
> > and make required fix-ups.  It seems better to me to not involve
> > debug insns in analysis phase.
> 
> Here is the updated patch to add debug insn, which references the
> TImode register which will be converted to V1TImode to queue.
> I am testing it now.
> 

You still count and dump debug insns as optimized ones.  Also, we
try to use virtual functions to cover the differences between the DI
and TI optimizations, so introducing an additional TARGET_64BIT check
in common STV code is undesirable.

Also, your conversion now depends on the instruction processing order:
it will fail if a debug insn is processed before the non-debug ones.
The required order is not guaranteed because processing depends on
instruction UIDs only.

I propose to modify the transformation phase only, as in the (untested)
patch below.  I rely on your code, which assumes the only possible
use in a debug insn is VAR_LOCATION.
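
(Illustration only, separate from the patch.)  The idea of doing the fix-up
at the point where the register changes mode, instead of queuing debug insns
for conversion, can be modelled with a plain linked list of uses.  All names
below (mode_model, use_model, reg_model, convert_reg_and_fix_debug_uses) are
made up for the sketch.  The sketch also visits every debug use, which is an
assumption of mine and not something settled in this thread:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

enum mode_model { TI_MODEL, V1TI_MODEL };

/* Hypothetical, simplified stand-in for one use of a register.  */
struct use_model
{
  bool is_debug;            /* the use sits in a debug insn */
  bool wrapped_in_subreg;   /* location rewritten to a TImode subreg */
  struct use_model *next;   /* next use of the same register */
};

struct reg_model
{
  enum mode_model mode;
  struct use_model *uses;
};

/* Switch REG to the vector mode, then walk its use chain and patch each
   debug use so that it still describes a TImode value.  Returns the number
   of debug uses fixed.  Because the walk happens right after the mode
   change, the result does not depend on the order in which insns are
   processed.  */
static int
convert_reg_and_fix_debug_uses (struct reg_model *reg)
{
  int fixed = 0;
  reg->mode = V1TI_MODEL;
  for (struct use_model *u = reg->uses; u != NULL; u = u->next)
    if (u->is_debug && !u->wrapped_in_subreg)
      {
        u->wrapped_in_subreg = true;
        fixed++;
      }
  return fixed;
}

/* Self-check with one ordinary use followed by one debug use.  */
static void
check_fix_debug_uses (void)
{
  struct use_model dbg = { true, false, NULL };
  struct use_model real = { false, false, &dbg };
  struct reg_model reg = { TI_MODEL, &real };
  int fixed = convert_reg_and_fix_debug_uses (&reg);
  assert (fixed == 1);
  assert (reg.mode == V1TI_MODEL);
  assert (dbg.wrapped_in_subreg);
  assert (!real.wrapped_in_subreg);
}
```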

Thanks,
Ilya
--
diff --git a/gcc/config/i386/i386.c b/gcc/config/i386/i386.c
index c5e5e12..ec955f0 100644
--- a/gcc/config/i386/i386.c
+++ b/gcc/config/i386/i386.c
@@ -3139,6 +3139,7 @@ class timode_scalar_chain : public scalar_chain
 
  private:
   void mark_dual_mode_def (df_ref def);
+  void fix_debug_reg_uses (rtx reg);
   void convert_insn (rtx_insn *insn);
   /* We don't convert registers to difference size.  */
   void convert_registers () {}
@@ -3790,6 +3791,34 @@ dimode_scalar_chain::convert_insn (rtx_insn *insn)
   df_insn_rescan (insn);
 }
 
+/* Fix uses of converted REG in debug insns.  */
+
+void
+timode_scalar_chain::fix_debug_reg_uses (rtx reg)
+{
+  df_ref ref;
+  for (ref = DF_REG_USE_CHAIN (REGNO (reg)); ref; ref = DF_REF_NEXT_REG (ref))
+{
+  rtx_insn *insn = DF_REF_INSN (ref);
+
+  if (DEBUG_INSN_P (insn))
+   {
+ /* It must be a debug insn with a TImode variable in register.  */
+ rtx val = PATTERN (insn);
+ gcc_assert (GET_MODE (val) == TImode
+ && GET_CODE (val) == VAR_LOCATION);
+ rtx loc = PAT_VAR_LOCATION_LOC (val);
+ gcc_assert (REG_P (loc)
+ && GET_MODE (loc) == V1TImode);
+ /* Convert V1TImode register, which has been updated by a SET
+ insn before, to SUBREG TImode.  */
+ PAT_VAR_LOCATION_LOC (val) = gen_rtx_SUBREG (TImode, loc, 0);
+ df_insn_rescan (insn);
+ return;
+   }
+}
+}
+
 /* Convert INSN from TImode to V1T1mode.  */
 
 void
@@ -3806,8 +3835,10 @@ timode_scalar_chain::convert_insn (rtx_insn *insn)
rtx tmp = find_reg_equal_equiv_note (insn);
if (tmp)
  PUT_MODE (XEXP (tmp, 0), V1TImode);
+   PUT_MODE (dst, V1TImode);
+   fix_debug_reg_uses (dst);
   }
-  /* FALLTHRU */
+  break;
 case MEM:
   PUT_MODE (dst, V1TImode);
   break;


Re: Update probabilities in predict.def to match reality

2016-06-20 Thread Renlin Li

Hi,

On 08/06/16 11:21, Andreas Schwab wrote:
> Jan Hubicka  writes:
>
>> Bootstrapped/regtested x86_64-linux, will commit it later today.
>
> FAIL: gcc.dg/tree-ssa/slsr-8.c scan-tree-dump-times optimized " w?* " 7

This fails for all arm and aarch64 targets as well since the commit.

Regards,
Renlin Li

> Andreas.



[PATCH, i386, AVX-512ER] vrcp28ps auto generation

2016-06-20 Thread Ilya Verbin
Hi!

This patch emits vrcp28ps and vmulps instructions for ix86_emit_swdivsf.
The relative error is < 2^-23, so no additional iteration is necessary.
Regtested using various benchmarks on an AVX-512ER machine.  OK for trunk?
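
(Illustration only.)  Why the extra iteration can be dropped: one
Newton-Raphson step x1 = x0 * (2 - b * x0) roughly squares the relative
error of a reciprocal estimate.  The sketch below models an estimate as the
exact reciprocal times (1 + err), assuming error bounds of 2^-14 and 2^-28
as the instruction names suggest; the helper names are made up and the
arithmetic is done in double only to make the effect visible:

```c
#include <assert.h>

/* Hypothetical model of a hardware reciprocal estimate: the exact
   reciprocal of B perturbed by relative error REL_ERR.  */
static double
rcp_estimate (double b, double rel_err)
{
  return (1.0 / b) * (1.0 + rel_err);
}

/* One Newton-Raphson refinement step for 1/B starting from X0.  */
static double
rcp_refine (double b, double x0)
{
  return x0 * (2.0 - b * x0);
}

/* Relative error of X as an approximation of 1/B.  */
static double
rcp_rel_error (double b, double x)
{
  double e = x * b - 1.0;
  return e < 0.0 ? -e : e;
}
```

With a 2^-14 estimate the error is above 2^-23 until one refinement step is
applied; with a 2^-28 estimate it is already below 2^-23, so the multiply by
a can follow directly.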


gcc/
* config/i386/i386.c (ix86_emit_swdivsf): Emit vrcp28ps.
gcc/testsuite/
* gcc.target/i386/avx512er-vrcp28ps-3.c: New test.
* gcc.target/i386/avx512er-vrcp28ps-4.c: New test.


diff --git a/gcc/config/i386/i386.c b/gcc/config/i386/i386.c
index 56a5b9c..8e0bf26 100644
--- a/gcc/config/i386/i386.c
+++ b/gcc/config/i386/i386.c
@@ -48674,8 +48674,19 @@ void ix86_emit_swdivsf (rtx res, rtx a, rtx b, machine_mode mode)
 
   /* x0 = rcp(b) estimate */
   if (mode == V16SFmode || mode == V8DFmode)
-emit_insn (gen_rtx_SET (x0, gen_rtx_UNSPEC (mode, gen_rtvec (1, b),
-   UNSPEC_RCP14)));
+{
+  if (TARGET_AVX512ER)
+   {
+ emit_insn (gen_rtx_SET (x0, gen_rtx_UNSPEC (mode, gen_rtvec (1, b),
+ UNSPEC_RCP28)));
+ /* res = a * x0 */
+ emit_insn (gen_rtx_SET (res, gen_rtx_MULT (mode, a, x0)));
+ return;
+   }
+  else
+   emit_insn (gen_rtx_SET (x0, gen_rtx_UNSPEC (mode, gen_rtvec (1, b),
+   UNSPEC_RCP14)));
+}
   else
 emit_insn (gen_rtx_SET (x0, gen_rtx_UNSPEC (mode, gen_rtvec (1, b),
UNSPEC_RCP)));
diff --git a/gcc/testsuite/gcc.target/i386/avx512er-vrcp28ps-3.c b/gcc/testsuite/gcc.target/i386/avx512er-vrcp28ps-3.c
new file mode 100644
index 000..e08bea4
--- /dev/null
+++ b/gcc/testsuite/gcc.target/i386/avx512er-vrcp28ps-3.c
@@ -0,0 +1,50 @@
+/* { dg-do run } */
+/* { dg-require-effective-target avx512er } */
+/* { dg-options "-O2 -ffast-math -ftree-vectorize -mavx512er" } */
+
+#include "avx512er-check.h"
+
+#define MAX 1000
+#define EPS 0.1
+
+__attribute__ ((noinline, optimize (0)))
+void static
+compute_rcp_ref (float *a, float *b, float *r)
+{
+  for (int i = 0; i < MAX; i++)
+r[i] = a[i] / b[i];
+}
+
+__attribute__ ((noinline))
+void static
+compute_rcp_exp (float *a, float *b, float *r)
+{
+  for (int i = 0; i < MAX; i++)
+r[i] = a[i] / b[i];
+}
+
+void static
+avx512er_test (void)
+{
+  float a[MAX];
+  float b[MAX];
+  float ref[MAX];
+  float exp[MAX];
+
+  for (int i = 0; i < MAX; i++)
+{
+  a[i] = 179.345 - 6.5645 * i;
+  b[i] = 8765.987 - 8.6756 * i;
+}
+
+  compute_rcp_ref (a, b, ref);
+  compute_rcp_exp (a, b, exp);
+
+  for (int i = 0; i < MAX; i++)
+{
+  float rel_err = (ref[i] - exp[i]) / ref[i];
+  rel_err = rel_err > 0.0 ? rel_err : -rel_err;
+  if (rel_err > EPS)
+   abort ();
+}
+}
diff --git a/gcc/testsuite/gcc.target/i386/avx512er-vrcp28ps-4.c b/gcc/testsuite/gcc.target/i386/avx512er-vrcp28ps-4.c
new file mode 100644
index 000..2c76d96
--- /dev/null
+++ b/gcc/testsuite/gcc.target/i386/avx512er-vrcp28ps-4.c
@@ -0,0 +1,6 @@
+/* { dg-do compile } */
+/* { dg-options "-O2 -ffast-math -ftree-vectorize -mavx512er" } */
+
+#include "avx512er-vrcp28ps-3.c"
+
+/* { dg-final { scan-assembler-times "vrcp28ps\[^\n\r\]*zmm\[0-9\]+(?:\n|\[ \\t\]+#)" 1 } } */


  -- Ilya


Re: [patch, avr] Fix PR30417: Wrap -Tdata into %{!Tdata:...}.

2016-06-20 Thread Denis Chertykov
2016-06-20 16:36 GMT+03:00 Georg-Johann Lay :
> This patch allows specifying -Tdata and -Ttext on the command line for MCUs
> where the specs file sets these options.  For -mmcu=atmega88 for example,
> the respective spec reads:
>
> *link_data_start:
> -Tdata 0x800100
>
> and the patch changes this to
>
> *link_data_start:
> %{!Tdata:-Tdata 0x800100}
>
> Same for *link_text_start and -Ttext.
>
> Ok for trunk and backport?
>
> Johann
>
>
> PR target/30417
> * config/avr/gen-avr-mmcu-specs.c (print_mcu):
> [*link_data_start]: Wrap -Tdata into %{!Tdata:...}.
> [*link_text_start]: Wrap -Ttext into %{!Ttext:...}.
>

Ok. Please apply.


[PATCH] x86-64: Load external function address via GOT slot

2016-06-20 Thread H.J. Lu
Hi,

This patch implements the alternate code sequence recommended in

https://groups.google.com/forum/#!topic/x86-64-abi/de5_KnLHxtI

to load external function address via GOT slot with

movq func@GOTPCREL(%rip), %rax

so that the linker won't create a PLT entry for the external function
address.
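
(Illustration only.)  The source construct in question is simply taking the
address of a function defined elsewhere.  A minimal sketch using puts as the
external function; address_of_puts and puts_fn are made-up names.  Whether
the compiler then emits the GOTPCREL load shown above depends on the
compiler version and on options such as -fno-plt with non-PIC 64-bit code,
which this sketch does not verify:

```c
#include <assert.h>
#include <stdio.h>

/* Pointer-to-function type matching puts.  */
typedef int (*puts_fn) (const char *);

/* Taking the address of an external function: this is the kind of source
   construct whose code generation the patch changes.  */
static puts_fn
address_of_puts (void)
{
  return puts;
}
```

Whichever sequence is used, the value returned must compare equal to the
address of puts as seen by the rest of the program.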

Tested on x86-64.  OK for trunk?


H.J.
--
gcc/

PR target/67400
* config/i386/i386-protos.h (ix86_force_load_from_GOT_p): New.
* config/i386/i386.c (ix86_force_load_from_GOT_p): New function.
(ix86_legitimate_address_p): Allow UNSPEC_GOTPCREL if
ix86_force_load_from_GOT_p returns true.
(ix86_print_operand_address): Support UNSPEC_GOTPCREL if
ix86_force_load_from_GOT_p returns true.
(ix86_expand_move): Load the external function address via the
GOT slot if ix86_force_load_from_GOT_p returns true.
* config/i386/predicates.md (x86_64_immediate_operand): Return
false if ix86_force_load_from_GOT_p returns true.

gcc/testsuite/

PR target/67400
* gcc.target/i386/pr67400-1.c: New test.
* gcc.target/i386/pr67400-2.c: Likewise.
* gcc.target/i386/pr67400-3.c: Likewise.
* gcc.target/i386/pr67400-4.c: Likewise.
* gcc.target/i386/pr67400-5.c: Likewise.
* gcc.target/i386/pr67400-6.c: Likewise.
---
 gcc/config/i386/i386-protos.h |  1 +
 gcc/config/i386/i386.c| 51 +++
 gcc/config/i386/predicates.md |  4 +++
 gcc/testsuite/gcc.target/i386/pr67400-1.c | 13 
 gcc/testsuite/gcc.target/i386/pr67400-2.c | 14 +
 gcc/testsuite/gcc.target/i386/pr67400-3.c | 16 ++
 gcc/testsuite/gcc.target/i386/pr67400-4.c | 13 
 gcc/testsuite/gcc.target/i386/pr67400-5.c | 11 +++
 gcc/testsuite/gcc.target/i386/pr67400-6.c | 13 
 9 files changed, 136 insertions(+)
 create mode 100644 gcc/testsuite/gcc.target/i386/pr67400-1.c
 create mode 100644 gcc/testsuite/gcc.target/i386/pr67400-2.c
 create mode 100644 gcc/testsuite/gcc.target/i386/pr67400-3.c
 create mode 100644 gcc/testsuite/gcc.target/i386/pr67400-4.c
 create mode 100644 gcc/testsuite/gcc.target/i386/pr67400-5.c
 create mode 100644 gcc/testsuite/gcc.target/i386/pr67400-6.c

diff --git a/gcc/config/i386/i386-protos.h b/gcc/config/i386/i386-protos.h
index 9fd14f6..8130161 100644
--- a/gcc/config/i386/i386-protos.h
+++ b/gcc/config/i386/i386-protos.h
@@ -70,6 +70,7 @@ extern bool ix86_expand_set_or_movmem (rtx, rtx, rtx, rtx, rtx, rtx,
 extern bool constant_address_p (rtx);
 extern bool legitimate_pic_operand_p (rtx);
 extern bool legitimate_pic_address_disp_p (rtx);
+extern bool ix86_force_load_from_GOT_p (rtx);
 extern void print_reg (rtx, int, FILE*);
 extern void ix86_print_operand (FILE *, rtx, int);
 
diff --git a/gcc/config/i386/i386.c b/gcc/config/i386/i386.c
index 56a5b9c..c8c5081 100644
--- a/gcc/config/i386/i386.c
+++ b/gcc/config/i386/i386.c
@@ -15182,6 +15182,24 @@ ix86_legitimate_constant_p (machine_mode mode, rtx x)
   return true;
 }
 
+/* True if operand X should be loaded from GOT.  */
+
+bool
+ix86_force_load_from_GOT_p (rtx x)
+{
+  /* External function symbol should be loaded via the GOT slot for
+ -fno-plt.  */
+  return (!flag_plt
+ && !flag_pic
+ && ix86_cmodel != CM_LARGE
+ && TARGET_64BIT
+ && !TARGET_PECOFF
+ && !TARGET_MACHO
+ && GET_CODE (x) == SYMBOL_REF
+ && SYMBOL_REF_FUNCTION_P (x)
+ && !SYMBOL_REF_LOCAL_P (x));
+}
+
 /* Determine if it's legal to put X into the constant pool.  This
is not possible for the address of thread-local symbols, which
is checked above.  */
@@ -15560,6 +15578,10 @@ ix86_legitimate_address_p (machine_mode, rtx addr, bool strict)
return false;
 
  case UNSPEC_GOTPCREL:
+   gcc_assert (flag_pic
+   || ix86_force_load_from_GOT_p (XVECEXP (XEXP (disp, 0), 0, 0)));
+   goto is_legitimate_pic;
+
  case UNSPEC_PCREL:
gcc_assert (flag_pic);
goto is_legitimate_pic;
@@ -18130,6 +18152,12 @@ ix86_print_operand_address_as (FILE *file, rtx addr,
}
   else if (flag_pic)
output_pic_addr_const (file, disp, 0);
+  else if (GET_CODE (disp) == CONST
+  && GET_CODE (XEXP (disp, 0)) == UNSPEC
+  && (XINT (XEXP (disp, 0), 1) == UNSPEC_GOTPCREL
+  || XINT (XEXP (disp, 0), 1) == UNSPEC_GOT)
+  && ix86_force_load_from_GOT_p (XVECEXP (XEXP (disp, 0), 0, 0)))
+   output_pic_addr_const (file, XEXP (disp, 0), code);
   else
output_addr_const (file, disp);
 }
@@ -19448,6 +19476,29 @@ ix86_expand_move (machine_mode mode, rtx operands[])
  op1 = convert_to_mode (mode, op1, 1);
}
}
+   }
+  else if (ix86_force_load_from_GOT_p (op1))
+{
+  /* Load the external function address via the GOT slot to
+avoid PLT.  */
+  op1 = 

Re: [PATCH 0/6] remove some usage of rtx_{insn,expr}_list

2016-06-20 Thread Bernd Schmidt

On 06/20/2016 12:22 PM, tbsaunde+...@tbsaunde.org wrote:

In theory I would expect if anything this helps performance since it isn't
necessary to malloc every time a node is added, however the data is less clear.


Well, we have alloc pools for these lists, so a malloc is not needed for 
every node.



fold const O2 new
real    0m5.034s
user    0m3.408s
sys     0m0.364s

fold const O2 old
real    0m4.012s
user    0m3.420s
sys     0m0.340s


So that's a second more in real time - was the machine very busy at the 
time you ran these tests so that these aren't meaningful, or is there a 
need to investigate this?



So a couple got about .3s slower, and others got about .1 faster, I'm not
really sure but inclined to say any change is too small to easily measure.

bootstrapped + regtested patches individually on x86_64-linux-gnu, ok?


Modulo the question about compile times, I think patches 1-4 are ok.  In 5 
and 6 I see explicit for loops instead of FOR_EACH macros; I'm curious 
as to the reason.



Bernd



Re: [PATCH] PR target/71549: Convert V1TImode register to TImode in debug insn

2016-06-20 Thread H.J. Lu
On Mon, Jun 20, 2016 at 7:30 AM, Ilya Enkovich  wrote:
> 2016-06-20 16:39 GMT+03:00 Uros Bizjak :
>> On Mon, Jun 20, 2016 at 1:55 PM, H.J. Lu  wrote:
>>> TImode register referenced in debug insn can be converted to V1TImode
>>> by scalar to vector optimization.  We need to convert a debug insn if
>>> it has a variable in a TImode register.
>>>
>>> Tested on x86-64.  OK for trunk?
>>
>> Ilya, does this approach look good to you? Also, does DImode STV
>> conversion need similar handling of debug insns?
>
> DImode conversion doesn't change register mode (i.e. never calls
> PUT_MODE for registers).  That keeps debug instructions valid.
>
> Overall I don't like the idea of having debug insns in candidates
> set and in chains.  Looks like it is possible to have a chain
> consisting of a debug insn only which is weird (otherwise I don't
> see why we may return false in timode_scalar_chain::convert_insn).

Yes, it can happen:

(insn 11 8 12 2 (parallel [
(set (reg/v:TI 91 [  ])
(plus:TI (reg/v:TI 92 [ a ])
(reg/v:TI 96 [ b ])))
(clobber (reg:CC 17 flags))
]) y.i:5 210 {*addti3_doubleword}
 (expr_list:REG_UNUSED (reg:CC 17 flags)
(nil)))
(debug_insn 12 11 13 2 (var_location:TI w (reg/v:TI 91 [  ])) y.i:5 -1
 (nil))


> What about other possible register uses?  If debug insns are added
> to candidates then NONDEBUG_INSN_P check for uses in
> timode_check_non_convertible_regs becomes invalid, right?

Debug insn has no impact on STV decision.  We just need to convert
register referenced in debug insn from V1TImode to TImode in
timode_scalar_chain::convert_insn.

> If we have (or want) to fix some register uses then it would probably
> be better to visit register uses when we convert its mode
> and make required fix-ups.  It seems better to me to not involve
> debug insns in analysis phase.

Here is the updated patch.  It adds to the queue any debug insn that
references a TImode register which will be converted to V1TImode.
I am testing it now.

> Also I don't think debug insns should be accounted as optimized
> instructions because we would get different number of optimized
> instructions depending on debug info availability which may be
> inconvenient for dump scans (and it is not a real instruction
> optimization).
>
> Thanks,
> Ilya
>
>>
>> Uros.
>>
>>>
>>> H.J.
>>> 
>>> gcc/
>>>
>>> PR target/71549
>>> * config/i386/i386.c (timode_scalar_to_vector_candidate_p):
>>> Return true if debug insn has a variable in TImode register.
>>> (timode_remove_non_convertible_regs): Skip debug insn.
>>> (scalar_chain::convert_insn): Change return type to bool.
>>> (scalar_chain::add_insn): Don't check registers in debug insn.
>>> (dimode_scalar_chain::convert_insn): Change return type to bool
>>> and always return true.
>>> (timode_scalar_chain::convert_insn): Change return type to bool.
>>> Convert V1TImode register to SUBREG TImode in debug insn.  Return
>>> false if debug insn isn't converted.  Otherwise, return true.
>>> (scalar_chain::convert): Increment converted_insns only if
>>> convert_insn returns true.
>>>



-- 
H.J.
From 4229b304b98dff0190922ddd5270a1389321313b Mon Sep 17 00:00:00 2001
From: "H.J. Lu" 
Date: Sun, 19 Jun 2016 12:47:45 -0700
Subject: [PATCH] Convert V1TImode register to TImode in debug insn

TImode register referenced in debug insn can be converted to V1TImode
by scalar to vector optimization.  We need to convert a debug insn if
it references TImode register which will be converted to V1TImode.

gcc/

	PR target/71549
	* config/i386/i386.c (scalar_chain::analyze_register_chain): In
	64-bit mode, add debug insn, which references the register, to
	queue.
	(timode_scalar_chain::convert_insn): Convert V1TImode register
	to SUBREG TImode in debug insn.

gcc/testsuite/

	PR target/71549
	* gcc.target/i386/pr71549.c: New test.
---
 gcc/config/i386/i386.c  | 32 +++-
 gcc/testsuite/gcc.target/i386/pr71549.c | 24 
 2 files changed, 55 insertions(+), 1 deletion(-)
 create mode 100644 gcc/testsuite/gcc.target/i386/pr71549.c

diff --git a/gcc/config/i386/i386.c b/gcc/config/i386/i386.c
index 56a5b9c..cea632c 100644
--- a/gcc/config/i386/i386.c
+++ b/gcc/config/i386/i386.c
@@ -3229,8 +3229,22 @@ scalar_chain::analyze_register_chain (bitmap candidates, df_ref ref)
   for (chain = DF_REF_CHAIN (ref); chain; chain = chain->next)
 {
   unsigned uid = DF_REF_INSN_UID (chain->ref);
+  rtx_insn *insn = DF_REF_INSN (chain->ref);
 
-  if (!NONDEBUG_INSN_P (DF_REF_INSN (chain->ref)))
+  if (TARGET_64BIT && DEBUG_INSN_P (insn))
+	{
+	  /* In 64-bit mode, if a variable is put in a TImode register,
+	 which may be converted to V1TImode, we need to convert
+	 this 

Re: [Patch, Fortran] PR71194 - Fix ICE with pointer assignment

2016-06-20 Thread Paul Richard Thomas
Dear Tobias,

Impeccably done, as always :-) OK for trunk.

Thanks for the patch.

Cheers

Paul

On 20 June 2016 at 14:10, Tobias Burnus
 wrote:
> Dear all,
>
> the issue occurs only if the RHS of a pointer assignment is a function and
> the ICE is only triggered when a rank remapping is needed.
>
> gfc_conv_expr_descriptor calls gfc_conv_procedure_call for expr2, which
> sets "se.expr" to NULL_TREE - and the code later tries to access it.
>
> The code correctly sets rse.expr to "tmp", but that does not help as all
> actions were wrongly done on lse before. Solution: Stuff the RHS expr2 into
> rse not into lse.
>
> Built and regtested* on x86-64-gnu-linux.
> OK for the trunk?
>
> Tobias
>
> (* gfortran.dg/graphite/pr68279.f90 fails but is a known PR,
> gfortran.dg/vect/vect-8.f90 fails but not only for me, and
> gfortran.dg/guality/pr41558.f90 never worked on that system)



-- 
The difference between genius and stupidity is; genius has its limits.

Albert Einstein


Re: [PING] [PATCH] c/69507 - bogus warning: ISO C does not allow ‘__alignof__ (expression)’

2016-06-20 Thread Martin Sebor

Since this patch was committed, I am now seeing failures on:
gcc.dg/gnu99-const-expr-1.c
gcc.dg/gnu99-static-1.c

(targets arm, aarch64, I don't think that it should matter?)

Can you have  a look?


Sorry about that.  I missed the test updates in my initial patch.
I've committed them in r237606.

Martin



Re: RFC: pass to warn on questionable uses of alloca().

2016-06-20 Thread Joseph Myers
On Sat, 18 Jun 2016, Martin Sebor wrote:

> the function regardless of the value of its argument).  At
> the same time, it seems that an even more reliable solution
> than pointing out potentially unsafe calls to the function
> and relying on users to modify their code to use malloc for
> large/unbounded allocations would be to let GCC do it for
> them automatically (i.e., in response to some other option,
> emit a call to malloc instead, and insert a call to free when
> appropriate).

Note that such an option would not be usable for the original motivating 
case of glibc, because in code that's meant to be async-signal-safe, 
alloca and VLAs can be used, but malloc cannot.
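
As a rough illustration of the constraint, the sketch below formats a number into a stack buffer and emits it with write, which POSIX lists as async-signal-safe. The helper names `format_decimal` and `report_signal` are invented for this example and are not taken from glibc. Turning the VLA into a malloc call would make the function unsafe to call from a signal handler.

```c
#include <stddef.h>
#include <unistd.h>

/* Write the decimal form of VALUE into BUF, which holds LEN bytes.
   Return the number of characters produced, or 0 if BUF is too
   small.  No allocation and no stdio are involved.  */
static size_t
format_decimal (char *buf, size_t len, unsigned long value)
{
  char tmp[3 * sizeof value];   /* Enough digits for any unsigned long.  */
  size_t n = 0;

  do
    {
      tmp[n++] = (char) ('0' + value % 10);
      value /= 10;
    }
  while (value != 0);

  if (n > len)
    return 0;
  for (size_t i = 0; i < n; i++)
    buf[i] = tmp[n - 1 - i];
  return n;
}

/* Hypothetical handler body: the scratch space is a VLA on the stack
   and the output goes through write.  */
static void
report_signal (int signo, size_t scratch_size)
{
  char scratch[scratch_size + 1];   /* VLA; +1 for the newline.  */
  size_t n = format_decimal (scratch, scratch_size, (unsigned long) signo);

  scratch[n++] = '\n';
  ssize_t written = write (STDERR_FILENO, scratch, n);
  (void) written;
}
```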

-- 
Joseph S. Myers
jos...@codesourcery.com


Re: [PATCH] c/71552 - Confusing error for incorrect struct initialization

2016-06-20 Thread Joseph Myers
On Sat, 18 Jun 2016, Martin Sebor wrote:

> The attached patch slightly changes the order in which initializers
> are checked for type compatibility to issue the same error for static
> initializers of incompatible types as for automatic objects, rather
> than rejecting the former for their lack of constness first.

OK, presuming the patch has passed the usual testing.

-- 
Joseph S. Myers
jos...@codesourcery.com


Re: RFC: 2->2 combine patch

2016-06-20 Thread Segher Boessenkool
On Mon, Jun 20, 2016 at 02:39:06PM +0100, Kyrill Tkachov wrote:
> >So I tried out the patch below.  It decreases code size on most targets
> >(mostly fixed length insn targets), and increases it a small bit on some
> >variable length insn targets (doing an op twice, instead of doing it once
> >and doing a move).  It looks to be all good there too, but there are so
> >many changes that it is almost impossible to really check.
> >
> >So: can people try this out with their favourite benchmarks, please?
> 
> I hope to give this a run on AArch64 but I'll probably manage to get to it
> only next week.
> In the meantime I've had a quick look at some SPEC2006 codegen on aarch64.
> Some benchmarks decrease in size, others increase. One recurring theme I
> spotted is shifts being repeatedly combined with arithmetic operations
> rather than being computed once and reusing the result. For example:
>     lsl     x30, x15, 3
>     add     x4, x5, x30
>     add     x9, x7, x30
>     add     x24, x8, x30
>     add     x10, x0, x30
>     add     x2, x22, x30
> 
> becoming (modulo regalloc fluctuations):
>     add     x14, x2, x15, lsl 3
>     add     x13, x22, x15, lsl 3
>     add     x21, x4, x15, lsl 3
>     add     x6, x0, x15, lsl 3
>     add     x3, x30, x15, lsl 3
> 
> which, while saving one instruction can be harmful overall because the
> extra shift operation in the arithmetic instructions can increase the
> latency of the instruction.  I believe the aarch64 rtx costs should convey
> this information.  Do you expect RTX costs to gate this behaviour?

Yes, RTX costs are used for *all* of combine's combinations.  So it seems
your add,lsl patterns are the same cost as plain add?  If it were more
expensive, combine would reject this combination.
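
The cost gate can be pictured with the toy model below. It is a sketch only: `toy_combination_is_profitable` and `toy_two_to_two_accepts` are invented names, the cost values passed in are hypothetical, and the real check in combine works on insn costs obtained from the target. The rule modelled here is that a combination is refused when both costs are known and the new sequence is more expensive than the old one.

```c
#include <stdbool.h>

/* Same scaling as GCC's COSTS_N_INSNS macro.  */
#define COSTS_N_INSNS(N) ((N) * 4)

/* Accept the replacement unless it is known to be more expensive than
   what it replaces.  A cost of zero means "unknown" in this sketch.  */
static bool
toy_combination_is_profitable (int old_cost, int new_cost)
{
  if (old_cost <= 0 || new_cost <= 0)
    return true;
  return new_cost <= old_cost;
}

/* 2->2 case from the example above: the shift insn stays because its
   result has other users, and the plain add becomes an add with a
   shifted operand.  All three costs are supplied by the caller.  */
static bool
toy_two_to_two_accepts (int cost_shift, int cost_add, int cost_add_shift)
{
  int old_cost = cost_shift + cost_add;
  int new_cost = cost_shift + cost_add_shift;
  return toy_combination_is_profitable (old_cost, new_cost);
}
```

In this model, if the add with a shifted operand is costed like a plain add, for instance COSTS_N_INSNS (1) for both, the combination is accepted.  If the target were to cost it at COSTS_N_INSNS (2), it would be refused.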

> Some occurrences that hurt code size look like:
>     cmp     x8, x11, asr 5
> 
> being transformed into:
>     asr     x12, x11, 5
>     cmp     x12, x8, uxtw //zero-extend x8
> with the user of the condition code inverted to match the change in order
> of operands to the comparisons.
> I haven't looked at the RTL dumps yet to figure out why this is happening,
> it could be a backend RTL representation issue.

That could be a target thing yes, hard to tell; it's not clear to me
what combination is made here (if any).


Segher


Re: [PATCH] PR target/71549: Convert V1TImode register to TImode in debug insn

2016-06-20 Thread Ilya Enkovich
2016-06-20 16:39 GMT+03:00 Uros Bizjak :
> On Mon, Jun 20, 2016 at 1:55 PM, H.J. Lu  wrote:
>> TImode register referenced in debug insn can be converted to V1TImode
>> by scalar to vector optimization.  We need to convert a debug insn if
>> it has a variable in a TImode register.
>>
>> Tested on x86-64.  OK for trunk?
>
> Ilya, does this approach look good to you? Also, does DImode STV
> conversion need similar handling of debug insns?

DImode conversion doesn't change register mode (i.e. never calls
PUT_MODE for registers).  That keeps debug instructions valid.

Overall I don't like the idea of having debug insns in candidates
set and in chains.  Looks like it is possible to have a chain
consisting of a debug insn only which is weird (otherwise I don't
see why we may return false in timode_scalar_chain::convert_insn).

What about other possible register uses?  If debug insns are added
to candidates then NONDEBUG_INSN_P check for uses in
timode_check_non_convertible_regs becomes invalid, right?

If we have (or want) to fix some register uses then it would probably
be better to visit register uses when we convert its mode
and make required fix-ups.  It seems better to me to not involve
debug insns in analysis phase.

Also I don't think debug insns should be accounted as optimized
instructions because we would get different number of optimized
instructions depending on debug info availability which may be
inconvenient for dump scans (and it is not a real instruction
optimization).

Thanks,
Ilya

>
> Uros.
>
>>
>> H.J.
>> 
>> gcc/
>>
>> PR target/71549
>> * config/i386/i386.c (timode_scalar_to_vector_candidate_p):
>> Return true if debug insn has a variable in TImode register.
>> (timode_remove_non_convertible_regs): Skip debug insn.
>> (scalar_chain::convert_insn): Change return type to bool.
>> (scalar_chain::add_insn): Don't check registers in debug insn.
>> (dimode_scalar_chain::convert_insn): Change return type to bool
>> and always return true.
>> (timode_scalar_chain::convert_insn): Change return type to bool.
>> Convert V1TImode register to SUBREG TImode in debug insn.  Return
>> false if debug insn isn't converted.  Otherwise, return true.
>> (scalar_chain::convert): Increment converted_insns only if
>> convert_insn returns true.
>>
>> gcc/testsuite/
>>
>> PR target/71549
>> * gcc.target/i386/pr71549.c: New test.
>> ---
>>  gcc/config/i386/i386.c  | 58 
>> -
>>  gcc/testsuite/gcc.target/i386/pr71549.c | 24 ++
>>  2 files changed, 74 insertions(+), 8 deletions(-)
>>  create mode 100644 gcc/testsuite/gcc.target/i386/pr71549.c
>>
>> diff --git a/gcc/config/i386/i386.c b/gcc/config/i386/i386.c
>> index 56a5b9c..e17fc53 100644
>> --- a/gcc/config/i386/i386.c
>> +++ b/gcc/config/i386/i386.c
>> @@ -2845,6 +2845,16 @@ dimode_scalar_to_vector_candidate_p (rtx_insn *insn)
>>  static bool
>>  timode_scalar_to_vector_candidate_p (rtx_insn *insn)
>>  {
>> +  if (DEBUG_INSN_P (insn))
>> +{
>> +  /* If a variable is put in a TImode register, which may be
>> +converted to V1TImode, we need to convert this debug insn.  */
>> +  rtx val = PATTERN (insn);
>> +  return (GET_MODE (val) == TImode
>> + && GET_CODE (val) == VAR_LOCATION
>> + && REG_P (PAT_VAR_LOCATION_LOC (val)));
>> +}
>> +
>>rtx def_set = single_set (insn);
>>
>>if (!def_set)
>> @@ -3012,7 +3022,12 @@ timode_remove_non_convertible_regs (bitmap candidates)
>>
>>EXECUTE_IF_SET_IN_BITMAP (candidates, 0, id, bi)
>>  {
>> -  rtx def_set = single_set (DF_INSN_UID_GET (id)->insn);
>> +  rtx_insn *insn = DF_INSN_UID_GET (id)->insn;
>> +  /* Debug insn isn't a SET insn.  */
>> +  if (DEBUG_INSN_P (insn))
>> +   continue;
>> +
>> +  rtx def_set = single_set (insn);
>>rtx dest = SET_DEST (def_set);
>>rtx src = SET_SRC (def_set);
>>
>> @@ -3111,7 +3126,7 @@ class scalar_chain
>>void add_insn (bitmap candidates, unsigned insn_uid);
>>void analyze_register_chain (bitmap candidates, df_ref ref);
>>virtual void mark_dual_mode_def (df_ref def) = 0;
>> -  virtual void convert_insn (rtx_insn *insn) = 0;
>> +  virtual bool convert_insn (rtx_insn *insn) = 0;
>>virtual void convert_registers () = 0;
>>  };
>>
>> @@ -3123,7 +3138,7 @@ class dimode_scalar_chain : public scalar_chain
>>void mark_dual_mode_def (df_ref def);
>>rtx replace_with_subreg (rtx x, rtx reg, rtx subreg);
>>void replace_with_subreg_in_insn (rtx_insn *insn, rtx reg, rtx subreg);
>> -  void convert_insn (rtx_insn *insn);
>> +  bool convert_insn (rtx_insn *insn);
>>void convert_op (rtx *op, rtx_insn *insn);
>>void convert_reg (unsigned regno);
>>void make_vector_copies (unsigned regno);
>> @@ -3139,7 +3154,7 @@ class timode_scalar_chain : 

Re: [PATCH]Fix scan-tree-dump-times syntax errors in gcc.dg/tree-ssa/attr-hotcold-2.c

2016-06-20 Thread Jakub Jelinek
On Mon, Jun 20, 2016 at 03:07:20PM +0100, Renlin Li wrote:
> Okay to commit?
> 
> Regards,
> Renlin
> 
> gcc/testsuite/ChangeLog:
> 
> 2016-06-20  Renlin Li  
> 
>   * gcc.dg/tree-ssa/attr-hotcold-2.c: Fix syntax errors.

This is obvious.  Please check it in.  Though, it also shows that
it couldn't have been properly tested.

> diff --git a/gcc/testsuite/gcc.dg/tree-ssa/attr-hotcold-2.c b/gcc/testsuite/gcc.dg/tree-ssa/attr-hotcold-2.c
> index 6623d9e..e2e8143 100644
> --- a/gcc/testsuite/gcc.dg/tree-ssa/attr-hotcold-2.c
> +++ b/gcc/testsuite/gcc.dg/tree-ssa/attr-hotcold-2.c
> @@ -18,8 +18,8 @@ void f(int x, int y)
>return;
>  }
>  
> -/* { dg-final { scan-tree-dump-times 1 "hot label heuristics" 1 "profile_estimate" } } */
> -/* { dg-final { scan-tree-dump-times 1 "cold label heuristics" 1 "profile_estimate" } } */
> +/* { dg-final { scan-tree-dump-times "hot label heuristics" 1 "profile_estimate" } } */
> +/* { dg-final { scan-tree-dump-times "cold label heuristics" 1 "profile_estimate" } } */
>  /* { dg-final { scan-tree-dump-times "block 4, loop depth 0, count 0, freq \[1-4\]\[^0-9\]" 1 "profile_estimate" } } */
>  
>  /* Note: we're attempting to match some number > 6000, i.e. > 60%.


Jakub


[PATCH]Fix scan-tree-dump-times syntax errors in gcc.dg/tree-ssa/attr-hotcold-2.c

2016-06-20 Thread Renlin Li

Hi,

This is a simple patch to fix the syntax errors in dg-final directive 
lines within this test case.


According to the documentation, the syntax of this directive should be:
'''scan-tree-dump-times regex num suffix [{ target/xfail selector }]'''
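
For illustration, a small test written in that form could look like the sketch below. The function body and the dg-options line are assumptions modelled on the existing test rather than a copy of it. The point is only the argument order of the dg-final lines: regex first, then the count, then the dump suffix.

```c
/* { dg-do compile } */
/* { dg-options "-O2 -fdump-tree-profile_estimate" } */

int v1, v2;

void
f (int x, int y)
{
  if (x)
    goto A;
  if (y)
    goto B;
  return;

  /* The semicolon after a label attribute is required.  */
 A: __attribute__((cold));
  v1 = x;
  return;

 B: __attribute__((hot));
  v2 = y;
  return;
}

/* Argument order: regex, count, dump-file suffix.  */
/* { dg-final { scan-tree-dump-times "cold label heuristics" 1 "profile_estimate" } } */
/* { dg-final { scan-tree-dump-times "hot label heuristics" 1 "profile_estimate" } } */
```

With the extra leading "1", the regex and count arguments are shifted by one position, so "profile_estimate" ends up where the optional target selector is expected.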


Now the test case compiles okay in an arm environment. However, the last 
two checks still seem to fail. This is a separate issue.


Okay to commit?

Regards,
Renlin

gcc/testsuite/ChangeLog:

2016-06-20  Renlin Li  

* gcc.dg/tree-ssa/attr-hotcold-2.c: Fix syntax errors.

On 13/06/16 17:35, Kyrill Tkachov wrote:

Hi Honza,

On 07/06/16 20:27, Jan Hubicka wrote:

Hello,
Martin Liska measured branch predictor hitrates on current tree and




In the testsuite I'm seeing:
ERROR: gcc.dg/tree-ssa/attr-hotcold-2.c: error executing dg-final:
syntax error in target selector "profile_estimate"

on aarch64-none-elf.
I think the hunk:
-/* { dg-final { scan-ipa-dump-times "block 4, loop depth 0, count 0,
freq 1\[^0-9\]" 1 "profile_estimate" } } */
+/* { dg-final { scan-tree-dump-times 1 "hot label heuristics" 1
"profile_estimate" } } */
+/* { dg-final { scan-tree-dump-times 1 "cold label heuristics" 1
"profile_estimate" } } */
+/* { dg-final { scan-tree-dump-times "block 4, loop depth 0, count 0,
freq \[1-4\]\[^0-9\]" 1 "profile_estimate" } } */

is buggy, should it be
-/* { dg-final { scan-ipa-dump-times "block 4, loop depth 0, count 0,
freq 1\[^0-9\]" 1 "profile_estimate" } } */
+/* { dg-final { scan-tree-dump-times "hot label heuristics" 1
"profile_estimate" } } */
+/* { dg-final { scan-tree-dump-times "cold label heuristics" 1
"profile_estimate" } } */
+/* { dg-final { scan-tree-dump-times "block 4, loop depth 0, count 0,
freq \[1-4\]\[^0-9\]" 1 "profile_estimate" } } */
?

With that change the test runs but still FAILs:
FAIL: gcc.dg/tree-ssa/attr-hotcold-2.c scan-tree-dump-times
profile_estimate "block 4, loop depth 0, count 0, freq [1-4][^0-9]" 1
FAIL: gcc.dg/tree-ssa/attr-hotcold-2.c scan-tree-dump-times
profile_estimate "block 5, loop depth 0, count 0, freq
[6-9][0-9][0-9][0-9]" 1

Thanks,
Kyrill
diff --git a/gcc/testsuite/gcc.dg/tree-ssa/attr-hotcold-2.c b/gcc/testsuite/gcc.dg/tree-ssa/attr-hotcold-2.c
index 6623d9e..e2e8143 100644
--- a/gcc/testsuite/gcc.dg/tree-ssa/attr-hotcold-2.c
+++ b/gcc/testsuite/gcc.dg/tree-ssa/attr-hotcold-2.c
@@ -18,8 +18,8 @@ void f(int x, int y)
   return;
 }
 
-/* { dg-final { scan-tree-dump-times 1 "hot label heuristics" 1 "profile_estimate" } } */
-/* { dg-final { scan-tree-dump-times 1 "cold label heuristics" 1 "profile_estimate" } } */
+/* { dg-final { scan-tree-dump-times "hot label heuristics" 1 "profile_estimate" } } */
+/* { dg-final { scan-tree-dump-times "cold label heuristics" 1 "profile_estimate" } } */
 /* { dg-final { scan-tree-dump-times "block 4, loop depth 0, count 0, freq \[1-4\]\[^0-9\]" 1 "profile_estimate" } } */
 
 /* Note: we're attempting to match some number > 6000, i.e. > 60%.


Re: [AArch64] Give some new costs for Cortex-A53 floating-point operations

2016-06-20 Thread Richard Earnshaw (lists)
On 20/06/16 14:57, James Greenhalgh wrote:
> 
> Hi,
> 
> As recently done for Cortex-A57 [1], this patch rebases the floating-point
> cost table for Cortex-A53 to be relative to the cost of a floating-point move.
> I wrote a little more on the justification for doing this in the other patch,
> but in summary this is what other targets and sub-targets do, so we should
> fall in line with that.
> 
> Unlike the Cortex-A57 changes, this had no performance impact across
> Spec2000 and Spec2006. I'm posting it to keep the strategy for costs
> aligned between the two cores.
> 
> Bootstrapped on aarch64-none-linux-gnu and arm-none-linux-gnueabihf with
> no issues.
> 
> OK?
> 
> Thanks,
> James
> 
> [1]: https://gcc.gnu.org/ml/gcc-patches/2016-06/msg00251.html
> 
> ---
> 2016-06-20  James Greenhalgh  
> 
>   * config/arm/aarch-cost-tables.h (cortexa53_extra_costs): Make FP
>   costs relative to the cost of a register move.
> 

OK.

R.

> 
> 0001-AArch64-Give-some-new-costs-for-Cortex-A53-floating-.patch
> 
> 
> diff --git a/gcc/config/arm/aarch-cost-tables.h b/gcc/config/arm/aarch-cost-tables.h
> index 5f42253..8bcfcb4 100644
> --- a/gcc/config/arm/aarch-cost-tables.h
> +++ b/gcc/config/arm/aarch-cost-tables.h
> @@ -191,35 +191,35 @@ const struct cpu_cost_table cortexa53_extra_costs =
>{
>  /* FP SFmode */
>  {
> -  COSTS_N_INSNS (15),/* div.  */
> -  COSTS_N_INSNS (3), /* mult.  */
> -  COSTS_N_INSNS (7), /* mult_addsub. */
> -  COSTS_N_INSNS (7), /* fma.  */
> -  COSTS_N_INSNS (3), /* addsub.  */
> -  COSTS_N_INSNS (1), /* fpconst. */
> -  COSTS_N_INSNS (2), /* neg.  */
> -  COSTS_N_INSNS (1), /* compare.  */
> -  COSTS_N_INSNS (3), /* widen.  */
> -  COSTS_N_INSNS (3), /* narrow.  */
> -  COSTS_N_INSNS (3), /* toint.  */
> -  COSTS_N_INSNS (3), /* fromint.  */
> -  COSTS_N_INSNS (3)  /* roundint.  */
> +  COSTS_N_INSNS (5), /* div.  */
> +  COSTS_N_INSNS (1), /* mult.  */
> +  COSTS_N_INSNS (2), /* mult_addsub.  */
> +  COSTS_N_INSNS (2), /* fma.  */
> +  COSTS_N_INSNS (1), /* addsub.  */
> +  0, /* fpconst.  */
> +  COSTS_N_INSNS (1), /* neg.  */
> +  0, /* compare.  */
> +  COSTS_N_INSNS (1), /* widen.  */
> +  COSTS_N_INSNS (1), /* narrow.  */
> +  COSTS_N_INSNS (1), /* toint.  */
> +  COSTS_N_INSNS (1), /* fromint.  */
> +  COSTS_N_INSNS (1)  /* roundint.  */
>  },
>  /* FP DFmode */
>  {
> -  COSTS_N_INSNS (30),/* div.  */
> -  COSTS_N_INSNS (3), /* mult.  */
> -  COSTS_N_INSNS (7), /* mult_addsub.  */
> -  COSTS_N_INSNS (7), /* fma.  */
> -  COSTS_N_INSNS (3), /* addsub.  */
> -  COSTS_N_INSNS (1), /* fpconst.  */
> -  COSTS_N_INSNS (2), /* neg.  */
> -  COSTS_N_INSNS (1), /* compare.  */
> -  COSTS_N_INSNS (3), /* widen.  */
> -  COSTS_N_INSNS (3), /* narrow.  */
> -  COSTS_N_INSNS (3), /* toint.  */
> -  COSTS_N_INSNS (3), /* fromint.  */
> -  COSTS_N_INSNS (3)  /* roundint.  */
> +  COSTS_N_INSNS (10),/* div.  */
> +  COSTS_N_INSNS (1), /* mult.  */
> +  COSTS_N_INSNS (2), /* mult_addsub.  */
> +  COSTS_N_INSNS (2), /* fma.  */
> +  COSTS_N_INSNS (1), /* addsub.  */
> +  0, /* fpconst.  */
> +  COSTS_N_INSNS (1), /* neg.  */
> +  0, /* compare.  */
> +  COSTS_N_INSNS (1), /* widen.  */
> +  COSTS_N_INSNS (1), /* narrow.  */
> +  COSTS_N_INSNS (1), /* toint.  */
> +  COSTS_N_INSNS (1), /* fromint.  */
> +  COSTS_N_INSNS (1)  /* roundint.  */
>  }
>},
>/* Vector */
> 



[AArch64] Give some new costs for Cortex-A53 floating-point operations

2016-06-20 Thread James Greenhalgh

Hi,

As recently done for Cortex-A57 [1], this patch rebases the floating-point
cost table for Cortex-A53 to be relative to the cost of a floating-point move.
I wrote a little more on the justification for doing this in the other patch,
but in summary this is what other targets and sub-targets do, so we should
fall in line with that.
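
To make the units concrete, the sketch below reproduces the COSTS_N_INSNS scaling and three of the SFmode entries from the diff, before and after. The struct `toy_fp_costs` is a cut-down, hypothetical stand-in for the real table row, and `toy_div_to_mult_ratio` is a helper invented for this example.

```c
/* Same definition as GCC's macro: the cost of N fast
   register-to-register instructions.  */
#define COSTS_N_INSNS(N) ((N) * 4)

/* Hypothetical stand-in for one FP row of the cost table; only three
   of the thirteen fields touched by the patch are kept.  */
struct toy_fp_costs
{
  int div;
  int mult;
  int fpconst;
};

/* SFmode values before the patch, copied from the diff.  */
static const struct toy_fp_costs a53_sf_old =
{
  COSTS_N_INSNS (15),   /* div.  */
  COSTS_N_INSNS (3),    /* mult.  */
  COSTS_N_INSNS (1)     /* fpconst.  */
};

/* SFmode values after the patch, copied from the diff.  */
static const struct toy_fp_costs a53_sf_new =
{
  COSTS_N_INSNS (5),    /* div.  */
  COSTS_N_INSNS (1),    /* mult.  */
  0                     /* fpconst.  */
};

/* How many multiplies one divide is worth under a given table.  */
static int
toy_div_to_mult_ratio (const struct toy_fp_costs *costs)
{
  return costs->div / costs->mult;
}
```

For these entries the divide-to-multiply ratio is the same in both tables; only the baseline that the numbers are measured against has moved.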

Unlike the Cortex-A57 changes, this had no performance impact across
Spec2000 and Spec2006. I'm posting it to keep the strategy for costs
aligned between the two cores.

Bootstrapped on aarch64-none-linux-gnu and arm-none-linux-gnueabihf with
no issues.

OK?

Thanks,
James

[1]: https://gcc.gnu.org/ml/gcc-patches/2016-06/msg00251.html

---
2016-06-20  James Greenhalgh  

* config/arm/aarch-cost-tables.h (cortexa53_extra_costs): Make FP
costs relative to the cost of a register move.
diff --git a/gcc/config/arm/aarch-cost-tables.h b/gcc/config/arm/aarch-cost-tables.h
index 5f42253..8bcfcb4 100644
--- a/gcc/config/arm/aarch-cost-tables.h
+++ b/gcc/config/arm/aarch-cost-tables.h
@@ -191,35 +191,35 @@ const struct cpu_cost_table cortexa53_extra_costs =
   {
 /* FP SFmode */
 {
-  COSTS_N_INSNS (15),	/* div.  */
-  COSTS_N_INSNS (3),	/* mult.  */
-  COSTS_N_INSNS (7),	/* mult_addsub. */
-  COSTS_N_INSNS (7),	/* fma.  */
-  COSTS_N_INSNS (3),	/* addsub.  */
-  COSTS_N_INSNS (1),	/* fpconst. */
-  COSTS_N_INSNS (2),	/* neg.  */
-  COSTS_N_INSNS (1),	/* compare.  */
-  COSTS_N_INSNS (3),	/* widen.  */
-  COSTS_N_INSNS (3),	/* narrow.  */
-  COSTS_N_INSNS (3),	/* toint.  */
-  COSTS_N_INSNS (3),	/* fromint.  */
-  COSTS_N_INSNS (3)		/* roundint.  */
+  COSTS_N_INSNS (5),	/* div.  */
+  COSTS_N_INSNS (1),	/* mult.  */
+  COSTS_N_INSNS (2),	/* mult_addsub.  */
+  COSTS_N_INSNS (2),	/* fma.  */
+  COSTS_N_INSNS (1),	/* addsub.  */
+  0,			/* fpconst.  */
+  COSTS_N_INSNS (1),	/* neg.  */
+  0,			/* compare.  */
+  COSTS_N_INSNS (1),	/* widen.  */
+  COSTS_N_INSNS (1),	/* narrow.  */
+  COSTS_N_INSNS (1),	/* toint.  */
+  COSTS_N_INSNS (1),	/* fromint.  */
+  COSTS_N_INSNS (1)		/* roundint.  */
 },
 /* FP DFmode */
 {
-  COSTS_N_INSNS (30),	/* div.  */
-  COSTS_N_INSNS (3),	/* mult.  */
-  COSTS_N_INSNS (7),	/* mult_addsub.  */
-  COSTS_N_INSNS (7),	/* fma.  */
-  COSTS_N_INSNS (3),	/* addsub.  */
-  COSTS_N_INSNS (1),	/* fpconst.  */
-  COSTS_N_INSNS (2),	/* neg.  */
-  COSTS_N_INSNS (1),	/* compare.  */
-  COSTS_N_INSNS (3),	/* widen.  */
-  COSTS_N_INSNS (3),	/* narrow.  */
-  COSTS_N_INSNS (3),	/* toint.  */
-  COSTS_N_INSNS (3),	/* fromint.  */
-  COSTS_N_INSNS (3)		/* roundint.  */
+  COSTS_N_INSNS (10),	/* div.  */
+  COSTS_N_INSNS (1),	/* mult.  */
+  COSTS_N_INSNS (2),	/* mult_addsub.  */
+  COSTS_N_INSNS (2),	/* fma.  */
+  COSTS_N_INSNS (1),	/* addsub.  */
+  0,			/* fpconst.  */
+  COSTS_N_INSNS (1),	/* neg.  */
+  0,			/* compare.  */
+  COSTS_N_INSNS (1),	/* widen.  */
+  COSTS_N_INSNS (1),	/* narrow.  */
+  COSTS_N_INSNS (1),	/* toint.  */
+  COSTS_N_INSNS (1),	/* fromint.  */
+  COSTS_N_INSNS (1)		/* roundint.  */
 }
   },
   /* Vector */


Re: [PING 2, PATCH] Remove xfail from thread_local-order2.C.

2016-06-20 Thread Dominik Vogt
Patch:
https://gcc.gnu.org/ml/gcc-patches/2016-04/msg01587.html

On Wed, Jan 27, 2016 at 10:39:44AM +0100, Dominik Vogt wrote:
> g++.dg/tls/thread_local-order2.C no longer fails with Glibc-2.18 or
> newer since this commit:
> 
>   2014-08-01  Zifei Tong  
> 
>   * libsupc++/atexit_thread.cc (HAVE___CXA_THREAD_ATEXIT_IMPL): Add
>   _GLIBCXX_ prefix to macro.
> 
>   git-svn-id: svn+ssh://gcc.gnu.org/svn/gcc/trunk@213504 138bc75d-0d04-0410-96
> 
> https://gcc.gnu.org/ml/gcc-patches/2014-07/msg02091.html
> 
> So, is it time to remove the xfail from the test case?
> 
> Ciao
> 
> Dominik ^_^  ^_^
> 
> -- 
> 
> Dominik Vogt
> IBM Germany

> gcc/testsuite/ChangeLog
> 
>   * g++.dg/tls/thread_local-order2.C: Remove xfail.

> >From 0b0abbd2e6d9d8b6857622065bdcbdde31b5ddb0 Mon Sep 17 00:00:00 2001
> From: Dominik Vogt 
> Date: Wed, 27 Jan 2016 09:54:07 +0100
> Subject: [PATCH] Remove xfail from thread_local-order2.C.
> 
> This should work with Glibc-2.18 or newer.
> ---
>  gcc/testsuite/g++.dg/tls/thread_local-order2.C | 1 -
>  1 file changed, 1 deletion(-)
> 
> diff --git a/gcc/testsuite/g++.dg/tls/thread_local-order2.C b/gcc/testsuite/g++.dg/tls/thread_local-order2.C
> index f8df917..d3351e6 100644
> --- a/gcc/testsuite/g++.dg/tls/thread_local-order2.C
> +++ b/gcc/testsuite/g++.dg/tls/thread_local-order2.C
> @@ -2,7 +2,6 @@
>  // that isn't reverse order of construction.  We need to move
>  // __cxa_thread_atexit into glibc to get this right.
>  
> -// { dg-do run { xfail *-*-* } }
>  // { dg-require-effective-target c++11 }
>  // { dg-add-options tls }
>  // { dg-require-effective-target tls_runtime }
> -- 
> 2.3.0

Ciao

Dominik ^_^  ^_^

-- 

Dominik Vogt
IBM Germany



Re: [PATCH] PR target/71549: Convert V1TImode register to TImode in debug insn

2016-06-20 Thread Uros Bizjak
On Mon, Jun 20, 2016 at 1:55 PM, H.J. Lu  wrote:
> TImode register referenced in debug insn can be converted to V1TImode
> by scalar to vector optimization.  We need to convert a debug insn if
> it has a variable in a TImode register.
>
> Tested on x86-64.  OK for trunk?

Ilya, does this approach look good to you? Also, does DImode STV
conversion need similar handling of debug insns?

Uros.

>
> H.J.
> 
> gcc/
>
> PR target/71549
> * config/i386/i386.c (timode_scalar_to_vector_candidate_p):
> Return true if debug insn has a variable in TImode register.
> (timode_remove_non_convertible_regs): Skip debug insn.
> (scalar_chain::convert_insn): Change return type to bool.
> (scalar_chain::add_insn): Don't check registers in debug insn.
> (dimode_scalar_chain::convert_insn): Change return type to bool
> and always return true.
> (timode_scalar_chain::convert_insn): Change return type to bool.
> Convert V1TImode register to SUBREG TImode in debug insn.  Return
> false if debug insn isn't converted.  Otherwise, return true.
> (scalar_chain::convert): Increment converted_insns only if
> convert_insn returns true.
>
> gcc/testsuite/
>
> PR target/71549
> * gcc.target/i386/pr71549.c: New test.
> ---
>  gcc/config/i386/i386.c  | 58 
> -
>  gcc/testsuite/gcc.target/i386/pr71549.c | 24 ++
>  2 files changed, 74 insertions(+), 8 deletions(-)
>  create mode 100644 gcc/testsuite/gcc.target/i386/pr71549.c
>
> diff --git a/gcc/config/i386/i386.c b/gcc/config/i386/i386.c
> index 56a5b9c..e17fc53 100644
> --- a/gcc/config/i386/i386.c
> +++ b/gcc/config/i386/i386.c
> @@ -2845,6 +2845,16 @@ dimode_scalar_to_vector_candidate_p (rtx_insn *insn)
>  static bool
>  timode_scalar_to_vector_candidate_p (rtx_insn *insn)
>  {
> +  if (DEBUG_INSN_P (insn))
> +{
> +  /* If a variable is put in a TImode register, which may be
> +converted to V1TImode, we need to convert this debug insn.  */
> +  rtx val = PATTERN (insn);
> +  return (GET_MODE (val) == TImode
> + && GET_CODE (val) == VAR_LOCATION
> + && REG_P (PAT_VAR_LOCATION_LOC (val)));
> +}
> +
>rtx def_set = single_set (insn);
>
>if (!def_set)
> @@ -3012,7 +3022,12 @@ timode_remove_non_convertible_regs (bitmap candidates)
>
>EXECUTE_IF_SET_IN_BITMAP (candidates, 0, id, bi)
>  {
> -  rtx def_set = single_set (DF_INSN_UID_GET (id)->insn);
> +  rtx_insn *insn = DF_INSN_UID_GET (id)->insn;
> +  /* Debug insn isn't a SET insn.  */
> +  if (DEBUG_INSN_P (insn))
> +   continue;
> +
> +  rtx def_set = single_set (insn);
>rtx dest = SET_DEST (def_set);
>rtx src = SET_SRC (def_set);
>
> @@ -3111,7 +3126,7 @@ class scalar_chain
>void add_insn (bitmap candidates, unsigned insn_uid);
>void analyze_register_chain (bitmap candidates, df_ref ref);
>virtual void mark_dual_mode_def (df_ref def) = 0;
> -  virtual void convert_insn (rtx_insn *insn) = 0;
> +  virtual bool convert_insn (rtx_insn *insn) = 0;
>virtual void convert_registers () = 0;
>  };
>
> @@ -3123,7 +3138,7 @@ class dimode_scalar_chain : public scalar_chain
>void mark_dual_mode_def (df_ref def);
>rtx replace_with_subreg (rtx x, rtx reg, rtx subreg);
>void replace_with_subreg_in_insn (rtx_insn *insn, rtx reg, rtx subreg);
> -  void convert_insn (rtx_insn *insn);
> +  bool convert_insn (rtx_insn *insn);
>void convert_op (rtx *op, rtx_insn *insn);
>void convert_reg (unsigned regno);
>void make_vector_copies (unsigned regno);
> @@ -3139,7 +3154,7 @@ class timode_scalar_chain : public scalar_chain
>
>   private:
>void mark_dual_mode_def (df_ref def);
> -  void convert_insn (rtx_insn *insn);
> +  bool convert_insn (rtx_insn *insn);
>/* We don't convert registers to difference size.  */
>void convert_registers () {}
>  };
> @@ -3276,6 +3291,10 @@ scalar_chain::add_insn (bitmap candidates, unsigned int insn_uid)
>bitmap_set_bit (insns, insn_uid);
>
>rtx_insn *insn = DF_INSN_UID_GET (insn_uid)->insn;
> +  /* Debug insn isn't a SET insn.  */
> +  if (DEBUG_INSN_P (insn))
> +return;
> +
>rtx def_set = single_set (insn);
>if (def_set && REG_P (SET_DEST (def_set))
>&& !HARD_REGISTER_P (SET_DEST (def_set)))
> @@ -3708,7 +3727,7 @@ dimode_scalar_chain::convert_op (rtx *op, rtx_insn 
> *insn)
>
>  /* Convert INSN to vector mode.  */
>
> -void
> +bool
>  dimode_scalar_chain::convert_insn (rtx_insn *insn)
>  {
>rtx def_set = single_set (insn);
> @@ -3788,13 +3807,34 @@ dimode_scalar_chain::convert_insn (rtx_insn *insn)
>INSN_CODE (insn) = -1;
>recog_memoized (insn);
>df_insn_rescan (insn);
> +
> +  return true;
>  }
>
>  /* Convert INSN from TImode to V1T1mode.  */
>
> -void
> +bool
>  timode_scalar_chain::convert_insn (rtx_insn *insn)
>  {
> +  if 

Re: RFC: 2->2 combine patch

2016-06-20 Thread Kyrill Tkachov

Hi Segher,

On 17/06/16 01:07, Segher Boessenkool wrote:

On Fri, Jun 10, 2016 at 11:20:22AM +0200, Richard Biener wrote:

With the proposed cost change for vector construction we will end up
vectorizing the testcase in PR68961 again (on x86_64 and likely
on ppc64le as well after that target gets adjustments).  Currently
we can't optimize that away again noticing the direct overlap of
argument and return registers.  The obstacle is

(insn 7 4 8 2 (set (reg:V2DF 93)
 (vec_concat:V2DF (reg/v:DF 91 [ a ])
 (reg/v:DF 92 [ aa ])))
...
(insn 21 8 24 2 (set (reg:DI 97 [ D.1756 ])
 (subreg:DI (reg:TI 88 [ D.1756 ]) 0))
(insn 24 21 11 2 (set (reg:DI 100 [+8 ])
 (subreg:DI (reg:TI 88 [ D.1756 ]) 8))

which we eventually optimize to DFmode subregs of (reg:V2DF 93).

First of all simplify_subreg doesn't handle the subregs of a vec_concat
(easy fix below).

Then combine doesn't like to simplify the multi-use (it tries some
parallel it seems).

Combine will not do a 2->2 combination currently.  Say it is combining
A with a later B into C, and the result of A is used again later, then
it tries a parallel of A with C.  That usually does not match an insn for
the target.

If this were a 3->2 (or 4->2) combination, or A or C are no-op moves
(so that they will disappear later in combines), combine will break the
parallel into two and see if that matches.  We can in fact do that for
2->2 combinations as well: this removes a log_link (from A to B), so
combine cannot get into an infinite loop, even though it does not make
the number of RTL insns lower.

So I tried out the patch below.  It decreases code size on most targets
(mostly fixed length insn targets), and increases it a small bit on some
variable length insn targets (doing an op twice, instead of doing it once
and doing a move).  It looks to be all good there too, but there are so
many changes that it is almost impossible to really check.

So: can people try this out with their favourite benchmarks, please?


I hope to give this a run on AArch64 but I'll probably manage to get to it only
next week.
In the meantime I've had a quick look at some SPEC2006 codegen on aarch64.
Some benchmarks decrease in size, others increase. One recurring theme I
spotted is shifts being repeatedly combined with arithmetic operations rather
than being computed once and reusing the result. For example:
    lsl     x30, x15, 3
    add     x4, x5, x30
    add     x9, x7, x30
    add     x24, x8, x30
    add     x10, x0, x30
    add     x2, x22, x30

becoming (modulo regalloc fluctuations):
    add     x14, x2, x15, lsl 3
    add     x13, x22, x15, lsl 3
    add     x21, x4, x15, lsl 3
    add     x6, x0, x15, lsl 3
    add     x3, x30, x15, lsl 3

which, while saving one instruction, can be harmful overall because the extra
shift operation in the arithmetic instructions can increase the latency of
the instruction. I believe the aarch64 rtx costs should convey this
information. Do you expect RTX costs to gate this behaviour?
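
To make the trade-off concrete, here is a minimal C sketch of source code whose
address-style arithmetic shares one scaled index. The function names are
hypothetical and are not taken from SPEC2006. Whether a compiler emits a single
lsl followed by plain adds, or folds the shift into every add, depends on its
cost model; both forms compute the same values.

```c
#include <stdint.h>

/* Form 1: compute the scaled index once and reuse it.  This corresponds
   to one "lsl" followed by plain "add" instructions.  */
void offsets_shared(const uint64_t base[5], uint64_t idx, uint64_t out[5])
{
  uint64_t scaled = idx << 3;
  for (int i = 0; i < 5; i++)
    out[i] = base[i] + scaled;
}

/* Form 2: fold the shift into each addition.  This corresponds to
   "add xN, xM, x15, lsl 3" repeated for every result.  */
void offsets_folded(const uint64_t base[5], uint64_t idx, uint64_t out[5])
{
  for (int i = 0; i < 5; i++)
    out[i] = base[i] + (idx << 3);
}
```

The second form has one instruction fewer, but every add then carries the
shift, which is the latency concern raised above.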

Some occurrences that hurt code size look like:
    cmp     x8, x11, asr 5

being transformed into:
    asr     x12, x11, 5
    cmp     x12, x8, uxtw //zero-extend x8
with the user of the condition code inverted to match the change in order of
operands to the comparisons.
I haven't looked at the RTL dumps yet to figure out why this is happening; it
could be a backend RTL representation issue.

Kyrill




Segher


diff --git a/gcc/combine.c b/gcc/combine.c
index 6b5d000..2c99b4e 100644
--- a/gcc/combine.c
+++ b/gcc/combine.c
@@ -3933,8 +3933,6 @@ try_combine (rtx_insn *i3, rtx_insn *i2, rtx_insn *i1, 
rtx_insn *i0,
   && XVECLEN (newpat, 0) == 2
   && GET_CODE (XVECEXP (newpat, 0, 0)) == SET
   && GET_CODE (XVECEXP (newpat, 0, 1)) == SET
-  && (i1 || set_noop_p (XVECEXP (newpat, 0, 0))
- || set_noop_p (XVECEXP (newpat, 0, 1)))
   && GET_CODE (SET_DEST (XVECEXP (newpat, 0, 0))) != ZERO_EXTRACT
   && GET_CODE (SET_DEST (XVECEXP (newpat, 0, 0))) != STRICT_LOW_PART
   && GET_CODE (SET_DEST (XVECEXP (newpat, 0, 1))) != ZERO_EXTRACT





[patch, avr] Fix PR30417: Wrap -Tdata into %{!Tdata:...}.

2016-06-20 Thread Georg-Johann Lay
This patch allows -Tdata and -Ttext to be specified on the command line for
MCUs where the specs file sets these options.  For -mmcu=atmega88, for example,
the respective specs file reads:


*link_data_start:
-Tdata 0x800100

and the patch changes this to

*link_data_start:
%{!Tdata:-Tdata 0x800100}

Same for *link_text_start and -Ttext.
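
As a rough sketch of how such a spec line can be produced from C, the helper
below formats the guarded option into a buffer. The name format_data_start is
hypothetical, and the 0x800000 offset for the AVR data address space is an
assumption chosen so that a start of 0x100 yields the 0x800100 shown above.
In a printf format, "%%" emits a literal '%', and in a GCC spec string
%{!Tdata:X} substitutes X only when -Tdata was not given on the command line.

```c
#include <stdio.h>

/* Hypothetical helper: write the guarded -Tdata spec into BUF.
   Returns the number of characters snprintf would have written.  */
int format_data_start(char *buf, size_t size, unsigned long data_section_start)
{
  /* "%%" yields one literal '%' so the output starts with "%{!Tdata:".
     0x800000UL is the assumed data-space offset (see lead-in).  */
  return snprintf(buf, size, "%%{!Tdata:-Tdata 0x%lX}",
                  0x800000UL + data_section_start);
}
```

With this wrapping, a user-supplied -Tdata takes precedence because the
default is simply not emitted.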

Ok for trunk and backport?

Johann


PR target/30417
* config/avr/gen-avr-mmcu-specs.c (print_mcu):
[*link_data_start]: Wrap -Tdata into %{!Tdata:...}.
[*link_text_start]: Wrap -Ttext into %{!Ttext:...}.

Index: config/avr/gen-avr-mmcu-specs.c
===
--- config/avr/gen-avr-mmcu-specs.c	(revision 237587)
+++ config/avr/gen-avr-mmcu-specs.c	(working copy)
@@ -242,12 +242,13 @@ bool is_arch = NULL == mcu->macro;
   fprintf (f, "*link_data_start:\n");
   if (mcu->data_section_start
   != arch->default_data_section_start)
-fprintf (f, "\t-Tdata 0x%lX", 0x80UL + mcu->data_section_start);
+fprintf (f, "\t%%{!Tdata:-Tdata 0x%lX}",
+ 0x80UL + mcu->data_section_start);
   fprintf (f, "\n\n");
 
   fprintf (f, "*link_text_start:\n");
   if (mcu->text_section_start != 0x0)
-fprintf (f, "\t-Ttext 0x%lX", 0UL + mcu->text_section_start);
+fprintf (f, "\t%%{!Ttext:-Ttext 0x%lX}", 0UL + mcu->text_section_start);
   fprintf (f, "\n\n");
 }
 


Re: [Patch AArch64] Add some more missing intrinsics

2016-06-20 Thread Richard Earnshaw (lists)
On 13/06/16 17:31, James Greenhalgh wrote:
> 
> Hi,
> 
> Inspired by Jiong's recent work, here are some more missing intrinsics,
> and a smoke test for each of them.
> 
> This patch covers:
> 
>   vcvt_n_f64_s64
>   vcvt_n_f64_u64
>   vcvt_n_s64_f64
>   vcvt_n_u64_f64
>   vcvt_f64_s64
>   vrecpe_f64
>   vcvt_f64_u64
>   vrecps_f64
> 
> Tested on aarch64-none-elf, and on an internal testsuite for Neon
> intrinsics.
> 
> Note that the new tests will ICE without the fixups in
> https://gcc.gnu.org/ml/gcc-patches/2016-06/msg00805.html
> 
> OK?
> 

OK, but please fix the nit that Kyrill highlighted.

R.

> Thanks,
> James
> 
> ---
> gcc/ChangeLog
> 
> 2016-06-10  James Greenhalgh  
> 
>   * config/aarch64/arm_neon.h (vcvt_n_f64_s64): New.
>   (vcvt_n_f64_u64): Likewise.
>   (vcvt_n_s64_f64): Likewise.
>   (vcvt_n_u64_f64): Likewise.
>   (vcvt_f64_s64): Likewise.
>   (vrecpe_f64): Likewise.
>   (vcvt_f64_u64): Likewise.
>   (vrecps_f64): Likewise.
> 
> gcc/testsuite/ChangeLog
> 
> 2016-06-10  James Greenhalgh  
> 
>   * gcc.target/aarch64/vcvt_f64_1.c: New.
>   * gcc.target/aarch64/vcvt_n_f64_1.c: New.
>   * gcc.target/aarch64/vrecp_f64_1.c: New.
> 
> 
> 0001-Patch-AArch64-Add-some-more-missing-intrinsics.patch
> 
> 
> diff --git a/gcc/config/aarch64/arm_neon.h b/gcc/config/aarch64/arm_neon.h
> index f70b6d3..2f90938 100644
> --- a/gcc/config/aarch64/arm_neon.h
> +++ b/gcc/config/aarch64/arm_neon.h
> @@ -12447,6 +12447,20 @@ vcvt_n_f32_u32 (uint32x2_t __a, const int __b)
>return __builtin_aarch64_ucvtfv2si_sus (__a, __b);
>  }
>  
> +__extension__ static __inline float64x1_t __attribute__ ((__always_inline__))
> +vcvt_n_f64_s64 (int64x1_t __a, const int __b)
> +{
> +  return (float64x1_t)
> +{ __builtin_aarch64_scvtfdi (vget_lane_s64 (__a, 0), __b) };
> +}
> +
> +__extension__ static __inline float64x1_t __attribute__ ((__always_inline__))
> +vcvt_n_f64_u64 (uint64x1_t __a, const int __b)
> +{
> +  return (float64x1_t)
> +{ __builtin_aarch64_ucvtfdi_sus (vget_lane_u64 (__a, 0), __b) };
> +}
> +
>  __extension__ static __inline float32x4_t __attribute__ ((__always_inline__))
>  vcvtq_n_f32_s32 (int32x4_t __a, const int __b)
>  {
> @@ -12509,6 +12523,20 @@ vcvt_n_u32_f32 (float32x2_t __a, const int __b)
>return __builtin_aarch64_fcvtzuv2sf_uss (__a, __b);
>  }
>  
> +__extension__ static __inline int64x1_t __attribute__ ((__always_inline__))
> +vcvt_n_s64_f64 (float64x1_t __a, const int __b)
> +{
> +  return (int64x1_t)
> +{ __builtin_aarch64_fcvtzsdf (vget_lane_f64 (__a, 0), __b) };
> +}
> +
> +__extension__ static __inline uint64x1_t __attribute__ ((__always_inline__))
> +vcvt_n_u64_f64 (float64x1_t __a, const int __b)
> +{
> +  return (uint64x1_t)
> +{ __builtin_aarch64_fcvtzudf_uss (vget_lane_f64 (__a, 0), __b) };
> +}
> +
>  __extension__ static __inline int32x4_t __attribute__ ((__always_inline__))
>  vcvtq_n_s32_f32 (float32x4_t __a, const int __b)
>  {
> @@ -12571,6 +12599,18 @@ vcvt_f32_u32 (uint32x2_t __a)
>return __builtin_aarch64_floatunsv2siv2sf ((int32x2_t) __a);
>  }
>  
> +__extension__ static __inline float64x1_t __attribute__ ((__always_inline__))
> +vcvt_f64_s64 (int64x1_t __a)
> +{
> +  return (float64x1_t) { vget_lane_s64 (__a, 0) };
> +}
> +
> +__extension__ static __inline float64x1_t __attribute__ ((__always_inline__))
> +vcvt_f64_u64 (uint64x1_t __a)
> +{
> +  return (float64x1_t) { vget_lane_u64 (__a, 0) };
> +}
> +
>  __extension__ static __inline float32x4_t __attribute__ ((__always_inline__))
>  vcvtq_f32_s32 (int32x4_t __a)
>  {
> @@ -20659,6 +20699,12 @@ vrecpe_f32 (float32x2_t __a)
>return __builtin_aarch64_frecpev2sf (__a);
>  }
>  
> +__extension__ static __inline float64x1_t __attribute__ ((__always_inline__))
> +vrecpe_f64 (float64x1_t __a)
> +{
> +  return (float64x1_t) { vrecped_f64 (vget_lane_f64 (__a, 0)) };
> +}
> +
>  __extension__ static __inline float32x4_t __attribute__ ((__always_inline__))
>  vrecpeq_f32 (float32x4_t __a)
>  {
> @@ -20691,6 +20737,13 @@ vrecps_f32 (float32x2_t __a, float32x2_t __b)
>return __builtin_aarch64_frecpsv2sf (__a, __b);
>  }
>  
> +__extension__ static __inline float64x1_t __attribute__ ((__always_inline__))
> +vrecps_f64 (float64x1_t __a, float64x1_t __b)
> +{
> +  return (float64x1_t) { vrecpsd_f64  (vget_lane_f64 (__a, 0),
> +vget_lane_f64 (__b, 0)) };
> +}
> +
>  __extension__ static __inline float32x4_t __attribute__ ((__always_inline__))
>  vrecpsq_f32 (float32x4_t __a, float32x4_t __b)
>  {
> diff --git a/gcc/testsuite/gcc.target/aarch64/vcvt_f64_1.c 
> b/gcc/testsuite/gcc.target/aarch64/vcvt_f64_1.c
> new file mode 100644
> index 000..b7ee7af
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/aarch64/vcvt_f64_1.c
> @@ -0,0 +1,48 @@
> +/* { dg-do compile } */
> +/* { dg-options "-O2" } */
> +
> +#include "arm_neon.h"
> +
> +/* For each of these intrinsics, we're 

PING^3: [PATCH] PR33661 Fix problem with register asm in templates

2016-06-20 Thread Andreas Krebbel
Hi Jason,

could you please have a look?

https://gcc.gnu.org/ml/gcc-patches/2015-06/msg00904.html

Thanks!

-Andreas-



Re: [PATCH] Giant concepts patch

2016-06-20 Thread Jason Merrill
On Fri, Mar 25, 2016 at 1:33 AM, Andrew Sutton
 wrote:
> I'll just leave this here...
>
> This patch significantly improves performance with concepts (i.e.,
> makes it actually usable for real systems) and improves the
> specificity of diagnostics when constraints fail.
>
> Unfortunately, this isn't easily submittable in small pieces because
> it completely replaces most of the core processing routines for
> constraints, including (essentially) a complete rewrite of logic.cc
> and the diagnostics in constraint.cc. More perfective work could be
> done related to diagnostics, but this needs to be applied first.
>
> As part of the patch, I added timevars for constraint satisfaction and
> subsumption. In template-heavy code (~80KLOC), I'm seeing satisfaction
> account for ~6% of compilation time and subsumption ~2%. Template
> instantiation remains ~35%, but I think there's still room for
> improvement in concepts. It just requires experimentation.
>
> Tests involving significant number of disjunctions have yet to
> register > 1% of compilation time spent in subsumption, but I'm still
> testing.

Thanks, I've been working on integrating this patch, hoping to have it
in for 6.2.  Have you done more work on it since you sent this out?

A few issues:

I've run into some trouble building cmcstl2: declarator requirements
on a function can lead to constraints that tsubst_constraint doesn't
handle.  What was your theory of only handling a few _CONSTR codes
there?  This is blocking me from checking in the patch.

Adding gcc_unreachable to the ARGUMENT_PACK_SELECT handling in concept
arg hash/compare doesn't seem to break anything in either the GCC
testsuite or your stl2.  Do you have a testcase where that code is
still needed?

> Also, it might be worth noting that partial specialization of variable
> templates is currently broken. We don't seem to be emitting template
> arguments as part of the mangled name, leading to lots and lots of
> late redefinition errors.

This should be fixed now.

Jason


Re: [Patch AArch64] Fixup to fcvt patterns added in r237200

2016-06-20 Thread Richard Earnshaw (lists)
On 10/06/16 13:29, James Greenhalgh wrote:
> 
> Hi,
> 
> My autotester picked up some issues with the vcvt{ds}_n_* intrinsics
> added in r237200.
> 
> The iterators in this pattern do not resolve, as they have not been
> explicitly tied to the mode iterator (rather than the code iterator)
> used by the pattern.
> 
> This fixup adds the attribute tags, allowing the patterns to work
> correctly.
> 
> Additionally, the types assigned to these instructions were wrong, and
> would permit the immediate operand to be in a register. This will then
> develop into an ICE as the patterns require an immediate operand, and so
> won't match. The ICE can be exposed by writing a wrapping function around
> the vcvtd_n_* intrinsics, which forces the immediate operand to a register.
> We have the infrastructure to error to the user rather than ICEing, but it
> needs some different types, which this patch adds.
> 
> I've checked this with an aarch64-none-elf test run, and run it through
> several rounds of my autotester for aarch64-none-elf and
> aarch64_be-none-elf.
> 
> OK?
> 

OK.

R.

> Thanks,
> James
> 
> ---
> 2016-06-10  James Greenhalgh  
> 
>   * config/aarch64/aarch64.md
>   (3): Add attributes to
>   iterators.
>   (3): Likewise.  Correct
>   attributes.
>   * config/aarch64/aarch64-builtins.c
>   (aarch64_types_binop_uss_qualifiers): Delete.
>   (TYPES_BINOP_USS): Likewise.
>   (aarch64_types_binop_sus_qualifiers): Likewise.
>   (TYPES_BINOP_SUS): Likewise.
>   (aarch64_types_fcvt_from_unsigned_qualifiers): New.
>   (TYPES_FCVTIMM_SUS): Likewise.
>   * config/aarch64/aarch64-simd-builtins.def (scvtf): Use SHIFTIMM
>   rather than BINOP.
>   (ucvtf): Use FCVTIMM_SUS rather than BINOP_SUS.
>   (fcvtzs): Use SHIFTIMM rather than BINOP.
>   (fcvtzu): Use SHIFTIMM_USS rather than BINOP_USS.
> 
> 
> 0001-Patch-AArch64-Fixup-to-fcvt-patterns-added-in-r23720.patch
> 
> 
> diff --git a/gcc/config/aarch64/aarch64-builtins.c 
> b/gcc/config/aarch64/aarch64-builtins.c
> index 262ea1c..6b90b2a 100644
> --- a/gcc/config/aarch64/aarch64-builtins.c
> +++ b/gcc/config/aarch64/aarch64-builtins.c
> @@ -139,14 +139,6 @@ aarch64_types_binop_ssu_qualifiers[SIMD_MAX_BUILTIN_ARGS]
>= { qualifier_none, qualifier_none, qualifier_unsigned };
>  #define TYPES_BINOP_SSU (aarch64_types_binop_ssu_qualifiers)
>  static enum aarch64_type_qualifiers
> -aarch64_types_binop_uss_qualifiers[SIMD_MAX_BUILTIN_ARGS]
> -  = { qualifier_unsigned, qualifier_none, qualifier_none };
> -#define TYPES_BINOP_USS (aarch64_types_binop_uss_qualifiers)
> -static enum aarch64_type_qualifiers
> -aarch64_types_binop_sus_qualifiers[SIMD_MAX_BUILTIN_ARGS]
> -  = { qualifier_none, qualifier_unsigned, qualifier_none };
> -#define TYPES_BINOP_SUS (aarch64_types_binop_sus_qualifiers)
> -static enum aarch64_type_qualifiers
>  aarch64_types_binopp_qualifiers[SIMD_MAX_BUILTIN_ARGS]
>= { qualifier_poly, qualifier_poly, qualifier_poly };
>  #define TYPES_BINOPP (aarch64_types_binopp_qualifiers)
> @@ -181,6 +173,10 @@ 
> aarch64_types_shift_to_unsigned_qualifiers[SIMD_MAX_BUILTIN_ARGS]
>= { qualifier_unsigned, qualifier_none, qualifier_immediate };
>  #define TYPES_SHIFTIMM_USS (aarch64_types_shift_to_unsigned_qualifiers)
>  static enum aarch64_type_qualifiers
> +aarch64_types_fcvt_from_unsigned_qualifiers[SIMD_MAX_BUILTIN_ARGS]
> +  = { qualifier_none, qualifier_unsigned, qualifier_immediate };
> +#define TYPES_FCVTIMM_SUS (aarch64_types_fcvt_from_unsigned_qualifiers)
> +static enum aarch64_type_qualifiers
>  aarch64_types_unsigned_shift_qualifiers[SIMD_MAX_BUILTIN_ARGS]
>= { qualifier_unsigned, qualifier_unsigned, qualifier_immediate };
>  #define TYPES_USHIFTIMM (aarch64_types_unsigned_shift_qualifiers)
> diff --git a/gcc/config/aarch64/aarch64-simd-builtins.def 
> b/gcc/config/aarch64/aarch64-simd-builtins.def
> index 1332734..02d465b 100644
> --- a/gcc/config/aarch64/aarch64-simd-builtins.def
> +++ b/gcc/config/aarch64/aarch64-simd-builtins.def
> @@ -447,10 +447,10 @@
>BUILTIN_VSDQ_HSI (QUADOP_LANE, sqrdmlsh_laneq, 0)
>  
>/* Implemented by <*><*>3.  */
> -  BUILTIN_VSDQ_SDI (BINOP, scvtf, 3)
> -  BUILTIN_VSDQ_SDI (BINOP_SUS, ucvtf, 3)
> -  BUILTIN_VALLF (BINOP, fcvtzs, 3)
> -  BUILTIN_VALLF (BINOP_USS, fcvtzu, 3)
> +  BUILTIN_VSDQ_SDI (SHIFTIMM, scvtf, 3)
> +  BUILTIN_VSDQ_SDI (FCVTIMM_SUS, ucvtf, 3)
> +  BUILTIN_VALLF (SHIFTIMM, fcvtzs, 3)
> +  BUILTIN_VALLF (SHIFTIMM_USS, fcvtzu, 3)
>  
>/* Implemented by aarch64_rsqrte.  */
>BUILTIN_VALLF (UNOP, rsqrte, 0)
> diff --git a/gcc/config/aarch64/aarch64.md b/gcc/config/aarch64/aarch64.md
> index 926f2da..b3ae42b 100644
> --- a/gcc/config/aarch64/aarch64.md
> +++ b/gcc/config/aarch64/aarch64.md
> @@ -4640,8 +4640,8 @@
>FCVT_F2FIXED))]
>""
>"@
> -   \t%0, %1, #%2
> -   \t%0, %1, #%2"
> +   \t%0, %1, #%2
> +   \t%0, %1, #%2"
>[(set_attr "type" "f_cvtf2i, neon_fp_to_int_")
> 

Re: [AArch64] Give some new costs for Cortex-A57 floating-point operations

2016-06-20 Thread Richard Earnshaw (lists)
On 03/06/16 09:35, James Greenhalgh wrote:
> 
> Hi,
> 
> This patch rebases the floating-point cost table for Cortex-A57 to be
> relative to the cost of a floating-point move. This is in response to this
> feedback from Richard Sandiford [2] on Ramana's patch to calls.c [1] from
> 2014:
> 
>   I think this is really a bug in the backend.  The backend is assigning a
>   cost of COSTS_N_INSNS (3) to a floating-point constant not because the
>   constant itself is expensive -- it's actually as cheap as a register
>   in this context -- but because the backend considers floating-point
>   moves to be 3 times more expensive than cheap integer moves.
> 
> The argument is that a move in mode X should be treated with cost
> COSTS_N_INSNS (1), and other instructions should have a cost relative to
> that move. For example, in this patch we say that instructions building a
> floating-point constant are the same cost as a floating-point register to
> register move. Fixing this fixes the issue Ramana was seeing, in a way
> consistent with what other back-ends do.
> 
> This patch gives a small improvement to Spec2000FP on a Cortex-A57
> platform.
> 
> Bootstrapped on aarch64-none-linux-gnu with no issues.
> 
> OK?
> 

OK.

R.
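
A small sketch of the arithmetic behind the rebasing in the quoted patch,
under a simplifying assumption: an operation's total cost is modelled as one
baseline instruction plus the "extra" cost from the CPU table. The
COSTS_N_INSNS definition mirrors the one in GCC's rtl.h; total_cost and the
enum names are hypothetical. With the old fpconst entry the total comes to
COSTS_N_INSNS (3), the figure mentioned in the quoted feedback; with the new
entry of 0 it matches a single move.

```c
/* Cost of N fast instructions, as defined in GCC's rtl.h.  */
#define COSTS_N_INSNS(N) ((N) * 4)

/* fpconst "extra" costs before and after the patch.  */
enum {
  OLD_FPCONST_EXTRA = COSTS_N_INSNS (2),
  NEW_FPCONST_EXTRA = 0
};

/* Assumed model (not the backend's exact logic): baseline cost of one
   instruction plus the extra cost taken from the table.  */
int total_cost(int extra)
{
  return COSTS_N_INSNS (1) + extra;
}
```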
> Thanks,
> James
> 
> ---
> 2016-06-03  James Greenhalgh  
> 
>   * config/arm/aarch-cost-tables.h (cortexa57_extra_costs): Make FP
>   costs relative to the cost of a register move.
> 
> [1] https://gcc.gnu.org/ml/gcc-patches/2014-10/msg00136.html
> [2] https://gcc.gnu.org/ml/gcc-patches/2014-10/msg00391.html
> 
> 
> 0001-AArch64-Give-some-new-costs-for-Cortex-A57-floating-.patch
> 
> 
> diff --git a/gcc/config/arm/aarch-cost-tables.h 
> b/gcc/config/arm/aarch-cost-tables.h
> index c971b30..5f42253 100644
> --- a/gcc/config/arm/aarch-cost-tables.h
> +++ b/gcc/config/arm/aarch-cost-tables.h
> @@ -294,35 +294,35 @@ const struct cpu_cost_table cortexa57_extra_costs =
>{
>  /* FP SFmode */
>  {
> -  COSTS_N_INSNS (17),  /* div.  */
> -  COSTS_N_INSNS (5),   /* mult.  */
> -  COSTS_N_INSNS (9),   /* mult_addsub. */
> -  COSTS_N_INSNS (9),   /* fma.  */
> -  COSTS_N_INSNS (4),   /* addsub.  */
> -  COSTS_N_INSNS (2),   /* fpconst. */
> -  COSTS_N_INSNS (2),   /* neg.  */
> -  COSTS_N_INSNS (2),   /* compare.  */
> -  COSTS_N_INSNS (4),   /* widen.  */
> -  COSTS_N_INSNS (4),   /* narrow.  */
> -  COSTS_N_INSNS (4),   /* toint.  */
> -  COSTS_N_INSNS (4),   /* fromint.  */
> -  COSTS_N_INSNS (4)/* roundint.  */
> +  COSTS_N_INSNS (6),  /* div.  */
> +  COSTS_N_INSNS (1),   /* mult.  */
> +  COSTS_N_INSNS (2),   /* mult_addsub.  */
> +  COSTS_N_INSNS (2),   /* fma.  */
> +  COSTS_N_INSNS (1),   /* addsub.  */
> +  0,/* fpconst.  */
> +  0,/* neg.  */
> +  0,/* compare.  */
> +  COSTS_N_INSNS (1),   /* widen.  */
> +  COSTS_N_INSNS (1),   /* narrow.  */
> +  COSTS_N_INSNS (1),   /* toint.  */
> +  COSTS_N_INSNS (1),   /* fromint.  */
> +  COSTS_N_INSNS (1)/* roundint.  */
>  },
>  /* FP DFmode */
>  {
> -  COSTS_N_INSNS (31),  /* div.  */
> -  COSTS_N_INSNS (5),   /* mult.  */
> -  COSTS_N_INSNS (9),   /* mult_addsub.  */
> -  COSTS_N_INSNS (9),   /* fma.  */
> -  COSTS_N_INSNS (4),   /* addsub.  */
> -  COSTS_N_INSNS (2),   /* fpconst.  */
> -  COSTS_N_INSNS (2),   /* neg.  */
> -  COSTS_N_INSNS (2),   /* compare.  */
> -  COSTS_N_INSNS (4),   /* widen.  */
> -  COSTS_N_INSNS (4),   /* narrow.  */
> -  COSTS_N_INSNS (4),   /* toint.  */
> -  COSTS_N_INSNS (4),   /* fromint.  */
> -  COSTS_N_INSNS (4)/* roundint.  */
> +  COSTS_N_INSNS (11),  /* div.  */
> +  COSTS_N_INSNS (1),   /* mult.  */
> +  COSTS_N_INSNS (2),   /* mult_addsub.  */
> +  COSTS_N_INSNS (2),   /* fma.  */
> +  COSTS_N_INSNS (1),   /* addsub.  */
> +  0,/* fpconst.  */
> +  0,/* neg.  */
> +  0,/* compare.  */
> +  COSTS_N_INSNS (1),   /* widen.  */
> +  COSTS_N_INSNS (1),   /* narrow.  */
> +  COSTS_N_INSNS (1),   /* toint.  */
> +  COSTS_N_INSNS (1),   /* fromint.  */
> +  COSTS_N_INSNS (1)/* roundint.  */
>  }
>},
>/* Vector */
> 



[Ada] Reimplementation of type invariants

2016-06-20 Thread Arnaud Charlet
This patch prevents the insertion of the invariant procedure declaration and
body when the context is a generic unit. This ensures that generated code does
not permeate the template.


-- Source --


--  tester.ads

package Tester is
   type Type_Id is
 (Ext_1_Id,
  Ext_1_FV_Id,
  Ext_2_Id,
  Ext_3_Id,
  Ext_4_Id,
  Ext_4_FV_Id,
  Ext_5_Id,
  Ext_6_Id,
  Ext_6_FV_Id,
  Ext_7_Id,
  Ext_8_Id,
  Iface_1_Id,
  Iface_2_Id,
  Iface_3_Id,
  Iface_4_Id,
  Par_1_Id,
  Par_2_FV_Id,
  Par_3_Id,
  Par_4_Id,
  Par_4_FV_Id,
  Prot_1_FV_Id,
  Prot_2_Id,
  Prot_3_Id,
  Prot_3_FV_Id,
  Synch_1_Id,
  Synch_2_Id,
  Tag_1_Id,
  Tag_2_Id,
  Tag_3_Id,
  Tag_4_Id,
  Tag_4_FV_Id,
  Tag_5_Id,
  Tag_6_Id,
  Tag_7_Id,
  Tag_8_Id,
  Tag_9_Id,
  Tag_10_Id,
  Tag_11_Id,
  Tag_12_Id,
  Tag_13_Id,
  Tag_14_Id,
  Tag_15_Id,
  Tag_15_FV_Id,
  Tag_16_Id,
  Tag_17_Id,
  Tag_18_Id,
  Tag_19_Id,
  Tag_20_Id,
  Tag_20_FV_Id,
  Tag_21_Id,
  Tag_22_Id,
  Tag_23_Id,
  Tag_24_Id,
  Tag_24_FV_Id,
  Tag_25_Id,
  Tag_26_Id,
  Tag_27_Id,
  Tag_28_Id,
  Task_1_Id,
  Task_2_Id,
  Task_2_FV_Id,
  Untag_1_Id,
  Untag_2_Id,
  Untag_3_Id,
  Untag_4_Id,
  Untag_5_Id,
  Untag_6_Id,
  Untag_7_Id,
  Untag_8_Id,
  Untag_9_Id);

   type Results is array (Type_Id) of Boolean;

   function Mark (Typ : Type_Id) return Boolean;
   --  Mark the result for a particular type as verified. The function always
   --  returns True.

   procedure Reset_Results;
   --  Reset the internally kept result state

   procedure Test_Results (Test_Id : String; Exp : Results);
   --  Ensure that the internally kept result state agrees with expected
   --  results Exp. Emit an error if this is not the case.
end Tester;

--  tester.adb

with Ada.Text_IO; use Ada.Text_IO;

package body Tester is
   State : Results;

   --
   -- Mark --
   --

   function Mark (Typ : Type_Id) return Boolean is
   begin
  State (Typ) := True;
  return True;
   end Mark;

   ---
   -- Reset_Results --
   ---

   procedure Reset_Results is
   begin
  State := (others => False);
   end Reset_Results;

   --
   -- Test_Results --
   --

   procedure Test_Results (Test_Id : String; Exp : Results) is
  Exp_Val   : Boolean;
  Posted: Boolean := False;
  State_Val : Boolean;

   begin
  for Index in Results'Range loop
 Exp_Val   := Exp (Index);
 State_Val := State (Index);

 if State_Val /= Exp_Val then
if not Posted then
   Posted := True;
   Put_Line (Test_Id & ": ERROR");
end if;

Put_Line ("  Expected: " & Exp_Val'Img & " for " & Index'Img);
Put_Line ("  Got:  " & State_Val'Img);
 end if;
  end loop;

  if not Posted then
 Put_Line (Test_Id & ": OK");
  end if;
   end Test_Results;
end Tester;

--  gen_invariants.ads

with Tester; use Tester;

generic
package Gen_Invariants is
   type Untag_1 is private
 with Type_Invariant => Mark (Untag_1_Id);

private
   type Untag_1 is null record;

   Obj_1 : Untag_1;
end Gen_Invariants;

-
-- Compilation --
-

$ gcc -c -gnata -gnatDG gen_invariants.ads
$ grep "Invariant" gen_invariants.ads.dg

Tested on x86_64-pc-linux-gnu, committed on trunk

2016-06-20  Hristian Kirtchev  

* exp_ch7.adb (Build_Invariant_Procedure_Body):
Always install the scope of the invariant procedure
in order to produce better error messages. Do not
insert the body when the context is a generic unit.
(Build_Invariant_Procedure_Declaration): Perform minimal
decoration of the invariant procedure and its formal parameter
in case they are not analyzed.  Do not insert the declaration
when the context is a generic unit.

Index: exp_ch7.adb
===
--- exp_ch7.adb (revision 237598)
+++ exp_ch7.adb (working copy)
@@ -4622,7 +4622,16 @@
 
   Set_Ghost_Mode_From_Entity (Work_Typ);
 
+  --  Emulate the environment of the invariant procedure by installing
+  --  its scope and formal parameters. Note that this is not need, but
+  --  having the scope of the invariant procedure installed helps with
+  --  the detection of invariant-related errors.
+
+  Push_Scope (Proc_Id);
+  Install_Formals (Proc_Id);
+
   Obj_Id := First_Formal (Proc_Id);
+  pragma Assert (Present (Obj_Id));
 
   --  The "partial" invariant procedure verifies the invariants of the
   --  partial view only.
@@ -4631,14 +4640,6 @@
  pragma Assert (Present (Priv_Typ));
  

[Ada] Adapt treatment of inherited classwide pre/post to GNATprove

2016-06-20 Thread Arnaud Charlet
In GNATprove mode, inherited classwide pre/post are copied to the
overriding subprogram declaration, so that GNATprove can find them to
verify Liskov Substitution Principle on SPARK code. The copied pre/post
are not turned into pragma checks anymore in GNATprove mode, so that they
are added to the Contract node of the overriding subprogram entity, which
makes it easier to deal with in GNATprove.

The type of the call node is also set to the appropriate type after the
function has been specialized in the copied pragma, in both GNATprove
mode and normal mode.

Tested on x86_64-pc-linux-gnu, committed on trunk

2016-06-20  Yannick Moy  

* sem_prag.adb, sem_prag.ads (Build_Pragma_Check_Equivalent):
Add parameter Keep_Pragma_Id to optionally keep
the identifier of the pragma instead of converting
to pragma Check. Also set type of new function call
appropriately.  (Collect_Inherited_Class_Wide_Conditions):
Call Build_Pragma_Check_Equivalent with the new parameter
Keep_Pragma_Id set to True to keep the identifier of the copied
pragma.
* sinfo.ads: Add comment.

Index: sem_prag.adb
===
--- sem_prag.adb(revision 237598)
+++ sem_prag.adb(working copy)
@@ -26277,9 +26277,10 @@
---
 
function Build_Pragma_Check_Equivalent
- (Prag : Node_Id;
-  Subp_Id  : Entity_Id := Empty;
-  Inher_Id : Entity_Id := Empty) return Node_Id
+ (Prag   : Node_Id;
+  Subp_Id: Entity_Id := Empty;
+  Inher_Id   : Entity_Id := Empty;
+  Keep_Pragma_Id : Boolean := False) return Node_Id
is
   Map : Elist_Id;
   --  List containing the following mappings
@@ -26361,6 +26362,15 @@
& "for

[Ada] Always consider Linker_Options from package System

2016-06-20 Thread Arnaud Charlet
On full runtimes, this was always the case.  On restricted ones, force System
to be in the closure of the program.
No test for full runtimes (as there is no behaviour change).

Tested on x86_64-pc-linux-gnu, committed on trunk

2016-06-20  Tristan Gingold  

* make.adb (Check_Standard_Library): Consider system.ads
if s-stalib.adb is not available.
* gnatbind.adb (Add_Artificial_ALI_File): New procedure extracted from
gnatbind.

Index: make.adb
===
--- make.adb(revision 237595)
+++ make.adb(working copy)
@@ -84,8 +84,11 @@
--  Make control characters visible
 
Standard_Library_Package_Body_Name : constant String := "s-stalib.adb";
-   --  Every program depends on this package, that must then be checked,
-   --  especially when -f and -a are used.
+   System_Package_Spec_Name : constant String := "system.ads";
+   --  Every program depends on one of these packages: usually the first one,
+   --  or if Supress_Standard_Library is true on the second one. The dependency
+   --  is not always explicit and considering it is important when -f and -a
+   --  are used.
 
type Sigint_Handler is access procedure;
pragma Convention (C, Sigint_Handler);
@@ -2701,39 +2704,43 @@
   begin
  Need_To_Check_Standard_Library := False;
 
+ Name_Len := 0;
+
  if not Targparm.Suppress_Standard_Library_On_Target then
-declare
-   Sfile  : File_Name_Type;
-   Add_It : Boolean := True;
+Add_Str_To_Name_Buffer (Standard_Library_Package_Body_Name);
+ else
+Add_Str_To_Name_Buffer (System_Package_Spec_Name);
+ end if;
 
-begin
-   Name_Len := 0;
-   Add_Str_To_Name_Buffer (Standard_Library_Package_Body_Name);
-   Sfile := Name_Enter;
+ declare
+Sfile  : File_Name_Type;
+Add_It : Boolean := True;
 
-   --  If we have a special runtime, we add the standard
-   --  library only if we can find it.
+ begin
+Sfile := Name_Enter;
 
-   if RTS_Switch then
-  Add_It := Full_Source_Name (Sfile) /= No_File;
-   end if;
+--  If we have a special runtime, we add the standard library only
+--  if we can find it.
 
-   if Add_It then
-  if not Queue.Insert
-   ((Format  => Format_Gnatmake,
- File=> Sfile,
- Unit=> No_Unit_Name,
- Project => No_Project,
- Index   => 0,
- Sid => No_Source))
-  then
- if Is_In_Obsoleted (Sfile) then
-Executable_Obsolete := True;
- end if;
+if RTS_Switch then
+   Add_It := Full_Source_Name (Sfile) /= No_File;
+end if;
+
+if Add_It then
+   if not Queue.Insert
+((Format  => Format_Gnatmake,
+  File=> Sfile,
+  Unit=> No_Unit_Name,
+  Project => No_Project,
+  Index   => 0,
+  Sid => No_Source))
+   then
+  if Is_In_Obsoleted (Sfile) then
+ Executable_Obsolete := True;
   end if;
end if;
-end;
- end if;
+end if;
+ end;
   end Check_Standard_Library;
 
   ---
Index: gnatbind.adb
===
--- gnatbind.adb(revision 237595)
+++ gnatbind.adb(working copy)
@@ -6,7 +6,7 @@
 --  --
 -- B o d y  --
 --  --
---  Copyright (C) 1992-2015, Free Software Foundation, Inc. --
+--  Copyright (C) 1992-2016, Free Software Foundation, Inc. --
 --  --
 -- GNAT is free software;  you can  redistribute it  and/or modify it under --
 -- terms of the  GNU General Public License as published  by the Free Soft- --
@@ -89,6 +89,9 @@
--  Table to record the sources in the closure, to avoid duplications. Used
--  only with switch -R.
 
+   procedure Add_Artificial_ALI_File (Name : String);
+   --  Artificially add ALI file Name in the closure.
+
function Gnatbind_Supports_Auto_Init return Boolean;
--  Indicates if automatic initialization of elaboration procedure
--  through the constructor 

[Ada] Handling of all-digits host names

2016-06-20 Thread Arnaud Charlet
In Get_Host_By_Name, do not treat a string consisting of digits only
as an IP address whose lookup should actually be done using
Get_Host_By_Address.
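
As a rough illustration of the check described above, here is a minimal C
sketch of the same logic.  The actual change is in Ada (g-socket.adb); the
function name is_ip_address and the C rendering are assumptions made for
this example and are not part of the patch.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical C rendering of the cursory dotted-quad check: only digits
   and dots are allowed, there must be 1 to 3 dots, and each dot must
   have a digit on both sides.  */
bool
is_ip_address (const char *name)
{
  size_t len = strlen (name);
  unsigned int dots = 0;

  for (size_t j = 0; j < len; j++)
    {
      char c = name[j];

      if (c == '.')
        {
          /* The dot must not be first or last, and must be followed by
             a digit.  A dot preceded by another dot has already been
             rejected when that earlier dot was examined.  */
          if (j == 0 || j + 1 >= len
              || name[j + 1] < '0' || name[j + 1] > '9')
            return false;
          dots++;
        }
      else if (c < '0' || c > '9')
        return false;
    }

  /* An all-digits name has no dots and is therefore not an address.  */
  return dots >= 1 && dots <= 3;
}
```

With this sketch, "192.168.0.1" and "1.2" are accepted, while "12345",
"1..2" and "1.2.3.4.5" are rejected.  Like the Ada code, it is only a
cursory check and does not validate the range of each component.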

Tested on x86_64-pc-linux-gnu, committed on trunk

2016-06-20  Thomas Quinot  

* g-socket.adb (Is_IP_Address): A string consisting of digits only is
not a dotted quad.

Index: g-socket.adb
===
--- g-socket.adb(revision 237595)
+++ g-socket.adb(working copy)
@@ -150,7 +150,7 @@
--  Output an array of inet address components in hex or decimal mode
 
function Is_IP_Address (Name : String) return Boolean;
-   --  Return true when Name is an IP address in standard dot notation
+   --  Return true when Name is an IPv4 address in dotted quad notation
 
procedure Netdb_Lock;
pragma Inline (Netdb_Lock);
@@ -996,7 +996,8 @@
 
function Get_Host_By_Name (Name : String) return Host_Entry_Type is
begin
-  --  Detect IP address name and redirect to Inet_Addr
+  --  If the given name actually is the string representation of
+  --  an IP address, use Get_Host_By_Address instead.
 
   if Is_IP_Address (Name) then
  return Get_Host_By_Address (Inet_Addr (Name));
@@ -1503,16 +1504,37 @@
---
 
function Is_IP_Address (Name : String) return Boolean is
+  Dots : Natural := 0;
begin
+  --  Perform a cursory check for a dotted quad: we must have 1 to 3
+  --  dots, and there must be at least one digit around each.
+
   for J in Name'Range loop
- if Name (J) /= '.'
-   and then Name (J) not in '0' .. '9'
- then
+ if Name (J) = '.' then
+
+--  Check that the dot is not in first or last position, and
+--  that it is followed by a digit. Note that we already know
+--  that it is preceded by a digit, or we would have returned
+--  earlier on.
+
+if J in Name'First + 1 .. Name'Last - 1
+  and then Name (J + 1) in '0' .. '9'
+then
+   Dots := Dots + 1;
+
+else
+
+   --  Definitely not a proper dotted quad
+
+   return False;
+end if;
+
+ elsif Name (J) not in '0' .. '9' then
 return False;
  end if;
   end loop;
 
-  return True;
+  return Dots in 1 .. 3;
end Is_IP_Address;
 
-


Re: [PATCH 2/2][v3] Drop excess size used for run time allocated stack variables.

2016-06-20 Thread Bernd Schmidt

On 06/20/2016 02:19 PM, Dominik Vogt wrote:


>>> +/* PR/50938: Check that alloca () reserves the correct amount of stack space.
>>> + */
>>
>> Same here really, even if it's only a test.
>
> In this case, the line gets too long with "  */" appended.


In that case we wrap before the last word.
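
For illustration, applied to the comment quoted above this would presumably
give the following layout (a sketch only; the exact break point is an
assumption based on "wrap before the last word"):

```c
/* PR/50938: Check that alloca () reserves the correct amount of stack
   space.  */
```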


Bernd



Re: [PATCH 2/2][v3] Drop excess size used for run time allocated stack variables.

2016-06-20 Thread Dominik Vogt
On Wed, Jun 08, 2016 at 01:21:09PM +0200, Bernd Schmidt wrote:
> On 05/25/2016 03:32 PM, Dominik Vogt wrote:
> > * explow.c (round_push): Use known adjustment.
> > (allocate_dynamic_stack_space): Pass known adjustment to round_push.
> >gcc/testsuite/ChangeLog
> >
> 
> I was thinking about whether it would be possible/desirable to
> eliminate the double add entirely, but I couldn't find a way to
> structure the code in a way that seems better than what you have.
> So, ...
> 
> > /* Round the size of a block to be pushed up to the boundary required
> >-   by this machine.  SIZE is the desired size, which need not be constant.  
> >*/
> >+   by this machine.  SIZE is the desired size, which need not be constant.
> >+   ALREADY_ADDED is the number of units that have already been added to SIZE
> >+   for other alignment reasons.
> >+*/
> 
> The */ goes on the last line of the comment.

No problem.

> >+/* PR/50938: Check that alloca () reserves the correct amount of stack 
> >space.
> >+ */
> 
> Same here really, even if it's only a test.

In this case, the line gets too long with "  */" appended.

Ciao

Dominik ^_^  ^_^

-- 

Dominik Vogt
IBM Germany



[avr, 6, 5, 4.9, committed] Backported PR target/71103

2016-06-20 Thread Georg-Johann Lay

Applied backports to:

v6: https://gcc.gnu.org/r237591
v5: https://gcc.gnu.org/r237593
v4.9: https://gcc.gnu.org/r237594

Johann


gcc/
Backport from 2016-06-20 trunk r237589, r236558.

PR target/71103
* config/avr/avr.md (movqi): Handle loading subreg:qi (const,
symbol_ref,label_ref).

gcc/testsuite/
Backport from 2016-06-20 trunk r237589, r236558.

PR target/71103
* gcc.target/avr/pr71103.c: New test.
* gcc.target/avr/torture/pr71103-2.c: New test.

Index: config/avr/avr.md
===
--- config/avr/avr.md	(revision 237589)
+++ config/avr/avr.md	(working copy)
@@ -641,6 +641,21 @@ (define_expand "mov"
 if (avr_mem_flash_p (dest))
   DONE;
 
+if (QImode == mode
+&& SUBREG_P (src)
+&& CONSTANT_ADDRESS_P (SUBREG_REG (src)))
+{
+// store_bitfield may want to store a SYMBOL_REF or CONST in a
+// structure that's represented as PSImode.  As the upper 16 bits
+// of PSImode cannot be expressed as an HImode subreg, the rhs is
+// decomposed into QImode (word_mode) subregs of SYMBOL_REF,
+// CONST or LABEL_REF; cf. PR71103.
+
+rtx const_addr = SUBREG_REG (src);
+operands[1] = src = copy_rtx (src);
+SUBREG_REG (src) = copy_to_mode_reg (GET_MODE (const_addr), const_addr);
+  }
+
 /* One of the operands has to be in a register.  */
 if (!register_operand (dest, mode)
 && !reg_or_0_operand (src, mode))
Index: testsuite/gcc.target/avr/pr71103.c
===
--- testsuite/gcc.target/avr/pr71103.c	(nonexistent)
+++ testsuite/gcc.target/avr/pr71103.c	(working copy)
@@ -0,0 +1,16 @@
+/* { dg-do compile } */
+/* { dg-options "-O1" } */
+
+struct ResponseStruct{
+unsigned char responseLength;
+char *response;
+};
+
+static char response[5];
+struct ResponseStruct something(){
+struct ResponseStruct returnValue;
+returnValue.responseLength = 5;
+returnValue.response = response;
+return returnValue;
+}
+
Index: testsuite/gcc.target/avr/torture/pr71103-2.c
===
--- testsuite/gcc.target/avr/torture/pr71103-2.c	(nonexistent)
+++ testsuite/gcc.target/avr/torture/pr71103-2.c	(working copy)
@@ -0,0 +1,118 @@
+/* Use -g0 so that this test case doesn't just fail because
+   of PR52472.  */
+
+/* { dg-do compile } */
+/* { dg-options "-std=gnu99 -g0" } */
+
+struct S12
+{
+  char c;
+  const char *p;
+};
+
+struct S12f
+{
+  char c;
+  struct S12f (*f)(void);
+};
+
+struct S12labl
+{
+  char c;
+  void **labl;
+};
+
+struct S121
+{
+  char c;
+  const char *p;
+  char d;
+};
+
+const char str[5] = "abcd";
+
+struct S12 test_S12_0 (void)
+{
+  struct S12 s;
+  s.c = 'A';
+  s.p = str;
+  return s;
+}
+
+struct S12 test_S12_4 (void)
+{
+  struct S12 s;
+  s.c = 'A';
+  s.p = str + 4;
+  return s;
+}
+
+struct S12f test_S12f (void)
+{
+  struct S12f s;
+  s.c = 'A';
+  s.f = test_S12f;
+  return s;
+}
+
+struct S121 test_S121 (void)
+{
+  struct S121 s;
+  s.c = 'c';
+  s.p = str + 4;
+  s.d = 'd';
+  return s;
+}
+
+extern void use_S12lab (struct S12labl*);
+
+struct S12labl test_S12lab (void)
+{
+  struct S12labl s;
+labl:;
+  s.c = 'A';
+  s.labl = &&labl;
+  return s;
+}
+
+#ifdef __MEMX
+
+struct S13
+{
+  char c;
+  const __memx char *p;
+};
+
+const __memx char str_x[] = "abcd";
+
+struct S13 test_S13_0 (void)
+{
+  struct S13 s;
+  s.c = 'A';
+  s.p = str_x;
+  return s;
+}
+
+struct S13 test_S13_4a (void)
+{
+  struct S13 s;
+  s.c = 'A';
+  s.p = str_x + 4;
+  return s;
+}
+
+#ifdef __FLASH1
+
+const __flash1 char str_1[] = "abcd";
+
+struct S13 test_13_4b (void)
+{
+  struct S13 s;
+  s.c = 'A';
+  s.p = str_1 + 4;
+  return s;
+}
+
+#endif /* have __flash1 */
+#endif /* have __memx */
+


[Patch, Fortran] PR71194 - Fix ICE with pointer assignment

2016-06-20 Thread Tobias Burnus
Dear all,

the issue occurs only if the RHS of a pointer assignment is a function, and
the ICE is only triggered when rank remapping is needed.

For expr2, gfc_conv_expr_descriptor calls gfc_conv_procedure_call, which
sets "se.expr" to NULL_TREE, and the code later tries to access it.

The code correctly sets rse.expr to "tmp", but that does not help, as all
actions were wrongly done on lse before. Solution: stuff the RHS expr2 into
rse, not into lse.

Built and regtested* on x86-64-gnu-linux.
OK for the trunk?

Tobias

(* gfortran.dg/graphite/pr68279.f90 fails but is a known PR,
gfortran.dg/vect/vect-8.f90 fails but not only for me, and
gfortran.dg/guality/pr41558.f90 never worked on that system)
	PR fortran/71194
	* trans-expr.c (gfc_trans_pointer_assignment): Correctly handle
	RHS pointer functions.

	PR fortran/71194
	* gfortran.dg/pointer_remapping_10.f90 | 46 ++

diff --git a/gcc/fortran/trans-expr.c b/gcc/fortran/trans-expr.c
index 8f84712..b5731aa 100644
--- a/gcc/fortran/trans-expr.c
+++ b/gcc/fortran/trans-expr.c
@@ -7934,11 +7934,11 @@ gfc_trans_pointer_assignment (gfc_expr * expr1, gfc_expr * expr2)
 	   bound, bound, 0,
 	   GFC_ARRAY_POINTER_CONT, false);
 	  tmp = gfc_create_var (tmp, "ptrtemp");
-	  lse.descriptor_only = 0;
-	  lse.expr = tmp;
-	  lse.direct_byref = 1;
-	  gfc_conv_expr_descriptor (&lse, expr2);
-	  strlen_rhs = lse.string_length;
+	  rse.descriptor_only = 0;
+	  rse.expr = tmp;
+	  rse.direct_byref = 1;
+	  gfc_conv_expr_descriptor (&rse, expr2);
+	  strlen_rhs = rse.string_length;
 	  rse.expr = tmp;
 	}
 	  else
diff --git a/gcc/testsuite/gfortran.dg/pointer_remapping_10.f90 b/gcc/testsuite/gfortran.dg/pointer_remapping_10.f90
new file mode 100644
index 000..4810506
--- /dev/null
+++ b/gcc/testsuite/gfortran.dg/pointer_remapping_10.f90
@@ -0,0 +1,46 @@
+! { dg-do run }
+! { dg-options "-fcheck=all" }
+!
+! PR fortran/71194
+!
+! Contributed by T Kondic
+!
+program ice
+implicit none
+integer, parameter :: pa=10, pb=20
+complex, target :: a(pa*pb)
+real, pointer:: ptr(:,:) =>null()
+integer :: i, j, cnt
+logical :: negative
+
+  do i = 1, size(a)
+a(i) = cmplx(i,-i)
+  end do
+
+  ! Was ICEing before with bounds checks
+  ptr(1:pa*2,1:pb) => conv2real(a)
+
+  negative = .false.
+  cnt = 1
+  do i = 1, ubound(ptr,dim=2)
+do j = 1, ubound(ptr,dim=1)
+  if (negative) then
+if (-cnt /= ptr(j, i)) call abort()
+cnt = cnt + 1
+negative = .false.
+  else
+if (cnt /= ptr(j, i)) call abort()
+negative = .true.
+  end if
+end do 
+  end do
+
+contains
+  function conv2real(carr)
+use, intrinsic :: iso_c_binding
+! returns real pointer to a complex array
+complex, contiguous, intent(inout), target  :: carr(:)
+real,contiguous,pointer :: conv2real(:)
+call c_f_pointer(c_loc(carr),conv2real,[size(carr)*2])
+  end function conv2real
+end program


Re: i386/prologues: ROP mitigation for normal function epilogues

2016-06-20 Thread Michael Matz
Hi,

On Fri, 17 Jun 2016, Bernd Schmidt wrote:

> > The "0xe9 " essentially is the leave+return opcode, 
> > after all it jumps to them (let's ignore the possibility that the jump 
> > target address might contain a 0xc3 byte).  So if the attacker finds 
> > some interesting gadget in  I don't see how the change 
> > from leave+ret to jump-to-leave+ret changes anything from a threat 
> > avoidance perspective.  It's fully possible that I don't understand 
> > the threat vector of ROP correctly, in which case I'd also like to 
> > know :)
> 
> The advantage is that this way the attack can't skip the leave opcode by 
> jumping into the "random bytes1" in your first sequence. Hence, we 
> ensure the return path will always overwrite esp first, which is what's 
> supposed to make the attack harder since now you need to control ebp as 
> well.

Okay, thanks.  So it's really the wish for an inseparable leave+ret 
leading to this; that sort of makes sense I guess.


Ciao,
Michael.
P.S: Though I do feel these ROP countermeasures are not much more than
security by obscurity; I guess enough obscurity can indeed at least lead
to harder-to-exploit programs.


[PATCH] PR target/71549: Convert V1TImode register to TImode in debug insn

2016-06-20 Thread H.J. Lu
A TImode register referenced in a debug insn can be converted to V1TImode
by the scalar-to-vector optimization.  We need to convert a debug insn if
it has a variable in a TImode register.

Tested on x86-64.  OK for trunk?


H.J.

gcc/

PR target/71549
* config/i386/i386.c (timode_scalar_to_vector_candidate_p):
Return true if debug insn has a variable in TImode register.
(timode_remove_non_convertible_regs): Skip debug insn.
(scalar_chain::convert_insn): Change return type to bool.
(scalar_chain::add_insn): Don't check registers in debug insn.
(dimode_scalar_chain::convert_insn): Change return type to bool
and always return true.
(timode_scalar_chain::convert_insn): Change return type to bool.
Convert V1TImode register to SUBREG TImode in debug insn.  Return
false if debug insn isn't converted.  Otherwise, return true.
(scalar_chain::convert): Increment converted_insns only if
convert_insn returns true.

gcc/testsuite/

PR target/71549
* gcc.target/i386/pr71549.c: New test.
---
 gcc/config/i386/i386.c  | 58 -
 gcc/testsuite/gcc.target/i386/pr71549.c | 24 ++
 2 files changed, 74 insertions(+), 8 deletions(-)
 create mode 100644 gcc/testsuite/gcc.target/i386/pr71549.c

diff --git a/gcc/config/i386/i386.c b/gcc/config/i386/i386.c
index 56a5b9c..e17fc53 100644
--- a/gcc/config/i386/i386.c
+++ b/gcc/config/i386/i386.c
@@ -2845,6 +2845,16 @@ dimode_scalar_to_vector_candidate_p (rtx_insn *insn)
 static bool
 timode_scalar_to_vector_candidate_p (rtx_insn *insn)
 {
+  if (DEBUG_INSN_P (insn))
+{
+  /* If a variable is put in a TImode register, which may be
+converted to V1TImode, we need to convert this debug insn.  */
+  rtx val = PATTERN (insn);
+  return (GET_MODE (val) == TImode
+ && GET_CODE (val) == VAR_LOCATION
+ && REG_P (PAT_VAR_LOCATION_LOC (val)));
+}
+
   rtx def_set = single_set (insn);
 
   if (!def_set)
@@ -3012,7 +3022,12 @@ timode_remove_non_convertible_regs (bitmap candidates)
 
   EXECUTE_IF_SET_IN_BITMAP (candidates, 0, id, bi)
 {
-  rtx def_set = single_set (DF_INSN_UID_GET (id)->insn);
+  rtx_insn *insn = DF_INSN_UID_GET (id)->insn;
+  /* Debug insn isn't a SET insn.  */
+  if (DEBUG_INSN_P (insn))
+   continue;
+
+  rtx def_set = single_set (insn);
   rtx dest = SET_DEST (def_set);
   rtx src = SET_SRC (def_set);
 
@@ -3111,7 +3126,7 @@ class scalar_chain
   void add_insn (bitmap candidates, unsigned insn_uid);
   void analyze_register_chain (bitmap candidates, df_ref ref);
   virtual void mark_dual_mode_def (df_ref def) = 0;
-  virtual void convert_insn (rtx_insn *insn) = 0;
+  virtual bool convert_insn (rtx_insn *insn) = 0;
   virtual void convert_registers () = 0;
 };
 
@@ -3123,7 +3138,7 @@ class dimode_scalar_chain : public scalar_chain
   void mark_dual_mode_def (df_ref def);
   rtx replace_with_subreg (rtx x, rtx reg, rtx subreg);
   void replace_with_subreg_in_insn (rtx_insn *insn, rtx reg, rtx subreg);
-  void convert_insn (rtx_insn *insn);
+  bool convert_insn (rtx_insn *insn);
   void convert_op (rtx *op, rtx_insn *insn);
   void convert_reg (unsigned regno);
   void make_vector_copies (unsigned regno);
@@ -3139,7 +3154,7 @@ class timode_scalar_chain : public scalar_chain
 
  private:
   void mark_dual_mode_def (df_ref def);
-  void convert_insn (rtx_insn *insn);
+  bool convert_insn (rtx_insn *insn);
   /* We don't convert registers to difference size.  */
   void convert_registers () {}
 };
@@ -3276,6 +3291,10 @@ scalar_chain::add_insn (bitmap candidates, unsigned int 
insn_uid)
   bitmap_set_bit (insns, insn_uid);
 
   rtx_insn *insn = DF_INSN_UID_GET (insn_uid)->insn;
+  /* Debug insn isn't a SET insn.  */
+  if (DEBUG_INSN_P (insn))
+return;
+
   rtx def_set = single_set (insn);
   if (def_set && REG_P (SET_DEST (def_set))
   && !HARD_REGISTER_P (SET_DEST (def_set)))
@@ -3708,7 +3727,7 @@ dimode_scalar_chain::convert_op (rtx *op, rtx_insn *insn)
 
 /* Convert INSN to vector mode.  */
 
-void
+bool
 dimode_scalar_chain::convert_insn (rtx_insn *insn)
 {
   rtx def_set = single_set (insn);
@@ -3788,13 +3807,34 @@ dimode_scalar_chain::convert_insn (rtx_insn *insn)
   INSN_CODE (insn) = -1;
   recog_memoized (insn);
   df_insn_rescan (insn);
+
+  return true;
 }
 
 /* Convert INSN from TImode to V1T1mode.  */
 
-void
+bool
 timode_scalar_chain::convert_insn (rtx_insn *insn)
 {
+  if (DEBUG_INSN_P (insn))
+{
+  /* It must be a debug insn with a TImode variable in register.  */
+  rtx val = PATTERN (insn);
+  gcc_assert (GET_MODE (val) == TImode
+ && GET_CODE (val) == VAR_LOCATION);
+  rtx loc = PAT_VAR_LOCATION_LOC (val);
+  gcc_assert (REG_P (loc));
+  /* Convert V1TImode register, which has been updated by a SET
+insn before, to SUBREG TImode.  */
+  if 

Re: [PATCH v2, PING 1] Allocate constant size dynamic stack space in the prologue

2016-06-20 Thread Dominik Vogt
On Fri, May 06, 2016 at 10:44:15AM +0100, Dominik Vogt wrote:
> > diff --git a/gcc/cfgexpand.c b/gcc/cfgexpand.c
> > index 21f21c9..4d48afd 100644
> > --- a/gcc/cfgexpand.c
> > +++ b/gcc/cfgexpand.c
> ...
> > @@ -1099,8 +1101,10 @@ expand_stack_vars (bool (*pred) (size_t), struct 
> > stack_vars_data *data)
> >  
> >/* If there were any, allocate space.  */
> >if (large_size > 0)
> > -   large_base = allocate_dynamic_stack_space (GEN_INT (large_size), 0,
> > -  large_align, true);
> > +   {
> > + large_allocsize = GEN_INT (large_size);
> > + get_dynamic_stack_size (&large_allocsize, 0, large_align, NULL);
> ...
> 
> See below.
> 
> > @@ -1186,6 +1190,18 @@ expand_stack_vars (bool (*pred) (size_t), struct 
> > stack_vars_data *data)
> >   /* Large alignment is only processed in the last pass.  */
> >   if (pred)
> > continue;
> > +
> > + if (large_allocsize && ! large_allocation_done)
> > +   {
> > + /* Allocate space the virtual stack vars area in the prologue.
> > +  */
> > + HOST_WIDE_INT loffset;
> > +
> > + loffset = alloc_stack_frame_space (INTVAL (large_allocsize),
> > +PREFERRED_STACK_BOUNDARY);
> 
> 1) Should this use PREFERRED_STACK_BOUNDARY or just STACK_BOUNDARY?
> 2) Is this the right place for rounding up, or should 
>it be done above, maybe in get_dynamic_stack_size?
> 
> Not sure whether this is the right 
> 
> > + large_base = get_dynamic_stack_base (loffset, large_align);
> > + large_allocation_done = true;
> > +   }
> >   gcc_assert (large_base != NULL);
> >  
> >   large_alloc += alignb - 1;
> 
> > diff --git a/gcc/testsuite/gcc.dg/stack-layout-dynamic-1.c 
> > b/gcc/testsuite/gcc.dg/stack-layout-dynamic-1.c
> > new file mode 100644
> > index 000..e06a16c
> > --- /dev/null
> > +++ b/gcc/testsuite/gcc.dg/stack-layout-dynamic-1.c
> > @@ -0,0 +1,14 @@
> > +/* Verify that run time aligned local variables are allocated in the 
> > prologue
> > +   in one pass together with normal local variables.  */
> > +/* { dg-do compile } */
> > +/* { dg-options "-O0" } */
> > +
> > +extern void bar (void *, void *, void *);
> > +void foo (void)
> > +{
> > +  int i;
> > +  __attribute__ ((aligned(65536))) char runtime_aligned_1[512];
> > +  __attribute__ ((aligned(32768))) char runtime_aligned_2[1024];
> > +  bar (&i, &runtime_aligned_1, &runtime_aligned_2);
> > +}
> > +/* { dg-final { scan-assembler-times "cfi_def_cfa_offset" 2 { target { 
> > s390*-*-* } } } } */
> 
> I've no idea how to test this on other targets, or how to express
> the test in a target independent way.  The scan-assembler-times
> does not work on x86_64.
> 
> Ciao
> 
> Dominik ^_^  ^_^
> 
> -- 
> 
> Dominik Vogt
> IBM Germany
> 
> 


Ciao

Dominik ^_^  ^_^

-- 

Dominik Vogt
IBM Germany



Re: [Patch AArch64] Add some more missing intrinsics

2016-06-20 Thread James Greenhalgh
On Mon, Jun 13, 2016 at 05:31:40PM +0100, James Greenhalgh wrote:
> 
> Hi,
> 
> Inspired by Jiong's recent work, here are some more missing intrinsics,
> and a smoke test for each of them.
> 
> This patch covers:
> 
>   vcvt_n_f64_s64
>   vcvt_n_f64_u64
>   vcvt_n_s64_f64
>   vcvt_n_u64_f64
>   vcvt_f64_s64
>   vrecpe_f64
>   vcvt_f64_u64
>   vrecps_f64
> 
> Tested on aarch64-none-elf, and on an internal testsuite for Neon
> intrinsics.
> 
> Note that the new tests will ICE without the fixups in
> https://gcc.gnu.org/ml/gcc-patches/2016-06/msg00805.html
> 
> OK?

*ping*

https://gcc.gnu.org/ml/gcc-patches/2016-06/msg00977.html

Thanks,
James

> gcc/ChangeLog
> 
> 2016-06-10  James Greenhalgh  
> 
>   * config/aarch64/arm_neon.h (vcvt_n_f64_s64): New.
>   (vcvt_n_f64_u64): Likewise.
>   (vcvt_n_s64_f64): Likewise.
>   (vcvt_n_u64_f64): Likewise.
>   (vcvt_f64_s64): Likewise.
>   (vrecpe_f64): Likewise.
>   (vcvt_f64_u64): Likewise.
>   (vrecps_f64): Likewise.
> 
> gcc/testsuite/ChangeLog
> 
> 2016-06-10  James Greenhalgh  
> 
>   * gcc.target/aarch64/vcvt_f64_1.c: New.
>   * gcc.target/aarch64/vcvt_n_f64_1.c: New.
>   * gcc.target/aarch64/vrecp_f64_1.c: New.




Re: [Patch AArch64] Fixup to fcvt patterns added in r237200

2016-06-20 Thread James Greenhalgh
On Fri, Jun 10, 2016 at 01:29:39PM +0100, James Greenhalgh wrote:
> 
> Hi,
> 
> My autotester picked up some issues with the vcvt{ds}_n_* intrinsics
> added in r237200.
> 
> The iterators in this pattern do not resolve, as they have not been
> explicitly tied to the mode iterator (rather than the code iterator)
> used by the pattern.
> 
> This fixup adds the attribute tags, allowing the patterns to work
> correctly.
> 
> Additionally, the types assigned to these instructions were wrong, and
> would permit the immediate operand to be in a register. This will then
> develop into an ICE as the patterns require an immediate operand, and so
> won't match. The ICE can be exposed by writing a wrapping function around
> the vcvtd_n_* intrinsics, which forces the immediate operand to a register.
> We have the infrastructure to error to the user rather than ICEing, but it
> needs some different types, which this patch adds.
> 
> I've checked this with an aarch64-none-elf test run, and run it through
> several rounds of my autotester for aarch64-none-elf and
> aarch64_be-none-elf.
> 
> OK?

*ping*

https://gcc.gnu.org/ml/gcc-patches/2016-06/msg00805.html

Thanks,
James

> ---
> 2016-06-10  James Greenhalgh  
> 
>   * config/aarch64/aarch64.md
>   (3): Add attributes to
>   iterators.
>   (3): Likewise.  Correct
>   attributes.
>   * config/aarch64/aarch64-builtins.c
>   (aarch64_types_binop_uss_qualifiers): Delete.
>   (TYPES_BINOP_USS): Likewise.
>   (aarch64_types_binop_sus_qualifiers): Likewise.
>   (TYPES_BINOP_SUS): Likewise.
>   (aarch64_types_fcvt_from_unsigned_qualifiers): New.
>   (TYPES_FCVTIMM_SUS): Likewise.
>   * config/aarch64/aarch64-simd-builtins.def (scvtf): Use SHIFTIMM
>   rather than BINOP.
>   (ucvtf): Use FCVTIMM_SUS rather than BINOP_SUS.
>   (fcvtzs): Use SHIFTIMM rather than BINOP.
>   (fcvtzu): Use SHIFTIMM_USS rather than BINOP_USS.
> 



Re: [AArch64] Give some new costs for Cortex-A57 floating-point operations

2016-06-20 Thread James Greenhalgh
On Fri, Jun 10, 2016 at 09:29:46AM +0100, James Greenhalgh wrote:
> On Fri, Jun 03, 2016 at 09:35:50AM +0100, James Greenhalgh wrote:
> > 
> > Hi,
> > 
> > This patch rebases the floating-point cost table for Cortex-A57 to be
> > relative to the cost of a floating-point move. This is in response to this
> > feedback from Richard Sandiford [2] on Ramana's patch to calls.c [1] from
> > 2014:
> > 
> >   I think this is really a bug in the backend.  The backend is assigning a
> >   cost of COSTS_N_INSNS (3) to a floating-point constant not because the
> >   constant itself is expensive -- it's actually as cheap as a register
> >   in this context -- but because the backend considers floating-point
> >   moves to be 3 times more expensive than cheap integer moves.
> > 
> > The argument is that a move in mode X should be treated with cost
> > COSTS_N_INSNS (1), and other instructions should have a cost relative to
> > that move. For example, in this patch we say that instructions building a
> > floating-point constant are the same cost as a floating-point register to
> > register move. Fixing this fixes the issue Ramana was seeing, in a way
> > consistent with what other back-ends do.
> > 
> > This patch gives a small improvement to Spec2000FP on a Cortex-A57
> > platform.
> > 
> > Bootstrapped on aarch64-none-linux-gnu with no issues.
> > 
> > OK?
> 
> *ping*

*ping^*

https://gcc.gnu.org/ml/gcc-patches/2016-06/msg00251.html

Thanks,
James

> > 2016-06-03  James Greenhalgh  
> > 
> > * config/arm/aarch-cost-tables.h (cortexa57_extra_costs): Make FP
> > costs relative to the cost of a register move.
> > 
> 



Re: [wwwdocs] Describe behavior of -flifetime-dse in class constructors

2016-06-20 Thread Gerald Pfeifer
On Mon, 20 Jun 2016, Gerald Pfeifer wrote:
> I know a short version of this was applied, but am wondering
> whether to retain the example (and a note on -flifetime-dse=1),
> both per the discussion in February?
> 
> Want to make those enhancements?

And here is one small change I just applied...

Gerald

Index: changes.html
===
RCS file: /cvs/gcc/wwwdocs/htdocs/gcc-6/changes.html,v
retrieving revision 1.85
diff -u -r1.85 changes.html
--- changes.html8 Jun 2016 15:00:53 -   1.85
+++ changes.html20 Jun 2016 10:33:35 -
@@ -283,7 +283,7 @@
 -fconcepts.
 -flifetime-dse is more
 aggressive in dead-store elimination in situations where
-a memory store to a location precedes a constructor to the
+a memory store to a location precedes a constructor to that
 memory location.
 G++ now supports
 https://gcc.gnu.org/projects/cxx-status.html#cxx1z.html;>C++17


Re: [wwwdocs] Describe behavior of -flifetime-dse in class constructors

2016-06-20 Thread Gerald Pfeifer
Hi Martin,

On Wed, 17 Feb 2016, Martin Liška wrote:
> On 02/17/2016 03:23 PM, Jakub Jelinek wrote:
>> "has been" looks weird.  I'd say that the C++ compiler is now more
>> aggressive...
> Sending v3.

I know a short version of this was applied, but am wondering
whether to retain the example (and a note on -flifetime-dse=1),
both per the discussion in February?

Want to make those enhancements?

Gerald

Index: htdocs/gcc-6/porting_to.html
===
RCS file: /cvs/gcc/wwwdocs/htdocs/gcc-6/porting_to.html,v
retrieving revision 1.14
diff --unified -r1.14 porting_to.html
--- htdocs/gcc-6/porting_to.html14 Feb 2016 13:13:43 -  1.14
+++ htdocs/gcc-6/porting_to.html17 Feb 2016 15:00:35 -
@@ -324,6 +324,52 @@
 -fabi-version or -Wabi option to disable or warn about.
 
 
+More aggressive optimization of -flifetime-dse
+
+
+The C++ compiler (with enabled -flifetime-dse)
+is more aggressive in dead-store elimination in situations where
+a memory store to a location precedes a constructor to the
+memory location. Described situation can be commonly found in programs
+which zero a memory that is eventually passed to a placement new operator:
+
+
+#include <stdlib.h>
+#include <string.h>
+#include <assert.h>
+
+struct A
+{
+  A () {}
+  void *operator new (size_t s)
+  {
+void *ptr = malloc (s);
+memset (ptr, 0, s);
+return ptr;
+  }
+
+  int value;
+};
+
+A *
+__attribute__ ((noinline))
+build (void)
+{
+  return new A ();
+}
+
+int main()
+{
+  A *a =  build ();
+  assert (a->value == 0); /* Use of uninitialized value */
+  free (a);
+}
+
+
+If the program cannot be fixed to remove the undefined behavior then
+the option -fno-lifetime-dse can be used to disable
+this optimization.
+
 -Wmisleading-indentation
 
 A new warning -Wmisleading-indentation was added

RE: [PATCH, fortran, v4] Use Levenshtein spelling suggestions in Fortran FE

2016-06-20 Thread VandeVondele Joost
From my point of view, this would be really nice to have.

Joost

Re: [PING] [PATCH] c/69507 - bogus warning: ISO C does not allow ‘__alignof__ (expression)’

2016-06-20 Thread Christophe Lyon
On 4 June 2016 at 23:24, Martin Sebor  wrote:
> Ping: https://gcc.gnu.org/ml/gcc-patches/2016-05/msg02216.html
>
>
> On 05/27/2016 11:34 AM, Martin Sebor wrote:
>>
>> The patch below adjusts the C alignof pedantic warning to avoid
>> diagnosing the GCC extension (__alignof__) and only diagnose
>> _Alignof in C99 and prior modes.  This is consistent with how
>> __attribute__ ((aligned)) and _Alignas is handled (among other
>> extensions vs standard features).
>>
>> Martin
>>
>> PR c/69507 - bogus warning: ISO C does not allow ‘__alignof__
>> (expression)’
>>
>> gcc/testsuite/ChangeLog:
>> 2016-05-27  Martin Sebor  
>>
>>  PR c/69507
>>  * gcc.dg/alignof.c: New test.
>>
>> gcc/c/ChangeLog:
>> 2016-05-27  Martin Sebor  
>>
>>  PR c/69507
>>  * c-parser.c (c_parser_alignof_expression): Avoid diagnosing
>>  __alignof__ (expression).
>>

Hi,

Since this patch was committed, I am now seeing failures on:
gcc.dg/gnu99-const-expr-1.c
gcc.dg/gnu99-static-1.c

(targets arm, aarch64, I don't think that it should matter?)

Can you have a look?

Christophe



>> Index: gcc/c/c-parser.c
>> ===
>> --- gcc/c/c-parser.c(revision 232841)
>> +++ gcc/c/c-parser.c(working copy)
>> @@ -7019,9 +7019,10 @@ c_parser_alignof_expression (c_parser *p
>> mark_exp_read (expr.value);
>> c_inhibit_evaluation_warnings--;
>> in_alignof--;
>> -  pedwarn (start_loc,
>> -   OPT_Wpedantic, "ISO C does not allow %<%E (expression)%>",
>> -   alignof_spelling);
>> +  if (is_c11_alignof)
>> +pedwarn (start_loc,
>> + OPT_Wpedantic, "ISO C does not allow %<%E (expression)%>",
>> + alignof_spelling);
>> ret.value = c_alignof_expr (start_loc, expr.value);
>> ret.original_code = ERROR_MARK;
>> ret.original_type = NULL;
>> Index: gcc/testsuite/gcc.dg/alignof.c
>> ===
>> --- gcc/testsuite/gcc.dg/alignof.c(revision 0)
>> +++ gcc/testsuite/gcc.dg/alignof.c(working copy)
>> @@ -0,0 +1,11 @@
>> +/* PR c/69507 - bogus warning: ISO C does not allow '__alignof__
>> (expression)'
>> + */
>> +/* { dg-do compile } */
>> +/* { dg-options "-std=c11 -Wno-error -Wpedantic" } */
>> +
>> +extern int e;
>> +
>> +int a[] = {
>> +__alignof__ (e),
>> +_Alignof (e)   /* { dg-warning "ISO C does not allow ._Alignof
>> \\(expression\\)." } */
>> +};
>
>


[PATCH 6/6] loop-iv.c: make cond_list a vec

2016-06-20 Thread tbsaunde+gcc
From: Trevor Saunders 

gcc/ChangeLog:

2016-06-20  Trevor Saunders  

* loop-iv.c (simplify_using_initial_values): Make cond_list a vector.
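
The converted loops in the diff below walk the vector from the last element
down to index 0 with an unsigned index, relying on wraparound to terminate,
so that removing element i does not disturb the elements still to be
visited.  Since the old list was built by prepending, walking the
push-ordered vector backwards also appears to keep the original
newest-first visiting order.  A minimal standalone sketch of that idiom in
C follows; the plain int array, the helper names and the "remove even
values" predicate are assumptions for illustration and do not appear in
loop-iv.c.

```c
#include <stddef.h>
#include <string.h>

/* Remove the element at index I from V (LEN elements), keeping the
   remaining elements in order.  Returns the new length.  */
size_t
ordered_remove (int *v, size_t len, size_t i)
{
  memmove (&v[i], &v[i + 1], (len - i - 1) * sizeof v[0]);
  return len - 1;
}

/* Remove all even values from V, iterating from the back.  Returns the
   new length.  */
size_t
remove_even (int *v, size_t len)
{
  size_t orig_len = len;

  /* Decrementing I past 0 wraps it to SIZE_MAX, which is not less than
     ORIG_LEN, so the loop ends.  If ORIG_LEN is 0, I starts at SIZE_MAX
     and the body never runs.  */
  for (size_t i = orig_len - 1; i < orig_len; i--)
    if (v[i] % 2 == 0)
      len = ordered_remove (v, len, i);

  return len;
}
```

Iterating backwards means a removal only shifts elements that have already
been examined, so no index adjustment is needed after a removal.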
---
 gcc/loop-iv.c | 55 ++-
 1 file changed, 18 insertions(+), 37 deletions(-)

diff --git a/gcc/loop-iv.c b/gcc/loop-iv.c
index 57fb8c1..21c3180 100644
--- a/gcc/loop-iv.c
+++ b/gcc/loop-iv.c
@@ -1860,7 +1860,6 @@ simplify_using_initial_values (struct loop *loop, enum 
rtx_code op, rtx *expr)
 {
   bool expression_valid;
   rtx head, tail, last_valid_expr;
-  rtx_expr_list *cond_list;
   rtx_insn *insn;
   rtx neutral, aggr;
   regset altered, this_altered;
@@ -1936,7 +1935,7 @@ simplify_using_initial_values (struct loop *loop, enum 
rtx_code op, rtx *expr)
 
   expression_valid = true;
   last_valid_expr = *expr;
-  cond_list = NULL;
+  auto_vec<rtx> cond_list;
   while (1)
 {
   insn = BB_END (e->src);
@@ -1952,17 +1951,18 @@ simplify_using_initial_values (struct loop *loop, enum 
rtx_code op, rtx *expr)
  simplify_using_condition (cond, expr, altered);
  if (old != *expr)
{
- rtx note;
  if (CONSTANT_P (*expr))
goto out;
- for (note = cond_list; note; note = XEXP (note, 1))
+
+ unsigned int len = cond_list.length ();
+ for (unsigned int i = len - 1; i < len; i--)
{
- simplify_using_condition (XEXP (note, 0), expr, altered);
+ simplify_using_condition (cond_list[i], expr, altered);
  if (CONSTANT_P (*expr))
goto out;
}
}
- cond_list = alloc_EXPR_LIST (0, cond, cond_list);
+ cond_list.safe_push (cond);
}
}
 
@@ -1988,39 +1988,30 @@ simplify_using_initial_values (struct loop *loop, enum 
rtx_code op, rtx *expr)
 
  if (suitable_set_for_replacement (insn, &dest, &src))
{
- rtx_expr_list **pnote, **pnote_next;
-
  replace_in_expr (expr, dest, src);
  if (CONSTANT_P (*expr))
goto out;
 
- for (pnote = &cond_list; *pnote; pnote = pnote_next)
+ unsigned int len = cond_list.length ();
+ for (unsigned int i = len - 1; i < len; i--)
{
- rtx_expr_list *note = *pnote;
- rtx old_cond = XEXP (note, 0);
+ rtx old_cond = cond_list[i];
 
- pnote_next = (rtx_expr_list **) &XEXP (note, 1);
- replace_in_expr (&XEXP (note, 0), dest, src);
+ replace_in_expr (&cond_list[i], dest, src);
 
  /* We can no longer use a condition that has been simplified
 to a constant, and simplify_using_condition will abort if
 we try.  */
- if (CONSTANT_P (XEXP (note, 0)))
-   {
- *pnote = *pnote_next;
- pnote_next = pnote;
- free_EXPR_LIST_node (note);
-   }
+ if (CONSTANT_P (cond_list[i]))
+   cond_list.ordered_remove (i);
  /* Retry simplifications with this condition if either the
 expression or the condition changed.  */
- else if (old_cond != XEXP (note, 0) || old != *expr)
-   simplify_using_condition (XEXP (note, 0), expr, altered);
+ else if (old_cond != cond_list[i] || old != *expr)
+   simplify_using_condition (cond_list[i], expr, altered);
}
}
  else
{
- rtx_expr_list **pnote, **pnote_next;
-
  /* If we did not use this insn to make a replacement, any overlap
 between stores in this insn and our expression will cause the
 expression to become invalid.  */
@@ -2028,19 +2019,10 @@ simplify_using_initial_values (struct loop *loop, enum 
rtx_code op, rtx *expr)
goto out;
 
  /* Likewise for the conditions.  */
- for (pnote = &cond_list; *pnote; pnote = pnote_next)
-   {
- rtx_expr_list *note = *pnote;
- rtx old_cond = XEXP (note, 0);
-
- pnote_next = (rtx_expr_list **) &XEXP (note, 1);
- if (altered_reg_used (old_cond, this_altered))
-   {
- *pnote = *pnote_next;
- pnote_next = pnote;
- free_EXPR_LIST_node (note);
-   }
-   }
+ unsigned int len = cond_list.length ();
+ for (unsigned int i = len - 1; i < len; i--)
+   if (altered_reg_used (cond_list[i], this_altered))
+ cond_list.ordered_remove (i);
}
 
  if 

[PATCH 1/6] make antic_stores a vec

2016-06-20 Thread tbsaunde+gcc
From: Trevor Saunders 

gcc/ChangeLog:

2016-06-20  Trevor Saunders  

* store-motion.c (struct st_expr): Make antic_stores a vector.
(st_expr_entry): Adjust.
(free_st_expr_entry): Likewise.
(print_store_motion_mems): Likewise.
(find_moveable_store): Likewise.
(compute_store_table): Likewise.
(remove_reachable_equiv_notes): Likewise.
(replace_store_insn): Likewise.
(build_store_vectors): Likewise.
---
 gcc/store-motion.c | 54 +-
 1 file changed, 25 insertions(+), 29 deletions(-)

diff --git a/gcc/store-motion.c b/gcc/store-motion.c
index 301b69b..6d7d37f 100644
--- a/gcc/store-motion.c
+++ b/gcc/store-motion.c
@@ -46,7 +46,6 @@ along with GCC; see the file COPYING3.  If not see
  a compile time hog that needs a rewrite (maybe cache st_exprs to
  invalidate REG_EQUAL/REG_EQUIV notes for?).
- pattern_regs in st_expr should be a regset (on its own obstack).
-   - antic_stores and avail_stores should be VECs instead of lists.
- store_motion_mems should be a vec instead of a list.
- there should be an alloc pool for struct st_expr objects.
- investigate whether it is helpful to make the address of an st_expr
@@ -66,7 +65,7 @@ struct st_expr
   /* List of registers mentioned by the mem.  */
   rtx pattern_regs;
   /* INSN list of stores that are locally anticipatable.  */
-  rtx_insn_list *antic_stores;
+  vec<rtx_insn *> antic_stores;
   /* INSN list of stores that are locally available.  */
   vec<rtx_insn *> avail_stores;
   /* Next in the list.  */
@@ -148,7 +147,7 @@ st_expr_entry (rtx x)
   ptr->next = store_motion_mems;
   ptr->pattern  = x;
   ptr->pattern_regs = NULL_RTX;
-  ptr->antic_stores = NULL;
+  ptr->antic_stores.create (0);
   ptr->avail_stores.create (0);
   ptr->reaching_reg = NULL_RTX;
   ptr->index= 0;
@@ -164,8 +163,8 @@ st_expr_entry (rtx x)
 static void
 free_st_expr_entry (struct st_expr * ptr)
 {
-  free_INSN_LIST_list (& ptr->antic_stores);
-   ptr->avail_stores.release ();
+  ptr->antic_stores.release ();
+  ptr->avail_stores.release ();
 
   free (ptr);
 }
@@ -233,11 +232,7 @@ print_store_motion_mems (FILE * file)
   print_rtl (file, ptr->pattern);
 
   fprintf (file, "\nANTIC stores : ");
-
-  if (ptr->antic_stores)
-   print_rtl (file, ptr->antic_stores);
-  else
-   fprintf (file, "(nil)");
+  print_rtx_insn_vec (file, ptr->antic_stores);
 
   fprintf (file, "\nAVAIL stores : ");
 
@@ -566,11 +561,11 @@ find_moveable_store (rtx_insn *insn, int 
*regs_set_before, int *regs_set_after)
   /* Do not check for anticipatability if we either found one anticipatable
  store already, or tested for one and found out that it was killed.  */
   check_anticipatable = 0;
-  if (!ptr->antic_stores)
+  if (ptr->antic_stores.is_empty ())
 check_anticipatable = 1;
   else
 {
-  rtx_insn *tmp = ptr->antic_stores->insn ();
+  rtx_insn *tmp = ptr->antic_stores.last ();
   if (tmp != NULL_RTX
  && BLOCK_FOR_INSN (tmp) != bb)
check_anticipatable = 1;
@@ -582,7 +577,7 @@ find_moveable_store (rtx_insn *insn, int *regs_set_before, 
int *regs_set_after)
tmp = NULL;
   else
tmp = insn;
-  ptr->antic_stores = alloc_INSN_LIST (tmp, ptr->antic_stores);
+  ptr->antic_stores.safe_push (tmp);
 }
 
   /* It is not necessary to check whether store is available if we did
@@ -683,9 +678,9 @@ compute_store_table (void)
   for (ptr = first_st_expr (); ptr != NULL; ptr = next_st_expr (ptr))
{
  LAST_AVAIL_CHECK_FAILURE (ptr) = NULL_RTX;
- if (ptr->antic_stores
- && (tmp = ptr->antic_stores->insn ()) == NULL_RTX)
-   ptr->antic_stores = ptr->antic_stores->next ();
+ if (!ptr->antic_stores.is_empty ()
+ && (tmp = ptr->antic_stores.last ()) == NULL)
+   ptr->antic_stores.pop ();
}
 }
 
@@ -831,7 +826,7 @@ remove_reachable_equiv_notes (basic_block bb, struct 
st_expr *smexpr)
   int sp;
   edge act;
   sbitmap visited = sbitmap_alloc (last_basic_block_for_fn (cfun));
-  rtx last, note;
+  rtx note;
   rtx_insn *insn;
   rtx mem = smexpr->pattern;
 
@@ -866,13 +861,13 @@ remove_reachable_equiv_notes (basic_block bb, struct 
st_expr *smexpr)
}
   bitmap_set_bit (visited, bb->index);
 
+  rtx_insn *last;
   if (bitmap_bit_p (st_antloc[bb->index], smexpr->index))
{
- for (last = smexpr->antic_stores;
-  BLOCK_FOR_INSN (XEXP (last, 0)) != bb;
-  last = XEXP (last, 1))
-   continue;
- last = XEXP (last, 0);
+ unsigned int i;
+ FOR_EACH_VEC_ELT_REVERSE (smexpr->antic_stores, i, last)
+   if (BLOCK_FOR_INSN (last) == bb)
+ break;
}
   else
last = NEXT_INSN (BB_END (bb));
@@ -911,15 +906,17 @@ 

[PATCH 5/6] make pattern_regs a vec

2016-06-20 Thread tbsaunde+gcc
From: Trevor Saunders 

gcc/ChangeLog:

2016-06-20  Trevor Saunders  

* store-motion.c (struct st_expr): Make pattern_regs a vector.
(st_expr_entry): Adjust.
(store_ops_ok): Likewise.
(extract_mentioned_regs): Likewise.
(store_killed_in_insn): Likewise.
(find_moveable_store): Likewise.
---
 gcc/store-motion.c | 49 +++--
 1 file changed, 23 insertions(+), 26 deletions(-)

diff --git a/gcc/store-motion.c b/gcc/store-motion.c
index 6d7d37f..c2ef0d0 100644
--- a/gcc/store-motion.c
+++ b/gcc/store-motion.c
@@ -63,7 +63,7 @@ struct st_expr
   /* Pattern of this mem.  */
   rtx pattern;
   /* List of registers mentioned by the mem.  */
-  rtx pattern_regs;
+  vec<rtx> pattern_regs;
   /* INSN list of stores that are locally anticipatable.  */
   vec<rtx_insn *> antic_stores;
   /* INSN list of stores that are locally available.  */
@@ -146,7 +146,7 @@ st_expr_entry (rtx x)
 
   ptr->next = store_motion_mems;
   ptr->pattern  = x;
-  ptr->pattern_regs = NULL_RTX;
+  ptr->pattern_regs.create (0);
   ptr->antic_stores.create (0);
   ptr->avail_stores.create (0);
   ptr->reaching_reg = NULL_RTX;
@@ -248,16 +248,12 @@ print_store_motion_mems (FILE * file)
due to set of registers in bitmap REGS_SET.  */
 
 static bool
-store_ops_ok (const_rtx x, int *regs_set)
+store_ops_ok (const vec<rtx> &x, int *regs_set)
 {
-  const_rtx reg;
-
-  for (; x; x = XEXP (x, 1))
-{
-  reg = XEXP (x, 0);
-  if (regs_set[REGNO (reg)])
-   return false;
-}
+  unsigned int len = x.length ();
+  for (unsigned int i = 0; i < len; i++)
+if (regs_set[REGNO (x[i])])
+  return false;
 
   return true;
 }
@@ -265,18 +261,16 @@ store_ops_ok (const_rtx x, int *regs_set)
 /* Returns a list of registers mentioned in X.
FIXME: A regset would be prettier and less expensive.  */
 
-static rtx_expr_list *
-extract_mentioned_regs (rtx x)
+static void
+extract_mentioned_regs (rtx x, vec<rtx> *mentioned_regs)
 {
-  rtx_expr_list *mentioned_regs = NULL;
   subrtx_var_iterator::array_type array;
   FOR_EACH_SUBRTX_VAR (iter, array, x, NONCONST)
 {
   rtx x = *iter;
   if (REG_P (x))
-   mentioned_regs = alloc_EXPR_LIST (0, x, mentioned_regs);
+   mentioned_regs->safe_push (x);
 }
-  return mentioned_regs;
 }
 
 /* Check to see if the load X is aliased with STORE_PATTERN.
@@ -373,9 +367,10 @@ store_killed_in_pat (const_rtx x, const_rtx pat, int after)
after the insn.  Return true if it does.  */
 
 static bool
-store_killed_in_insn (const_rtx x, const_rtx x_regs, const rtx_insn *insn, int 
after)
+store_killed_in_insn (const_rtx x, const vec<rtx> &x_regs,
+ const rtx_insn *insn, int after)
 {
-  const_rtx reg, note, pat;
+  const_rtx note, pat;
 
   if (! NONDEBUG_INSN_P (insn))
 return false;
@@ -389,8 +384,9 @@ store_killed_in_insn (const_rtx x, const_rtx x_regs, const 
rtx_insn *insn, int a
 
   /* But even a const call reads its parameters.  Check whether the
 base of some of registers used in mem is stack pointer.  */
-  for (reg = x_regs; reg; reg = XEXP (reg, 1))
-   if (may_be_sp_based_p (XEXP (reg, 0)))
+  unsigned int len = x_regs.length ();
+  for (unsigned int i = 0; i < len; i++)
+   if (may_be_sp_based_p (x_regs[i]))
  return true;
 
   return false;
@@ -435,8 +431,8 @@ store_killed_in_insn (const_rtx x, const_rtx x_regs, const 
rtx_insn *insn, int a
is killed, return the last insn in that it occurs in FAIL_INSN.  */
 
 static bool
-store_killed_after (const_rtx x, const_rtx x_regs, const rtx_insn *insn,
-   const_basic_block bb,
+store_killed_after (const_rtx x, const vec<rtx> &x_regs,
+   const rtx_insn *insn, const_basic_block bb,
int *regs_set_after, rtx *fail_insn)
 {
   rtx_insn *last = BB_END (bb), *act;
@@ -465,8 +461,9 @@ store_killed_after (const_rtx x, const_rtx x_regs, const 
rtx_insn *insn,
within basic block BB. X_REGS is list of registers mentioned in X.
REGS_SET_BEFORE is bitmap of registers set before or in this insn.  */
 static bool
-store_killed_before (const_rtx x, const_rtx x_regs, const rtx_insn *insn,
-const_basic_block bb, int *regs_set_before)
+store_killed_before (const_rtx x, const vec<rtx> &x_regs,
+const rtx_insn *insn, const_basic_block bb,
+int *regs_set_before)
 {
   rtx_insn *first = BB_HEAD (bb);
 
@@ -555,8 +552,8 @@ find_moveable_store (rtx_insn *insn, int *regs_set_before, 
int *regs_set_after)
 return;
 
   ptr = st_expr_entry (dest);
-  if (!ptr->pattern_regs)
-ptr->pattern_regs = extract_mentioned_regs (dest);
+  if (ptr->pattern_regs.is_empty ())
+extract_mentioned_regs (dest, &ptr->pattern_regs);
 
   /* Do not check for anticipatability if we either found one anticipatable
  store already, or tested for one and found out that it was 

[PATCH 4/6] make side_effects a vec

2016-06-20 Thread tbsaunde+gcc
From: Trevor Saunders 

gcc/ChangeLog:

2016-06-20  Trevor Saunders  

* var-tracking.c (struct adjust_mem_data): Make side_effects a vector.
(adjust_mems): Adjust.
(adjust_insn): Likewise.
(prepare_call_arguments): Likewise.
---
 gcc/var-tracking.c | 33 ++---
 1 file changed, 14 insertions(+), 19 deletions(-)

diff --git a/gcc/var-tracking.c b/gcc/var-tracking.c
index 9f09d30..5d09879 100644
--- a/gcc/var-tracking.c
+++ b/gcc/var-tracking.c
@@ -926,7 +926,7 @@ struct adjust_mem_data
   bool store;
   machine_mode mem_mode;
   HOST_WIDE_INT stack_adjust;
-  rtx_expr_list *side_effects;
+  auto_vec<rtx> side_effects;
 };
 
 /* Helper for adjust_mems.  Return true if X is suitable for
@@ -1072,9 +1072,7 @@ adjust_mems (rtx loc, const_rtx old_rtx, void *data)
   amd->store = false;
   tem = simplify_replace_fn_rtx (tem, old_rtx, adjust_mems, data);
   amd->store = store_save;
-  amd->side_effects = alloc_EXPR_LIST (0,
-  gen_rtx_SET (XEXP (loc, 0), tem),
-  amd->side_effects);
+  amd->side_effects.safe_push (gen_rtx_SET (XEXP (loc, 0), tem));
   return addr;
 case PRE_MODIFY:
   addr = XEXP (loc, 1);
@@ -1088,9 +1086,7 @@ adjust_mems (rtx loc, const_rtx old_rtx, void *data)
   tem = simplify_replace_fn_rtx (XEXP (loc, 1), old_rtx,
 adjust_mems, data);
   amd->store = store_save;
-  amd->side_effects = alloc_EXPR_LIST (0,
-  gen_rtx_SET (XEXP (loc, 0), tem),
-  amd->side_effects);
+  amd->side_effects.safe_push (gen_rtx_SET (XEXP (loc, 0), tem));
   return addr;
 case SUBREG:
   /* First try without delegitimization of whole MEMs and
@@ -1184,7 +1180,6 @@ adjust_mem_stores (rtx loc, const_rtx expr, void *data)
 static void
 adjust_insn (basic_block bb, rtx_insn *insn)
 {
-  struct adjust_mem_data amd;
   rtx set;
 
 #ifdef HAVE_window_save
@@ -1213,9 +1208,9 @@ adjust_insn (basic_block bb, rtx_insn *insn)
 }
 #endif
 
+  adjust_mem_data amd;
   amd.mem_mode = VOIDmode;
   amd.stack_adjust = -VTI (bb)->out.stack_adjust;
-  amd.side_effects = NULL;
 
   amd.store = true;
   note_stores (PATTERN (insn), adjust_mem_stores, &amd);
@@ -1281,10 +1276,10 @@ adjust_insn (basic_block bb, rtx_insn *insn)
validate_change (NULL_RTX, &SET_SRC (set), XEXP (note, 0), true);
 }
 
-  if (amd.side_effects)
+  if (!amd.side_effects.is_empty ())
 {
-  rtx *pat, new_pat, s;
-  int i, oldn, newn;
+  rtx *pat, new_pat;
+  int i, oldn;
 
   pat = &PATTERN (insn);
   if (GET_CODE (*pat) == COND_EXEC)
@@ -1293,17 +1288,18 @@ adjust_insn (basic_block bb, rtx_insn *insn)
oldn = XVECLEN (*pat, 0);
   else
oldn = 1;
-  for (s = amd.side_effects, newn = 0; s; newn++)
-   s = XEXP (s, 1);
+  unsigned int newn = amd.side_effects.length ();
   new_pat = gen_rtx_PARALLEL (VOIDmode, rtvec_alloc (oldn + newn));
   if (GET_CODE (*pat) == PARALLEL)
for (i = 0; i < oldn; i++)
  XVECEXP (new_pat, 0, i) = XVECEXP (*pat, 0, i);
   else
XVECEXP (new_pat, 0, 0) = *pat;
-  for (s = amd.side_effects, i = oldn; i < oldn + newn; i++, s = XEXP (s, 
1))
-   XVECEXP (new_pat, 0, i) = XEXP (s, 0);
-  free_EXPR_LIST_list (&amd.side_effects);
+
+  rtx effect;
+  unsigned int j;
+  FOR_EACH_VEC_ELT_REVERSE (amd.side_effects, j, effect)
+   XVECEXP (new_pat, 0, j + oldn) = effect;
   validate_change (NULL_RTX, pat, new_pat, true);
 }
 }
@@ -6335,11 +6331,10 @@ prepare_call_arguments (basic_block bb, rtx_insn *insn)
struct adjust_mem_data amd;
amd.mem_mode = VOIDmode;
amd.stack_adjust = -VTI (bb)->out.stack_adjust;
-   amd.side_effects = NULL;
amd.store = true;
mem = simplify_replace_fn_rtx (mem, NULL_RTX, adjust_mems,
   &amd);
-   gcc_assert (amd.side_effects == NULL_RTX);
+   gcc_assert (amd.side_effects.is_empty ());
  }
val = cselib_lookup (mem, GET_MODE (mem), 0, VOIDmode);
if (val && cselib_preserved_value_p (val))
-- 
2.7.4



[PATCH 0/6] remove some usage of rtx_{insn,expr}_list

2016-06-20 Thread tbsaunde+gcc
From: Trevor Saunders 

Hi,

These few patches to get rid of rtx insn and expr lists should be pretty
uncontroversial.  In each case the list is clearly used as a stack, and using a
vec as a stack is clearly equivalent.
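
To illustrate the stack claim, here is a minimal sketch.  It is not code from
the patches: std::vector stands in for GCC's vec, and Node, list_order and
vec_reverse_order are hypothetical names.  It shows that walking a prepend-only
list from its head visits elements in the same order as walking a push-back
vector from its end, using the unsigned reverse loop form that the loop-iv.c
patch uses.

```cpp
#include <vector>

// Hypothetical stand-in for an EXPR_LIST node: new entries are prepended.
struct Node
{
  int value;
  Node *next;
};

// Build a prepend-only list and walk it from the head (newest first).
std::vector<int>
list_order (const std::vector<int> &inputs)
{
  Node *head = nullptr;
  for (int v : inputs)
    head = new Node{v, head};

  std::vector<int> visited;
  for (Node *n = head; n; n = n->next)
    visited.push_back (n->value);

  // Free the nodes again; the list needs one allocation per element.
  while (head)
    {
      Node *next = head->next;
      delete head;
      head = next;
    }
  return visited;
}

// Push onto a vector and walk it from the end (newest first).
// When i is 0, i-- wraps around to the largest unsigned value, so
// "i < len" becomes false and the loop stops.  For an empty vector
// len - 1 wraps in the same way and the body never runs.
std::vector<int>
vec_reverse_order (const std::vector<int> &inputs)
{
  std::vector<int> stack;
  for (int v : inputs)
    stack.push_back (v);

  std::vector<int> visited;
  unsigned int len = static_cast<unsigned int> (stack.size ());
  for (unsigned int i = len - 1; i < len; i--)
    visited.push_back (stack[i]);
  return visited;
}
```

For the inputs 1, 2, 3 both functions visit 3, 2, 1, and the reverse loop does
nothing on an empty vector.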

In theory I would expect this to help performance, if anything, since it isn't
necessary to malloc every time a node is added; however, the data is less clear.
Here are times at -O2 and -O0 for fold-const.ii from gcc, and
Unified_js_src_cpp_32.ii from SpiderMonkey.  These are the best of 3 runs, based
on user time.

fold const O2 new
real    0m5.034s
user    0m3.408s
sys     0m0.364s

fold const O2 old
real    0m4.012s
user    0m3.420s
sys     0m0.340s

fold const O0 new
real    0m1.637s
user    0m1.124s
sys     0m0.280s

fold const O0 old
real    0m2.483s
user    0m1.092s
sys     0m0.280s

mozjs O2 new
real    0m15.565s
user    0m12.420s
sys     0m1.536s

mozjs O2 old
real    0m13.662s
user    0m12.136s
sys     0m1.440s

mozjs O0 new
real    0m9.860s
user    0m6.796s
sys     0m1.368s

mozjs O0 old
real    0m8.922s
user    0m6.888s
sys     0m1.388s

So a couple got about .3s slower, and others got about .1s faster.  I'm not
really sure, but I'm inclined to say any change is too small to measure easily.

Bootstrapped and regtested each patch individually on x86_64-linux-gnu; ok?

Trev

Trevor Saunders (6):
  make antic_stores a vec
  remove unused loads rtx_insn_list
  make stores rtx_insn_list a vec
  make side_effects a vec
  make pattern_regs a vec
  loop-iv.c: make cond_list a vec

 gcc/gcse.c |  36 ++-
 gcc/loop-iv.c  |  55 ++--
 gcc/store-motion.c | 103 +
 gcc/var-tracking.c |  33 -
 4 files changed, 90 insertions(+), 137 deletions(-)

-- 
2.7.4



[PATCH 3/6] make stores rtx_insn_list a vec

2016-06-20 Thread tbsaunde+gcc
From: Trevor Saunders 

gcc/ChangeLog:

2016-06-20  Trevor Saunders  

* gcse.c (struct ls_expr): Make stores field a vector.
(ldst_entry): Adjust.
(free_ldst_entry): Likewise.
(print_ldst_list): Likewise.
(compute_ld_motion_mems): Likewise.
(update_ld_motion_stores): Likewise.
---
 gcc/gcse.c | 22 +-
 1 file changed, 9 insertions(+), 13 deletions(-)

diff --git a/gcc/gcse.c b/gcc/gcse.c
index 127a65a..49534f2 100644
--- a/gcc/gcse.c
+++ b/gcc/gcse.c
@@ -143,6 +143,7 @@ along with GCC; see the file COPYING3.  If not see
 #include "df.h"
 #include "tm_p.h"
 #include "insn-config.h"
+#include "print-rtl.h"
 #include "regs.h"
 #include "ira.h"
 #include "recog.h"
@@ -342,7 +343,7 @@ struct ls_expr
   struct gcse_expr * expr; /* Gcse expression reference for LM.  */
   rtx pattern; /* Pattern of this mem.  */
   rtx pattern_regs;/* List of registers mentioned by the mem.  */
-  rtx_insn_list *stores;   /* INSN list of stores seen.  */
+  vec<rtx_insn *> stores;  /* INSN list of stores seen.  */
   struct ls_expr * next;   /* Next in the list.  */
   int invalid; /* Invalid for some reason.  */
   int index;   /* If it maps to a bitmap index.  */
@@ -3604,7 +3605,7 @@ ldst_entry (rtx x)
   ptr->expr = NULL;
   ptr->pattern  = x;
   ptr->pattern_regs = NULL_RTX;
-  ptr->stores   = NULL;
+  ptr->stores.create (0);
   ptr->reaching_reg = NULL_RTX;
   ptr->invalid  = 0;
   ptr->index= 0;
@@ -3620,7 +3621,7 @@ ldst_entry (rtx x)
 static void
 free_ldst_entry (struct ls_expr * ptr)
 {
-  free_INSN_LIST_list (& ptr->stores);
+  ptr->stores.release ();
 
   free (ptr);
 }
@@ -3661,11 +3662,7 @@ print_ldst_list (FILE * file)
   print_rtl (file, ptr->pattern);
 
   fprintf (file, "\n   Stores : ");
-
-  if (ptr->stores)
-   print_rtl (file, ptr->stores);
-  else
-   fprintf (file, "(nil)");
+  print_rtx_insn_vec (file, ptr->stores);
 
   fprintf (file, "\n\n");
 }
@@ -3822,7 +3819,7 @@ compute_ld_motion_mems (void)
 returns 0 for all REGs.  */
  && can_assign_to_reg_without_clobbers_p (src,
src_mode))
-   ptr->stores = alloc_INSN_LIST (insn, ptr->stores);
+   ptr->stores.safe_push (insn);
  else
ptr->invalid = 1;
}
@@ -3915,11 +3912,10 @@ update_ld_motion_stores (struct gcse_expr * expr)
 where reg is the reaching reg used in the load.  We checked in
 compute_ld_motion_mems that we can replace (set mem expr) with
 (set reg expr) in that insn.  */
-  rtx list = mem_ptr->stores;
-
-  for ( ; list != NULL_RTX; list = XEXP (list, 1))
+  rtx_insn *insn;
+  unsigned int i;
+  FOR_EACH_VEC_ELT_REVERSE (mem_ptr->stores, i, insn)
{
- rtx_insn *insn = as_a <rtx_insn *> (XEXP (list, 0));
  rtx pat = PATTERN (insn);
  rtx src = SET_SRC (pat);
  rtx reg = expr->reaching_reg;
-- 
2.7.4



[PATCH 2/6] remove unused loads rtx_insn_list

2016-06-20 Thread tbsaunde+gcc
From: Trevor Saunders 

gcc/ChangeLog:

2016-06-20  Trevor Saunders  

* gcse.c (struct ls_expr): Remove loads field.
(ldst_entry): Adjust.
(free_ldst_entry): Likewise.
(print_ldst_list): Likewise.
(compute_ld_motion_mems): Likewise.
---
 gcc/gcse.c | 14 +-
 1 file changed, 1 insertion(+), 13 deletions(-)

diff --git a/gcc/gcse.c b/gcc/gcse.c
index a3a7dc3..127a65a 100644
--- a/gcc/gcse.c
+++ b/gcc/gcse.c
@@ -342,7 +342,6 @@ struct ls_expr
   struct gcse_expr * expr; /* Gcse expression reference for LM.  */
   rtx pattern; /* Pattern of this mem.  */
   rtx pattern_regs;/* List of registers mentioned by the mem.  */
-  rtx_insn_list *loads;/* INSN list of loads seen.  */
   rtx_insn_list *stores;   /* INSN list of stores seen.  */
   struct ls_expr * next;   /* Next in the list.  */
   int invalid; /* Invalid for some reason.  */
@@ -3605,7 +3604,6 @@ ldst_entry (rtx x)
   ptr->expr = NULL;
   ptr->pattern  = x;
   ptr->pattern_regs = NULL_RTX;
-  ptr->loads= NULL;
   ptr->stores   = NULL;
   ptr->reaching_reg = NULL_RTX;
   ptr->invalid  = 0;
@@ -3622,7 +3620,6 @@ ldst_entry (rtx x)
 static void
 free_ldst_entry (struct ls_expr * ptr)
 {
-  free_INSN_LIST_list (& ptr->loads);
   free_INSN_LIST_list (& ptr->stores);
 
   free (ptr);
@@ -3663,13 +3660,6 @@ print_ldst_list (FILE * file)
 
   print_rtl (file, ptr->pattern);
 
-  fprintf (file, "\nLoads : ");
-
-  if (ptr->loads)
-   print_rtl (file, ptr->loads);
-  else
-   fprintf (file, "(nil)");
-
   fprintf (file, "\n   Stores : ");
 
   if (ptr->stores)
@@ -3801,9 +3791,7 @@ compute_ld_motion_mems (void)
  if (MEM_P (src) && simple_mem (src))
{
  ptr = ldst_entry (src);
- if (REG_P (dest))
-   ptr->loads = alloc_INSN_LIST (insn, ptr->loads);
- else
+ if (!REG_P (dest))
ptr->invalid = 1;
}
  else
-- 
2.7.4



[PATCH][typo] alignement -> alignment

2016-06-20 Thread Kyrill Tkachov

Hi all,

Committing the attached typo fix as obvious (I believe "alignement" is the 
French form).

Thanks,
Kyrill

2016-06-20  Kyrylo Tkachov  

* params.def (PARAM_ALIGN_LOOP_ITERATIONS): Use "alignment" instead of
"alignement".
* tree.h (TYPE_ALIGN): Likewise.

2016-06-20  Kyrylo Tkachov  

* exp_util.adb (Safe_Unchecked_Type_Conversion): Use "alignment"
instead of "alignement".

2016-06-20  Kyrylo Tkachov  

* gfortran.dg/common_align_2.f90: Use "alignment" instead of
"alignement".
diff --git a/gcc/ada/exp_util.adb b/gcc/ada/exp_util.adb
index fcd16a26cb0126113616534f3e81926d0b1a2a83..bed9ac1641d2fa6144e6c34e790a9bd8c059e60f 100644
--- a/gcc/ada/exp_util.adb
+++ b/gcc/ada/exp_util.adb
@@ -8645,7 +8645,7 @@ package body Exp_Util is
   --  alignment is known to be at least the maximum alignment for the
   --  target or if both alignments are known and the output type's
   --  alignment is no stricter than the input's. We can use the component
-  --  type alignement for an array if a type is an unpacked array type.
+  --  type alignment for an array if a type is an unpacked array type.
 
   if Present (Alignment_Clause (Otyp)) then
  Oalign := Expr_Value (Expression (Alignment_Clause (Otyp)));
diff --git a/gcc/params.def b/gcc/params.def
index a8630463eb84bb2a71354794971ea189e4901870..62ec600ba3c88dde78150fae63591e0855e93752 100644
--- a/gcc/params.def
+++ b/gcc/params.def
@@ -405,7 +405,7 @@ DEFPARAM (PARAM_ALIGN_THRESHOLD,
 
 DEFPARAM (PARAM_ALIGN_LOOP_ITERATIONS,
 	  "align-loop-iterations",
-	  "Loops iterating at least selected number of iterations will get loop alignement..",
+	  "Loops iterating at least selected number of iterations will get loop alignment..",
 	  4, 0, 0)
 
 /* For guessed profiles, the loops having unknown number of iterations
diff --git a/gcc/testsuite/gfortran.dg/common_align_2.f90 b/gcc/testsuite/gfortran.dg/common_align_2.f90
index 09dd3e1fa0a4880333964db6e9d0d7582fbff44d..66b10e6ea9dd5bc12c3278bee7750a9efe83c855 100644
--- a/gcc/testsuite/gfortran.dg/common_align_2.f90
+++ b/gcc/testsuite/gfortran.dg/common_align_2.f90
@@ -1,6 +1,6 @@
 ! { dg-do run }
 ! { dg-options "-pedantic-errors -mdalign" { target sh*-*-* } }
-! Tests the fix for PR37614, in which the alignement of commons followed
+! Tests the fix for PR37614, in which the alignment of commons followed
 ! g77 rather than the standard or other compilers.
 !
 ! Contributed by Tobias Burnus  
diff --git a/gcc/tree.h b/gcc/tree.h
index 90413fcf2090043619e8bf190c11c066f63caa74..012fa542cf302b6668485e16ae9c7602660bc24a 100644
--- a/gcc/tree.h
+++ b/gcc/tree.h
@@ -1868,7 +1868,7 @@ extern machine_mode element_mode (const_tree t);
 
 /* The alignment necessary for objects of this type.
The value is an int, measured in bits and must be a power of two.
-   We support also an "alignement" of zero.  */
+   We support also an "alignment" of zero.  */
 #define TYPE_ALIGN(NODE) \
 (TYPE_CHECK (NODE)->type_common.align \
  ? ((unsigned)1) << ((NODE)->type_common.align - 1) : 0)


Re: [PATCH PR71347][Partial revert r235513]Compute cost for all uses in group

2016-06-20 Thread Andreas Schwab
"Bin.Cheng"  writes:

>>> The test passes on aarch64, but fails on arm targets. Maybe that's
>>> easier for Bin to reproduce?
>> Hi all,
>> Sorry for the inconvenience, I will have a look at the two targets.
> Hmm, the failure is because post-increment is enabled in IVOPT on both
> ia64 and arm.  As a result, IVOPT tends to choose iv_cand which is
> incremented after the first store.  The dump for IVOPT is as:

FWIW, the test also fails on m68k, for the same reason.

Andreas.

-- 
Andreas Schwab, sch...@linux-m68k.org
GPG Key fingerprint = 58CA 54C7 6D53 942B 1756  01D3 44D5 214B 8276 4ED5
"And now for something completely different."


[Patch, testsuite] Mark some more tests as UNSUPPORTED for avr

2016-06-20 Thread Senthil Kumar Selvaraj
Hi,

 This patch fixes some bogus failures for the avr target by requiring
 int32plus or ptr32plus support.

 Ok for trunk?

Regards
Senthil


gcc/testsuite/ChangeLog:

2016-06-20  Senthil Kumar Selvaraj  

* c-c++-common/pr68657-1.c: Require ptr32plus support.
* c-c++-common/pr68657-2.c: Likewise.
* c-c++-common/pr68657-3.c: Likewise.
* gcc.dg/torture/pr69714.c: Require int32plus support.
* gcc.dg/torture/pr70025.c: Likewise.
* gcc.dg/torture/pr70083.c: Likewise.
* gcc.dg/torture/pr70542.c: Likewise.
* gcc.dg/torture/pr70935.c: Require ptr32plus support.


diff --git gcc/testsuite/c-c++-common/pr68657-1.c 
gcc/testsuite/c-c++-common/pr68657-1.c
index 3db6f49..84f3e54 100644
--- gcc/testsuite/c-c++-common/pr68657-1.c
+++ gcc/testsuite/c-c++-common/pr68657-1.c
@@ -1,5 +1,6 @@
 /* PR c/68657 */
 /* { dg-options "-Werror=sign-conversion -Werror=float-conversion 
-Werror=frame-larger-than=65536" } */
+/* { dg-require-effective-target ptr32plus } */
 
 void
 f1 (void)
diff --git gcc/testsuite/c-c++-common/pr68657-2.c 
gcc/testsuite/c-c++-common/pr68657-2.c
index 9eb68ce..1391088 100644
--- gcc/testsuite/c-c++-common/pr68657-2.c
+++ gcc/testsuite/c-c++-common/pr68657-2.c
@@ -1,6 +1,7 @@
 /* PR c/68657 */
 /* { dg-do compile } */
 /* { dg-options "-Werror=larger-than=65536" } */
+/* { dg-require-effective-target ptr32plus } */
 
 int a[131072]; /* { dg-error "size of 'a' is \[1-9]\[0-9]* bytes" } */
 int b[1024];   /* { dg-bogus "size of 'b' is \[1-9]\[0-9]* bytes" } */
diff --git gcc/testsuite/c-c++-common/pr68657-3.c 
gcc/testsuite/c-c++-common/pr68657-3.c
index 84622fc..1e80c5b 100644
--- gcc/testsuite/c-c++-common/pr68657-3.c
+++ gcc/testsuite/c-c++-common/pr68657-3.c
@@ -1,5 +1,6 @@
 /* PR c/68657 */
 /* { dg-do compile } */
+/* { dg-require-effective-target ptr32plus } */
 
 #pragma GCC diagnostic error "-Wlarger-than=65536"
 int a[131072]; /* { dg-error "size of 'a' is \[1-9]\[0-9]* bytes" } */
diff --git gcc/testsuite/gcc.dg/torture/pr69714.c 
gcc/testsuite/gcc.dg/torture/pr69714.c
index 229b7ad..85de8be 100644
--- gcc/testsuite/gcc.dg/torture/pr69714.c
+++ gcc/testsuite/gcc.dg/torture/pr69714.c
@@ -1,5 +1,6 @@
 /* { dg-do run } */
 /* { dg-options "-fno-strict-aliasing" } */
+/* { dg-require-effective-target int32plus } */
 
 #include 
 #include 
diff --git gcc/testsuite/gcc.dg/torture/pr70025.c 
gcc/testsuite/gcc.dg/torture/pr70025.c
index dafae0b..6c43a0a 100644
--- gcc/testsuite/gcc.dg/torture/pr70025.c
+++ gcc/testsuite/gcc.dg/torture/pr70025.c
@@ -1,6 +1,7 @@
 /* PR middle-end/70025 */
 /* { dg-do run } */
 /* { dg-additional-options "-mtune=z10" { target s390*-*-* } } */
+/* { dg-require-effective-target int32plus } */
 
 typedef char (*F) (unsigned long, void *);
 typedef union { struct A { char a1, a2, a3, a4; unsigned long a5; F a6; void 
*a7; } b; char c[1]; } B;
diff --git gcc/testsuite/gcc.dg/torture/pr70083.c 
gcc/testsuite/gcc.dg/torture/pr70083.c
index 0cf2892..f33cb74 100644
--- gcc/testsuite/gcc.dg/torture/pr70083.c
+++ gcc/testsuite/gcc.dg/torture/pr70083.c
@@ -1,5 +1,6 @@
 /* { dg-do compile } */
 /* { dg-options "-Wno-psabi" } */
+/* { dg-require-effective-target int32plus } */
 
 typedef short v16hi __attribute__ ((vector_size (32)));
 typedef int v8si __attribute__ ((vector_size (32)));
diff --git gcc/testsuite/gcc.dg/torture/pr70542.c 
gcc/testsuite/gcc.dg/torture/pr70542.c
index ed7ab9d..39c7f81 100644
--- gcc/testsuite/gcc.dg/torture/pr70542.c
+++ gcc/testsuite/gcc.dg/torture/pr70542.c
@@ -1,5 +1,6 @@
 /* PR rtl-optimization/70542 */
 /* { dg-do run } */
+/* { dg-require-effective-target int32plus } */
 
 int a[113], d[113];
 short b[113], c[113], e[113];
diff --git gcc/testsuite/gcc.dg/torture/pr70935.c 
gcc/testsuite/gcc.dg/torture/pr70935.c
index eb7f034..f1dd9e4 100644
--- gcc/testsuite/gcc.dg/torture/pr70935.c
+++ gcc/testsuite/gcc.dg/torture/pr70935.c
@@ -1,5 +1,6 @@
 /* { dg-do compile } */
 /* { dg-options "-O3 -g" } */
+/* { dg-require-effective-target ptr32plus } */
 
 int d0, sj, v0, rp, zi;
 
-- 
2.7.4



Re: [PATCH PR71347][Partial revert r235513]Compute cost for all uses in group

2016-06-20 Thread Bin.Cheng
On Mon, Jun 20, 2016 at 9:20 AM, Bin.Cheng  wrote:
> On Mon, Jun 20, 2016 at 9:18 AM, Christophe Lyon
>  wrote:
>> On 18 June 2016 at 10:59, Andreas Schwab  wrote:
>>> Bin Cheng  writes:
>>>
>>>> diff --git a/gcc/testsuite/gcc.dg/tree-ssa/pr71347.c
>>>> b/gcc/testsuite/gcc.dg/tree-ssa/pr71347.c
>>>> new file mode 100644
>>>> index 000..7e5ad49
>>>> --- /dev/null
>>>> +++ b/gcc/testsuite/gcc.dg/tree-ssa/pr71347.c
>>>> @@ -0,0 +1,17 @@
>>>> +/* { dg-do compile } */
>>>> +/* { dg-options "-O2 -fdump-tree-optimized" } */
>>>> +
>>>> +double in;
>>>> +extern void Write (double);
>>>> +void foo (void)
>>>> +{
>>>> +  static double X[9];
>>>> +  int i;
>>>> +X[1] = in * in;
>>>> +for (i = 2; i <= 8; i++)
>>>> +X[i] = X[i - 1] * X[1];
>>>> +Write (X[5]);
>>>> +}
>>>> +
>>>> +/* Load of X[i - i] can be omitted by reusing X[i] in previous iteration.
>>>> */
>>>> +/* { dg-final { scan-tree-dump-not ".* = MEM.*;" "optimized"} } */
>>>
>>> The test fails on ia64, this is what I get in .optimized:
>>>
>>
>> The test passes on aarch64, but fails on arm targets. Maybe that's
>> easier for Bin to reproduce?
> Hi all,
> Sorry for the inconvenience, I will have a look at the two targets.
Hmm, the failure is because post-increment is enabled in IVOPT on both
ia64 and arm.  As a result, IVOPT tends to choose an iv_cand which is
incremented after the first store.  The IVOPT dump is as follows:


  :
  # prephitmp_20 = PHI 
  # prephitmp_22 = PHI 
  # ivtmp.23_16 = PHI 
  _6 = prephitmp_20 * prephitmp_22;
  _5 = (void *) ivtmp.23_16;
  MEM[base: _5, offset: 0B] = _6;
  ivtmp.23_8 = ivtmp.23_16 + 8;
  if (ivtmp.23_8 != _27)
goto ;
  else
goto ;

  :
  _24 = (void *) ivtmp.23_8;
  _25 = _24 + 18446744073709551608;
  pretmp_15 = MEM[base: _25, offset: 0B];
  pretmp_21 = X[1];
  goto ;

Note that the address expressions of the load and store are now of different
forms, though they have the same value.  I will look into DOM to see if it
can be improved to handle address expressions in different forms.
Also, I believe this case had already been failing for a long time before the
test was introduced, so I will mark it XFAIL for the moment.
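
As a rough illustration of "different forms, same value", the sketch below
models the two address computations from the dump with plain 64-bit unsigned
arithmetic.  The function names and the flat integer address model are
assumptions made for the example; only the constants 8 and
18446744073709551608 come from the dump.

```cpp
#include <cstdint>

// The constant added to the incremented IV in the dump.  As a 64-bit
// unsigned value it equals 2^64 - 8, i.e. it behaves like -8.
constexpr std::uint64_t minus_eight = 18446744073709551608ULL;

// Address used by the store: MEM[base: _5, offset: 0B] with _5 = ivtmp.
std::uint64_t
store_address (std::uint64_t ivtmp)
{
  return ivtmp;
}

// Address used by the load, computed from the already incremented IV.
std::uint64_t
load_address (std::uint64_t ivtmp)
{
  std::uint64_t next = ivtmp + 8;  // ivtmp.23_8 = ivtmp.23_16 + 8
  return next + minus_eight;       // _25 = _24 + 18446744073709551608
}
```

Because unsigned arithmetic wraps modulo 2^64, load_address and store_address
return the same value for any ivtmp, even though the two expressions look
different.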

Thanks,
bin
>
> Thanks,
> bin
>>
>> Christophe
>>
>>> ;; Function foo (foo, funcdef_no=0, decl_uid=1387, cgraph_uid=0, 
>>> symbol_order=1)
>>>
>>> foo ()
>>> {
>>>   int i;
>>>   static double X[9];
>>>   double in.0_1;
>>>   double in.1_2;
>>>   double _3;
>>>   int _4;
>>>   double _5;
>>>   double _6;
>>>   double _7;
>>>   double _8;
>>>
>>>   :
>>>   in.0_1 = in;
>>>   in.1_2 = in;
>>>   _3 = in.0_1 * in.1_2;
>>>   X[1] = _3;
>>>   i_13 = 2;
>>>   goto ;
>>>
>>>   :
>>>   _4 = i_9 + -1;
>>>   _5 = X[_4];
>>>   _6 = X[1];
>>>   _7 = _5 * _6;
>>>   X[i_9] = _7;
>>>   i_15 = i_9 + 1;
>>>
>>>   :
>>>   # i_9 = PHI 
>>>   if (i_9 <= 8)
>>> goto ;
>>>   else
>>> goto ;
>>>
>>>   :
>>>   _8 = X[5];
>>>   Write (_8);
>>>   return;
>>>
>>> }
>>>
>>>
>>> Andreas.
>>>
>>> --
>>> Andreas Schwab, sch...@linux-m68k.org
>>> GPG Key fingerprint = 58CA 54C7 6D53 942B 1756  01D3 44D5 214B 8276 4ED5
>>> "And now for something completely different."


Re: [PATCH] Change PRED_LOOP_EXIT from 92 to 85.

2016-06-20 Thread Martin Liška
On 06/18/2016 06:24 AM, Andrew Pinski wrote:
> This caused a 1% decrease of performance on coremarks on
> aarch64-linux-gnu on ThunderX.
> 
> Thanks,
> Andrew

Hi.

It would be good if you could run your benchmark with -fprofile-generate and
-fprofile-use -fdump-ipa-profile-details.  After that, running
contrib/analyze_brprob.py on the concatenated *.profile dumps will display
predictor results; maybe it can provide interesting numbers to compare with
the numbers obtained on CPU2006.

Thanks,
Martin



Re: [PATCH PR71347][Partial revert r235513]Compute cost for all uses in group

2016-06-20 Thread Bin.Cheng
On Mon, Jun 20, 2016 at 9:18 AM, Christophe Lyon
 wrote:
> On 18 June 2016 at 10:59, Andreas Schwab  wrote:
>> Bin Cheng  writes:
>>
>>> diff --git a/gcc/testsuite/gcc.dg/tree-ssa/pr71347.c b/gcc/testsuite/gcc.dg/tree-ssa/pr71347.c
>>> new file mode 100644
>>> index 000..7e5ad49
>>> --- /dev/null
>>> +++ b/gcc/testsuite/gcc.dg/tree-ssa/pr71347.c
>>> @@ -0,0 +1,17 @@
>>> +/* { dg-do compile } */
>>> +/* { dg-options "-O2 -fdump-tree-optimized" } */
>>> +
>>> +double in;
>>> +extern void Write (double);
>>> +void foo (void)
>>> +{
>>> +  static double X[9];
>>> +  int i;
>>> +X[1] = in * in;
>>> +for (i = 2; i <= 8; i++)
>>> +X[i] = X[i - 1] * X[1];
>>> +Write (X[5]);
>>> +}
>>> +
>>> +/* Load of X[i - 1] can be omitted by reusing X[i] in previous iteration.  */
>>> +/* { dg-final { scan-tree-dump-not ".* = MEM.*;" "optimized"} } */
>>
>> The test fails on ia64, this is what I get in .optimized:
>>
>
> The test passes on aarch64, but fails on arm targets. Maybe that's
> easier for Bin to reproduce?
Hi all,
Sorry for the inconvenience, I will have a look at the two targets.

Thanks,
bin
>
> Christophe
>
>> ;; Function foo (foo, funcdef_no=0, decl_uid=1387, cgraph_uid=0, symbol_order=1)
>>
>> foo ()
>> {
>>   int i;
>>   static double X[9];
>>   double in.0_1;
>>   double in.1_2;
>>   double _3;
>>   int _4;
>>   double _5;
>>   double _6;
>>   double _7;
>>   double _8;
>>
>>   :
>>   in.0_1 = in;
>>   in.1_2 = in;
>>   _3 = in.0_1 * in.1_2;
>>   X[1] = _3;
>>   i_13 = 2;
>>   goto ;
>>
>>   :
>>   _4 = i_9 + -1;
>>   _5 = X[_4];
>>   _6 = X[1];
>>   _7 = _5 * _6;
>>   X[i_9] = _7;
>>   i_15 = i_9 + 1;
>>
>>   :
>>   # i_9 = PHI 
>>   if (i_9 <= 8)
>> goto ;
>>   else
>> goto ;
>>
>>   :
>>   _8 = X[5];
>>   Write (_8);
>>   return;
>>
>> }
>>
>>
>> Andreas.
>>
>> --
>> Andreas Schwab, sch...@linux-m68k.org
>> GPG Key fingerprint = 58CA 54C7 6D53 942B 1756  01D3 44D5 214B 8276 4ED5
>> "And now for something completely different."


Re: [PATCH PR71347][Partial revert r235513]Compute cost for all uses in group

2016-06-20 Thread Christophe Lyon
On 18 June 2016 at 10:59, Andreas Schwab  wrote:
> Bin Cheng  writes:
>
>> diff --git a/gcc/testsuite/gcc.dg/tree-ssa/pr71347.c b/gcc/testsuite/gcc.dg/tree-ssa/pr71347.c
>> new file mode 100644
>> index 000..7e5ad49
>> --- /dev/null
>> +++ b/gcc/testsuite/gcc.dg/tree-ssa/pr71347.c
>> @@ -0,0 +1,17 @@
>> +/* { dg-do compile } */
>> +/* { dg-options "-O2 -fdump-tree-optimized" } */
>> +
>> +double in;
>> +extern void Write (double);
>> +void foo (void)
>> +{
>> +  static double X[9];
>> +  int i;
>> +X[1] = in * in;
>> +for (i = 2; i <= 8; i++)
>> +X[i] = X[i - 1] * X[1];
>> +Write (X[5]);
>> +}
>> +
>> +/* Load of X[i - 1] can be omitted by reusing X[i] in previous iteration.  */
>> +/* { dg-final { scan-tree-dump-not ".* = MEM.*;" "optimized"} } */
>
> The test fails on ia64, this is what I get in .optimized:
>

The test passes on aarch64, but fails on arm targets. Maybe that's
easier for Bin to reproduce?

Christophe

> ;; Function foo (foo, funcdef_no=0, decl_uid=1387, cgraph_uid=0, symbol_order=1)
>
> foo ()
> {
>   int i;
>   static double X[9];
>   double in.0_1;
>   double in.1_2;
>   double _3;
>   int _4;
>   double _5;
>   double _6;
>   double _7;
>   double _8;
>
>   :
>   in.0_1 = in;
>   in.1_2 = in;
>   _3 = in.0_1 * in.1_2;
>   X[1] = _3;
>   i_13 = 2;
>   goto ;
>
>   :
>   _4 = i_9 + -1;
>   _5 = X[_4];
>   _6 = X[1];
>   _7 = _5 * _6;
>   X[i_9] = _7;
>   i_15 = i_9 + 1;
>
>   :
>   # i_9 = PHI 
>   if (i_9 <= 8)
> goto ;
>   else
> goto ;
>
>   :
>   _8 = X[5];
>   Write (_8);
>   return;
>
> }
>
>
> Andreas.
>
> --
> Andreas Schwab, sch...@linux-m68k.org
> GPG Key fingerprint = 58CA 54C7 6D53 942B 1756  01D3 44D5 214B 8276 4ED5
> "And now for something completely different."


[PATCH 5/7] remove m32r-rtems support

2016-06-20 Thread tbsaunde+gcc
From: Trevor Saunders 

libgcc/ChangeLog:

2016-06-20  Trevor Saunders  

* config.host: Remove m32r-rtems support.

gcc/ChangeLog:

2016-06-20  Trevor Saunders  

* config.gcc: Remove m32r-rtems support.
* config/m32r/rtems.h: Remove.

contrib/ChangeLog:

2016-06-20  Trevor Saunders  

* config-list.mk: Stop testing m32r-rtems.
---
 contrib/config-list.mk  |  1 -
 gcc/config.gcc  |  5 -
 gcc/config/m32r/rtems.h | 33 -
 libgcc/config.host  |  4 
 4 files changed, 43 deletions(-)
 delete mode 100644 gcc/config/m32r/rtems.h

diff --git a/contrib/config-list.mk b/contrib/config-list.mk
index 5ea6d6c..ca20c98 100644
--- a/contrib/config-list.mk
+++ b/contrib/config-list.mk
@@ -55,7 +55,6 @@ LIST = aarch64-elf aarch64-linux-gnu aarch64-rtems \
   i686-cygwinOPT-enable-threads=yes i686-mingw32crt ia64-elf \
   ia64-freebsd6 ia64-linux ia64-hpux ia64-hp-vms iq2000-elf lm32-elf \
   lm32-rtems lm32-uclinux m32c-rtems m32c-elf m32r-elf m32rle-elf \
-  m32r-rtemsOPT-enable-obsolete \
   m32r-linux m32rle-linux m68k-elf m68k-netbsdelf \
   m68k-openbsd m68k-uclinux m68k-linux m68k-rtems \
   mcore-elf mep-elfOPT-enable-obsolete microblaze-linux microblaze-elf \
diff --git a/gcc/config.gcc b/gcc/config.gcc
index cb2923e..c189f59 100644
--- a/gcc/config.gcc
+++ b/gcc/config.gcc
@@ -237,7 +237,6 @@ md_file=
 # Obsolete configurations.
 case ${target} in
  avr-*rtems*   \
- | m32r-*rtems*\
  | mep-*   \
  )
 if test "x$enable_obsolete" != xyes; then
@@ -1829,10 +1828,6 @@ m32r-*-elf*)
 m32rle-*-elf*)
tm_file="dbxelf.h elfos.h newlib-stdint.h m32r/little.h ${tm_file}"
;;
-m32r-*-rtems*)
-   tm_file="dbxelf.h elfos.h ${tm_file} m32r/rtems.h rtems.h newlib-stdint.h"
-   tmake_file="${tmake_file} m32r/t-m32r"
-   ;;
 m32r-*-linux*)
tm_file="dbxelf.h elfos.h gnu-user.h linux.h glibc-stdint.h ${tm_file} m32r/linux.h"
tmake_file="${tmake_file} m32r/t-linux t-slibgcc"
diff --git a/gcc/config/m32r/rtems.h b/gcc/config/m32r/rtems.h
deleted file mode 100644
index 839b4e0..000
--- a/gcc/config/m32r/rtems.h
+++ /dev/null
@@ -1,33 +0,0 @@
-/* Definitions for rtems targeting a M32R using ELF.
-   Copyright (C) 2009-2016 Free Software Foundation, Inc.
-   Contributed by Joel Sherrill (j...@oarcorp.com).
-
-This file is part of GCC.
-
-GCC is free software; you can redistribute it and/or modify
-it under the terms of the GNU General Public License as published by
-the Free Software Foundation; either version 3, or (at your option)
-any later version.
-
-GCC is distributed in the hope that it will be useful,
-but WITHOUT ANY WARRANTY; without even the implied warranty of
-MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
-GNU General Public License for more details.
-
-You should have received a copy of the GNU General Public License
-along with GCC; see the file COPYING3.  If not see
-.  */
-
-/* Target OS builtins.  */
-#undef TARGET_OS_CPP_BUILTINS
-#define TARGET_OS_CPP_BUILTINS()   \
-  do   \
-{  \
-   builtin_define ("__rtems__");   \
-   builtin_define ("__USE_INIT_FINI__");   \
-   builtin_assert ("system=rtems");\
-}  \
-  while (0)
-
-/* Use the default */
-#undef LINK_GCC_C_SEQUENCE_SPEC
diff --git a/libgcc/config.host b/libgcc/config.host
index 319810f..3f8d0a8 100644
--- a/libgcc/config.host
+++ b/libgcc/config.host
@@ -790,10 +790,6 @@ lm32-*-uclinux*)
 m32r-*-elf*)
tmake_file=t-fdpbit
;;
-m32r-*-rtems*)
-   tmake_file="$tmake_file m32r/t-m32r t-fdpbit"
-   extra_parts="$extra_parts crtinit.o crtfini.o"
-   ;;
 m32rle-*-elf*)
tmake_file=t-fdpbit
;;
-- 
2.7.0



[PATCH 6/7] remove avr-rtems support

2016-06-20 Thread tbsaunde+gcc
From: Trevor Saunders 

contrib/ChangeLog:

2016-06-20  Trevor Saunders  

* config-list.mk: Stop testing avr-rtems.

libgcc/ChangeLog:

2016-06-20  Trevor Saunders  

* config.host: Remove support for avr-rtems.
* config/avr/t-rtems: Remove.

ChangeLog:

2016-06-20  Trevor Saunders  

* configure: Regenerate.
* configure.ac: Remove support for avr-rtems.

gcc/ChangeLog:

2016-06-20  Trevor Saunders  

* config.gcc: Remove support for avr-rtems.
* config/avr/gen-avr-mmcu-specs.c: Likewise.
* config/avr/rtems.h: Remove.
* config/avr/t-rtems: Remove.

contrib/header-tools/ChangeLog:

2016-06-20  Trevor Saunders  

* README: Remove references to avr-rtems.
* reduce-headers: Likewise.
---
 configure   |  4 +---
 configure.ac|  2 --
 contrib/config-list.mk  |  2 +-
 contrib/header-tools/README |  2 +-
 contrib/header-tools/reduce-headers |  1 -
 gcc/config.gcc  | 10 +-
 gcc/config/avr/gen-avr-mmcu-specs.c | 12 +---
 gcc/config/avr/rtems.h  | 27 ---
 gcc/config/avr/t-rtems  |  3 ---
 libgcc/config.host  |  6 --
 libgcc/config/avr/t-rtems   |  2 --
 11 files changed, 5 insertions(+), 66 deletions(-)
 delete mode 100644 gcc/config/avr/rtems.h
 delete mode 100644 gcc/config/avr/t-rtems
 delete mode 100644 libgcc/config/avr/t-rtems

diff --git a/configure b/configure
index ea63784..04cb999 100755
--- a/configure
+++ b/configure
@@ -3762,8 +3762,6 @@ case "${target}" in
   arm-*-riscix*)
 noconfigdirs="$noconfigdirs ld target-libgloss"
 ;;
-  avr-*-rtems*)
-;;
   avr-*-*)
 if test x${with_avrlibc} != xno; then
   noconfigdirs="$noconfigdirs target-newlib target-libgloss"
@@ -6128,7 +6126,7 @@ target_elf=no
 case $target in
   *-darwin* | *-aix* | *-cygwin* | *-mingw* | *-aout* | *-*coff* | \
   *-msdosdjgpp* | *-vms* | *-wince* | *-*-pe* | \
-  alpha*-dec-osf* | *-interix* | hppa[12]*-*-hpux* | \
+  alpha*-dec-osf* | hppa[12]*-*-hpux* | \
   nvptx-*-none)
 target_elf=no
 ;;
diff --git a/configure.ac b/configure.ac
index 54558df..4031ac6 100644
--- a/configure.ac
+++ b/configure.ac
@@ -1098,8 +1098,6 @@ case "${target}" in
   arm-*-riscix*)
 noconfigdirs="$noconfigdirs ld target-libgloss"
 ;;
-  avr-*-rtems*)
-;;
   avr-*-*)
 if test x${with_avrlibc} != xno; then
   noconfigdirs="$noconfigdirs target-newlib target-libgloss"
diff --git a/contrib/config-list.mk b/contrib/config-list.mk
index ca20c98..dfebcee 100644
--- a/contrib/config-list.mk
+++ b/contrib/config-list.mk
@@ -37,7 +37,7 @@ LIST = aarch64-elf aarch64-linux-gnu aarch64-rtems \
   arc-linux-uclibcOPT-with-cpu=arc700 arceb-linux-uclibcOPT-with-cpu=arc700 \
   arm-wrs-vxworks arm-netbsdelf \
   arm-linux-androideabi arm-uclinux_eabi arm-eabi arm-rtems \
-  arm-symbianelf avr-rtemsOPT-enable-obsolete avr-elf \
+  arm-symbianelf avr-elf \
   bfin-elf bfin-uclinux bfin-linux-uclibc bfin-rtems bfin-openbsd \
   c6x-elf c6x-uclinux cr16-elf cris-elf cris-linux crisv32-elf crisv32-linux \
   epiphany-elf epiphany-elfOPT-with-stack-offset=16 fido-elf \
diff --git a/contrib/header-tools/README b/contrib/header-tools/README
index 05d3b97..3b20e51 100644
--- a/contrib/header-tools/README
+++ b/contrib/header-tools/README
@@ -203,7 +203,7 @@ reduce-headers
   these targets.  They are also known to the tool.  When building targets it
   will check those targets before the rest.  
   This coverage can be achieved by building config-list.mk with :
-  LIST="aarch64-linux-gnu arm-netbsdelf avr-rtems c6x-elf epiphany-elf hppa2.0-hpux10.1 i686-mingw32crt i686-pc-msdosdjgpp mipsel-elf powerpc-eabisimaltivec rs6000-ibm-aix5.1.0 sh-superh-elf sparc64-elf spu-elf"
+  LIST="aarch64-linux-gnu arm-netbsdelf c6x-elf epiphany-elf hppa2.0-hpux10.1 i686-mingw32crt i686-pc-msdosdjgpp mipsel-elf powerpc-eabisimaltivec rs6000-ibm-aix5.1.0 sh-superh-elf sparc64-elf spu-elf"
 
   -b specifies the native bootstrapped build root directory
   -t specifies a target build root directory that config-list.mk was run from
diff --git a/contrib/header-tools/reduce-headers b/contrib/header-tools/reduce-headers
index e4f4d7b..26a7df9 100755
--- a/contrib/header-tools/reduce-headers
+++ b/contrib/header-tools/reduce-headers
@@ -23,7 +23,6 @@ no_remove = [ "system.h", "coretypes.h", "config.h" , "bconfig.h", "backend.h" ]
 target_priority = [
 "aarch64-linux-gnu",
 "arm-netbsdelf",
-"avr-rtems",
 "c6x-elf",
 "epiphany-elf",
 "hppa2.0-hpux10.1",
diff --git a/gcc/config.gcc b/gcc/config.gcc
index c189f59..612a333 100644
--- a/gcc/config.gcc
+++ b/gcc/config.gcc
@@ -236,8 +236,7 @@ md_file=
 
 # Obsolete 

[PATCH 3/7] remove knetbsd support

2016-06-20 Thread tbsaunde+gcc
From: Trevor Saunders 

gcc/ChangeLog:

2016-06-20  Trevor Saunders  

* config.gcc: Remove support for knetbsd.
* configure.ac: Likewise.
* config/i386/knetbsd-gnu.h: Remove.
* config/i386/knetbsd-gnu64.h: Remove.
* config/knetbsd-gnu.h: Remove.
* configure: Regenerate.

libgcc/ChangeLog:

2016-06-20  Trevor Saunders  

* config.host: Remove support for knetbsd.

libstdc++-v3/ChangeLog:

2016-06-20  Trevor Saunders  

* configure: Regenerate.
* configure.host: Remove support for knetbsd.
* crossconfig.m4: Likewise.

contrib/ChangeLog:

2016-06-20  Trevor Saunders  

* config-list.mk: Stop testing knetbsd.
---
 contrib/config-list.mk  |  4 ++--
 gcc/config.gcc  | 17 +
 gcc/config/i386/knetbsd-gnu.h   | 21 -
 gcc/config/i386/knetbsd-gnu64.h | 26 --
 gcc/config/knetbsd-gnu.h| 35 ---
 gcc/configure   | 10 +++---
 gcc/configure.ac|  2 +-
 libgcc/config.host  |  7 +++
 libstdc++-v3/configure  |  2 +-
 libstdc++-v3/configure.host |  2 +-
 libstdc++-v3/crossconfig.m4 |  2 +-
 11 files changed, 17 insertions(+), 111 deletions(-)
 delete mode 100644 gcc/config/i386/knetbsd-gnu.h
 delete mode 100644 gcc/config/i386/knetbsd-gnu64.h
 delete mode 100644 gcc/config/knetbsd-gnu.h

diff --git a/contrib/config-list.mk b/contrib/config-list.mk
index a437ece..2fe90bb 100644
--- a/contrib/config-list.mk
+++ b/contrib/config-list.mk
@@ -48,7 +48,7 @@ LIST = aarch64-elf aarch64-linux-gnu aarch64-rtems \
   hppa64-hpux11.0OPT-enable-sjlj-exceptions=yes hppa2.0-hpux11.9 \
   i686-pc-linux-gnu i686-apple-darwin i686-apple-darwin9 i686-apple-darwin10 \
   i486-freebsd4 i686-freebsd6 i686-kfreebsd-gnu \
-  i686-netbsdelf9 i686-knetbsd-gnuOPT-enable-obsolete \
+  i686-netbsdelf9 \
   i686-openbsd i686-elf i686-kopensolaris-gnu i686-symbolics-gnu \
   i686-pc-msdosdjgpp i686-lynxos i686-nto-qnx \
   i686-rtems i686-solaris2.10 i686-wrs-vxworks \
@@ -95,7 +95,7 @@ LIST = aarch64-elf aarch64-linux-gnu aarch64-rtems \
   vax-netbsdelf vax-openbsd visium-elf x86_64-apple-darwin \
   x86_64-pc-linux-gnuOPT-with-fpmath=avx \
   x86_64-elfOPT-with-fpmath=sse x86_64-freebsd6 x86_64-netbsd \
-  x86_64-knetbsd-gnuOPT-enable-obsolete x86_64-w64-mingw32 \
+  x86_64-w64-mingw32 \
   x86_64-mingw32OPT-enable-sjlj-exceptions=yes x86_64-rtems \
   xstormy16-elf xtensa-elf \
   xtensa-linux
diff --git a/gcc/config.gcc b/gcc/config.gcc
index 7b091fb..669cb9f 100644
--- a/gcc/config.gcc
+++ b/gcc/config.gcc
@@ -236,8 +236,7 @@ md_file=
 
 # Obsolete configurations.
 case ${target} in
- *-knetbsd-*   \
- | avr-*rtems* \
+ avr-*rtems*   \
  | h8300-*rtems*   \
  | m32r-*rtems*\
  | mep-*   \
@@ -707,7 +706,7 @@ case ${target} in
   esac
   use_gcc_stdint=wrap
   ;;
-*-*-linux* | frv-*-*linux* | *-*-kfreebsd*-gnu | *-*-knetbsd*-gnu | *-*-gnu* | *-*-kopensolaris*-gnu)
+*-*-linux* | frv-*-*linux* | *-*-kfreebsd*-gnu | *-*-gnu* | *-*-kopensolaris*-gnu)
   extra_options="$extra_options gnu-user.opt"
   gas=yes
   gnu_ld=yes
@@ -716,7 +715,7 @@ case ${target} in
   esac
   tmake_file="t-slibgcc"
   case $target in
-*-*-linux* | frv-*-*linux* | *-*-kfreebsd*-gnu | *-*-knetbsd*-gnu | *-*-kopensolaris*-gnu)
+*-*-linux* | frv-*-*linux* | *-*-kfreebsd*-gnu | *-*-kopensolaris*-gnu)
   :;;
 *-*-gnu*)
   native_system_header_dir=/include
@@ -1459,7 +1458,7 @@ x86_64-*-openbsd*)
gas=yes
gnu_ld=yes
;;
-i[34567]86-*-linux* | i[34567]86-*-kfreebsd*-gnu | i[34567]86-*-knetbsd*-gnu | i[34567]86-*-gnu* | i[34567]86-*-kopensolaris*-gnu)
+i[34567]86-*-linux* | i[34567]86-*-kfreebsd*-gnu | i[34567]86-*-gnu* | i[34567]86-*-kopensolaris*-gnu)
# Intel 80386's running GNU/*
# with ELF format using glibc 2
tm_file="${tm_file} i386/unix.h i386/att.h dbxelf.h elfos.h gnu-user.h glibc-stdint.h"
@@ -1515,9 +1514,6 @@ i[34567]86-*-linux* | i[34567]86-*-kfreebsd*-gnu | i[34567]86-*-knetbsd*-gnu | i
tm_file="${tm_file} i386/gnu-user-common.h i386/gnu-user.h i386/linux-common.h i386/linux.h"
fi
;;
-   i[34567]86-*-knetbsd*-gnu)
-   tm_file="${tm_file} i386/gnu-user-common.h i386/gnu-user.h knetbsd-gnu.h i386/knetbsd-gnu.h"
-   ;;
i[34567]86-*-kfreebsd*-gnu)
tm_file="${tm_file} i386/gnu-user-common.h i386/gnu-user.h kfreebsd-gnu.h i386/kfreebsd-gnu.h"
;;
@@ -1529,7 +1525,7 @@ i[34567]86-*-linux* | 

[PATCH 4/7] remove h8300-rtems support

2016-06-20 Thread tbsaunde+gcc
From: Trevor Saunders 

contrib/ChangeLog:

2016-06-20  Trevor Saunders  

* config-list.mk: Remove h8300-rtems support.

libgcc/ChangeLog:

2016-06-20  Trevor Saunders  

* config.host: Remove h8300-rtems support.

gcc/ChangeLog:

2016-06-20  Trevor Saunders  

* config.gcc: Remove h8300-rtems support.
* config/h8300/rtems.h: Remove.
* config/h8300/t-rtems: Remove.
---
 contrib/config-list.mk   |  3 +--
 gcc/config.gcc   |  5 -
 gcc/config/h8300/rtems.h | 29 -
 gcc/config/h8300/t-rtems |  7 ---
 libgcc/config.host   |  5 -
 5 files changed, 1 insertion(+), 48 deletions(-)
 delete mode 100644 gcc/config/h8300/rtems.h
 delete mode 100644 gcc/config/h8300/t-rtems

diff --git a/contrib/config-list.mk b/contrib/config-list.mk
index 2fe90bb..5ea6d6c 100644
--- a/contrib/config-list.mk
+++ b/contrib/config-list.mk
@@ -41,8 +41,7 @@ LIST = aarch64-elf aarch64-linux-gnu aarch64-rtems \
   bfin-elf bfin-uclinux bfin-linux-uclibc bfin-rtems bfin-openbsd \
   c6x-elf c6x-uclinux cr16-elf cris-elf cris-linux crisv32-elf crisv32-linux \
   epiphany-elf epiphany-elfOPT-with-stack-offset=16 fido-elf \
-  fr30-elf frv-elf frv-linux ft32-elf h8300-elf \
-  h8300-rtemsOPT-enable-obsolete hppa-linux-gnu \
+  fr30-elf frv-elf frv-linux ft32-elf h8300-elf hppa-linux-gnu \
   hppa-linux-gnuOPT-enable-sjlj-exceptions=yes hppa64-linux-gnu \
   hppa2.0-hpux10.1 hppa64-hpux11.3 \
   hppa64-hpux11.0OPT-enable-sjlj-exceptions=yes hppa2.0-hpux11.9 \
diff --git a/gcc/config.gcc b/gcc/config.gcc
index 669cb9f..cb2923e 100644
--- a/gcc/config.gcc
+++ b/gcc/config.gcc
@@ -237,7 +237,6 @@ md_file=
 # Obsolete configurations.
 case ${target} in
  avr-*rtems*   \
- | h8300-*rtems*   \
  | m32r-*rtems*\
  | mep-*   \
  )
@@ -1239,10 +1238,6 @@ moxie-*-moxiebox*)
tm_file="${tm_file} dbxelf.h elfos.h moxie/moxiebox.h newlib-stdint.h"
tmake_file="${tmake_file} moxie/t-moxiebox"
;;
-h8300-*-rtems*)
-   tmake_file="${tmake_file} h8300/t-h8300 h8300/t-rtems"
-   tm_file="h8300/h8300.h dbxelf.h elfos.h h8300/elf.h h8300/rtems.h rtems.h newlib-stdint.h"
-   ;;
 h8300-*-elf*)
tmake_file="h8300/t-h8300"
tm_file="h8300/h8300.h dbxelf.h elfos.h newlib-stdint.h h8300/elf.h"
diff --git a/gcc/config/h8300/rtems.h b/gcc/config/h8300/rtems.h
deleted file mode 100644
index eb16f1d..000
--- a/gcc/config/h8300/rtems.h
+++ /dev/null
@@ -1,29 +0,0 @@
-/* Definitions for rtems targeting a H8
-   Copyright (C) 1996-2016 Free Software Foundation, Inc.
-   Contributed by Joel Sherrill (j...@oarcorp.com).
-
-This file is part of GCC.
-
-GCC is free software; you can redistribute it and/or modify
-it under the terms of the GNU General Public License as published by
-the Free Software Foundation; either version 3, or (at your option)
-any later version.
-
-GCC is distributed in the hope that it will be useful,
-but WITHOUT ANY WARRANTY; without even the implied warranty of
-MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
-GNU General Public License for more details.
-
-You should have received a copy of the GNU General Public License
-along with GCC; see the file COPYING3.  If not see
-.  */
-
-/* Target OS preprocessor built-ins.  */
-#define TARGET_OS_CPP_BUILTINS()   \
-  do   \
-{  \
-  builtin_define_std ("h8300");\
-  builtin_define ("__rtems__");\
-  builtin_assert ("system=rtems"); \
-}  \
-  while (0)
diff --git a/gcc/config/h8300/t-rtems b/gcc/config/h8300/t-rtems
deleted file mode 100644
index 0d76437..000
--- a/gcc/config/h8300/t-rtems
+++ /dev/null
@@ -1,7 +0,0 @@
-# Custom multilibs for RTEMS
-
-# -mn is not applicable to RTEMS (-mn implies 16bit void*)
-
-MULTILIB_OPTIONS = mh/ms/msx mint32
-MULTILIB_DIRNAMES = h8300h h8300s h8sx int32
-MULTILIB_EXCEPTIONS = mint32
diff --git a/libgcc/config.host b/libgcc/config.host
index 12b69cf..319810f 100644
--- a/libgcc/config.host
+++ b/libgcc/config.host
@@ -495,11 +495,6 @@ ft32-*-elf)
tmake_file="ft32/t-ft32 t-softfp-sfdf t-softfp-excl t-softfp"
extra_parts="$extra_parts crti.o crti-hw.o crtn.o"
;;
-h8300-*-rtems*)
-   tmake_file="$tmake_file h8300/t-h8300 t-fpbit"
-   tm_file="$tm_file h8300/h8300-lib.h"
-   extra_parts="$extra_parts crti.o crtn.o"
-   ;;
 h8300-*-elf*)
tmake_file="$tmake_file h8300/t-h8300 t-fpbit"
tm_file="$tm_file h8300/h8300-lib.h"
-- 
2.7.0



[PATCH 2/7] remove support for targeting openbsd 2 or 3

2016-06-20 Thread tbsaunde+gcc
From: Trevor Saunders 

contrib/ChangeLog:

2016-06-20  Trevor Saunders  

* config-list.mk: Stop testing openbsd3.0.

libgcc/ChangeLog:

2016-06-20  Trevor Saunders  

* config.host: Remove support for openbsd 2 and 3.

gcc/ChangeLog:

2016-06-20  Trevor Saunders  

* config.gcc: Remove support for openbsd 2 and 3.
* config/openbsd-oldgas.h: Remove.
---
 contrib/config-list.mk  |  5 ++---
 gcc/config.gcc  | 14 --
 gcc/config/openbsd-oldgas.h | 26 --
 libgcc/config.host  |  2 --
 4 files changed, 2 insertions(+), 45 deletions(-)
 delete mode 100644 gcc/config/openbsd-oldgas.h

diff --git a/contrib/config-list.mk b/contrib/config-list.mk
index 832403a..a437ece 100644
--- a/contrib/config-list.mk
+++ b/contrib/config-list.mk
@@ -49,9 +49,8 @@ LIST = aarch64-elf aarch64-linux-gnu aarch64-rtems \
   i686-pc-linux-gnu i686-apple-darwin i686-apple-darwin9 i686-apple-darwin10 \
   i486-freebsd4 i686-freebsd6 i686-kfreebsd-gnu \
   i686-netbsdelf9 i686-knetbsd-gnuOPT-enable-obsolete \
-  i686-openbsd i686-openbsd3.0OPT-enable-obsolete \
-  i686-elf i686-kopensolaris-gnu i686-symbolics-gnu i686-pc-msdosdjgpp \
-  i686-lynxos i686-nto-qnx \
+  i686-openbsd i686-elf i686-kopensolaris-gnu i686-symbolics-gnu \
+  i686-pc-msdosdjgpp i686-lynxos i686-nto-qnx \
   i686-rtems i686-solaris2.10 i686-wrs-vxworks \
   i686-wrs-vxworksae \
   i686-cygwinOPT-enable-threads=yes i686-mingw32crt ia64-elf \
diff --git a/gcc/config.gcc b/gcc/config.gcc
index 34da23e..7b091fb 100644
--- a/gcc/config.gcc
+++ b/gcc/config.gcc
@@ -237,8 +237,6 @@ md_file=
 # Obsolete configurations.
 case ${target} in
  *-knetbsd-*   \
- | *-openbsd2* \
- | *-openbsd3* \
  | avr-*rtems* \
  | h8300-*rtems*   \
  | m32r-*rtems*\
@@ -805,10 +803,6 @@ case ${target} in
   ;;
   esac
   case ${target} in
-*-*-openbsd2.*|*-*-openbsd3.[012])
-  tm_defines="${tm_defines} HAS_LIBC_R=1" ;;
-  esac
-  case ${target} in
 *-*-openbsd4.[3-9]|*-*-openbsd[5-9]*)
   default_use_cxa_atexit=yes
   ;;
@@ -1451,14 +1445,6 @@ x86_64-*-netbsd*)
tm_file="${tm_file} i386/unix.h i386/att.h dbxelf.h elfos.h netbsd.h netbsd-elf.h i386/x86-64.h i386/netbsd64.h"
extra_options="${extra_options} netbsd.opt netbsd-elf.opt"
;;
-i[34567]86-*-openbsd2.*|i[34567]86-*openbsd3.[0123])
-   tm_file="i386/i386.h i386/unix.h i386/bsd.h i386/gas.h i386/gstabs.h openbsd-oldgas.h openbsd.h i386/openbsd.h"
-   extra_options="${extra_options} openbsd.opt"
-   # needed to unconfuse gdb
-   tmake_file="${tmake_file} t-openbsd i386/t-openbsd"
-   # we need collect2 until our bug is fixed...
-   use_collect2=yes
-   ;;
 i[34567]86-*-openbsd*)
tm_file="${tm_file} i386/unix.h i386/att.h dbxelf.h elfos.h"
tm_file="${tm_file} openbsd.h openbsd-stdint.h openbsd-libpthread.h i386/openbsdelf.h"
diff --git a/gcc/config/openbsd-oldgas.h b/gcc/config/openbsd-oldgas.h
deleted file mode 100644
index 34e88bf..000
--- a/gcc/config/openbsd-oldgas.h
+++ /dev/null
@@ -1,26 +0,0 @@
-/* Generic settings for a.out OpenBSD systems.
-   Copyright (C) 2002-2016 Free Software Foundation, Inc.
-   Contributed by David E. O'Brien .
-
-This file is part of GCC.
-
-GCC is free software; you can redistribute it and/or modify
-it under the terms of the GNU General Public License as published by
-the Free Software Foundation; either version 3, or (at your option)
-any later version.
-
-GCC is distributed in the hope that it will be useful,
-but WITHOUT ANY WARRANTY; without even the implied warranty of
-MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
-GNU General Public License for more details.
-
-You should have received a copy of the GNU General Public License
-along with GCC; see the file COPYING3.  If not see
-.  */
-
-
-#define OBSD_OLD_GAS
-
-/* OpenBSD3.0 had no libpthread, pthreads lived in -lc_r */
-#define OBSD_LIB_SPEC "%{!shared:-lc%{pthread:_r}}"
-
diff --git a/libgcc/config.host b/libgcc/config.host
index 2f55fbd..0d9bb0d 100644
--- a/libgcc/config.host
+++ b/libgcc/config.host
@@ -603,8 +603,6 @@ i[34567]86-*-netbsdelf*)
 x86_64-*-netbsd*)
tmake_file="${tmake_file} i386/t-crtstuff"
;;
-i[34567]86-*-openbsd2.*|i[34567]86-*openbsd3.[0123])
-   ;;
 i[34567]86-*-openbsd*)
;;
 x86_64-*-openbsd*)
-- 
2.7.0



[PATCH 1/7] remove support for the interix target

2016-06-20 Thread tbsaunde+gcc
From: Trevor Saunders 

contrib/ChangeLog:

2016-06-20  Trevor Saunders  

* config-list.mk: Remove interix target.

libgcc/ChangeLog:

2016-06-20  Trevor Saunders  

* config.host: Remove interix support.
* config/i386/t-interix: Remove.

config/ChangeLog:

2016-06-20  Trevor Saunders  

* elf.m4: Remove interix support.
* picflag.m4: Likewise.

fixincludes/ChangeLog:

2016-06-20  Trevor Saunders  

* mkfixinc.sh: Remove interix support.

gcc/ChangeLog:

2016-06-20  Trevor Saunders  

* config.gcc: Remove interix support.
* config/i386/i386-interix.h: Remove.
* config/i386/interix.opt: Remove.
* config/i386/t-interix: Remove.
* configure: Regenerate.
* configure.ac: Remove interix support.
* doc/install.texi: Remove interix documentation.

gcc/testsuite/ChangeLog:

2016-06-20  Trevor Saunders  

* gcc.dg/attr-ms_struct-1.c: Stop testing interix.
* gcc.dg/attr-ms_struct-2.c: Likewise.
* gcc.dg/attr-ms_struct-packed1.c: Likewise.
* gcc.dg/bf-ms-attrib.c: Likewise.
* gcc.dg/bf-ms-layout-2.c: Likewise.
* gcc.dg/bf-ms-layout-3.c: Likewise.
* gcc.dg/bf-ms-layout.c: Likewise.
* gcc.dg/bf-no-ms-layout.c: Likewise.
* gcc.target/i386/bitfield1.c: Likewise.
* gcc.target/i386/bitfield2.c: Likewise.
* gcc.target/i386/bitfield3.c: Likewise.
---
 config/elf.m4 |   2 +-
 config/picflag.m4 |   4 -
 contrib/config-list.mk|   3 +-
 fixincludes/mkfixinc.sh   |   1 -
 gcc/config.gcc|  16 +-
 gcc/config/i386/i386-interix.h| 345 --
 gcc/config/i386/interix.opt   |  34 ---
 gcc/config/i386/t-interix |  30 ---
 gcc/configure |   2 +-
 gcc/configure.ac  |   2 +-
 gcc/doc/install.texi  |  14 --
 gcc/testsuite/gcc.dg/attr-ms_struct-1.c   |   2 +-
 gcc/testsuite/gcc.dg/attr-ms_struct-2.c   |   2 +-
 gcc/testsuite/gcc.dg/attr-ms_struct-packed1.c |   2 +-
 gcc/testsuite/gcc.dg/bf-ms-attrib.c   |   2 +-
 gcc/testsuite/gcc.dg/bf-ms-layout-2.c |   2 +-
 gcc/testsuite/gcc.dg/bf-ms-layout-3.c |   2 +-
 gcc/testsuite/gcc.dg/bf-ms-layout.c   |   2 +-
 gcc/testsuite/gcc.dg/bf-no-ms-layout.c|   2 +-
 gcc/testsuite/gcc.target/i386/bitfield1.c |   2 +-
 gcc/testsuite/gcc.target/i386/bitfield2.c |   2 +-
 gcc/testsuite/gcc.target/i386/bitfield3.c |   1 -
 libgcc/config.host|   3 -
 libgcc/config/i386/t-interix  |   3 -
 24 files changed, 15 insertions(+), 465 deletions(-)
 delete mode 100644 gcc/config/i386/i386-interix.h
 delete mode 100644 gcc/config/i386/interix.opt
 delete mode 100644 gcc/config/i386/t-interix
 delete mode 100644 libgcc/config/i386/t-interix

diff --git a/config/elf.m4 b/config/elf.m4
index 1772a44..5f5cd88 100644
--- a/config/elf.m4
+++ b/config/elf.m4
@@ -17,7 +17,7 @@ target_elf=no
 case $target in
   *-darwin* | *-aix* | *-cygwin* | *-mingw* | *-aout* | *-*coff* | \
   *-msdosdjgpp* | *-vms* | *-wince* | *-*-pe* | \
-  alpha*-dec-osf* | *-interix* | hppa[[12]]*-*-hpux* | \
+  alpha*-dec-osf* | hppa[[12]]*-*-hpux* | \
   nvptx-*-none)
 target_elf=no
 ;;
diff --git a/config/picflag.m4 b/config/picflag.m4
index e0fa343..614421d 100644
--- a/config/picflag.m4
+++ b/config/picflag.m4
@@ -27,10 +27,6 @@ case "${$2}" in
;;
 i[[34567]]86-*-mingw* | x86_64-*-mingw*)
;;
-i[[34567]]86-*-interix[[3-9]]*)
-   # Interix 3.x gcc -fpic/-fPIC options generate broken code.
-   # Instead, we relocate shared libraries at runtime.
-   ;;
 i[[34567]]86-*-nto-qnx*)
# QNX uses GNU C++, but need to define -shared option too, otherwise
# it will coredump.
diff --git a/contrib/config-list.mk b/contrib/config-list.mk
index be41d3c..832403a 100644
--- a/contrib/config-list.mk
+++ b/contrib/config-list.mk
@@ -99,8 +99,7 @@ LIST = aarch64-elf aarch64-linux-gnu aarch64-rtems \
   x86_64-knetbsd-gnuOPT-enable-obsolete x86_64-w64-mingw32 \
   x86_64-mingw32OPT-enable-sjlj-exceptions=yes x86_64-rtems \
   xstormy16-elf xtensa-elf \
-  xtensa-linux \
-  i686-interix3OPT-enable-obsolete
+  xtensa-linux
 
 LOGFILES = $(patsubst %,log/%-make.out,$(LIST))
 all: $(LOGFILES)
diff --git a/fixincludes/mkfixinc.sh b/fixincludes/mkfixinc.sh
index 0d96c8c..0f96486 100755
--- a/fixincludes/mkfixinc.sh
+++ b/fixincludes/mkfixinc.sh
@@ -14,7 +14,6 @@ case $machine in
 i?86-*-cygwin* | \
 i?86-*-mingw32* | \
 x86_64-*-mingw32* | \

[PATCH 0/7] remove targets obsoleted in gcc 6

2016-06-20 Thread tbsaunde+gcc
From: Trevor Saunders 

Hi,

This is later than I had hoped, but here is the series to remove the targets
obsoleted in GCC 6.

I built and regtested the series as one patch on x86_64-linux-gnu without
regressions.  OK?

Trev


Trevor Saunders (7):
  remove support for the interix target
  remove support for targeting openbsd 2 or 3
  remove knetbsd support
  remove h8300-rtems support
  remove m32r-rtems support
  remove avr-rtems support
  remove mep-* support

 config/elf.m4  | 2 +-
 config/picflag.m4  | 4 -
 configure  | 4 +-
 configure.ac   | 2 -
 contrib/config-list.mk |20 +-
 contrib/header-tools/README| 2 +-
 contrib/header-tools/reduce-headers| 1 -
 fixincludes/mkfixinc.sh| 1 -
 gcc/common/config/mep/mep-common.c |89 -
 gcc/config.gcc |74 +-
 gcc/config/avr/gen-avr-mmcu-specs.c|12 +-
 gcc/config/avr/rtems.h |27 -
 gcc/config/avr/t-rtems | 3 -
 gcc/config/h8300/rtems.h   |29 -
 gcc/config/h8300/t-rtems   | 7 -
 gcc/config/i386/i386-interix.h |   345 -
 gcc/config/i386/interix.opt|34 -
 gcc/config/i386/knetbsd-gnu.h  |21 -
 gcc/config/i386/knetbsd-gnu64.h|26 -
 gcc/config/i386/t-interix  |30 -
 gcc/config/knetbsd-gnu.h   |35 -
 gcc/config/m32r/rtems.h|33 -
 gcc/config/mep/constraints.md  |   162 -
 gcc/config/mep/default.h   |10 -
 gcc/config/mep/intrinsics.h|   620 -
 gcc/config/mep/intrinsics.md   | 21568 ---
 gcc/config/mep/ivc2-template.h | 9 -
 gcc/config/mep/mep-c5.cpu  |   277 -
 gcc/config/mep/mep-core.cpu|  3080 ---
 gcc/config/mep/mep-default.cpu |25 -
 gcc/config/mep/mep-ext-cop.cpu |23 -
 gcc/config/mep/mep-intrin.h|  8933 
 gcc/config/mep/mep-ivc2.cpu|  9775 -
 gcc/config/mep/mep-pragma.c|   398 -
 gcc/config/mep/mep-protos.h|   128 -
 gcc/config/mep/mep.c   |  7263 ---
 gcc/config/mep/mep.cpu |21 -
 gcc/config/mep/mep.h   |   790 -
 gcc/config/mep/mep.md  |  2254 --
 gcc/config/mep/mep.opt |   164 -
 gcc/config/mep/predicates.md   |   184 -
 gcc/config/mep/t-mep   |68 -
 gcc/config/openbsd-oldgas.h|26 -
 gcc/configure  |12 +-
 gcc/configure.ac   | 4 +-
 gcc/doc/install.texi   |24 -
 gcc/doc/md.texi|   101 -
 gcc/testsuite/gcc.dg/attr-ms_struct-1.c| 2 +-
 gcc/testsuite/gcc.dg/attr-ms_struct-2.c| 2 +-
 gcc/testsuite/gcc.dg/attr-ms_struct-packed1.c  | 2 +-
 gcc/testsuite/gcc.dg/bf-ms-attrib.c| 2 +-
 gcc/testsuite/gcc.dg/bf-ms-layout-2.c  | 2 +-
 gcc/testsuite/gcc.dg/bf-ms-layout-3.c  | 2 +-
 gcc/testsuite/gcc.dg/bf-ms-layout.c| 2 +-
 gcc/testsuite/gcc.dg/bf-no-ms-layout.c | 2 +-
 gcc/testsuite/gcc.dg/tree-ssa/forwprop-28.c| 2 +-
 gcc/testsuite/gcc.dg/tree-ssa/reassoc-32.c | 2 +-
 gcc/testsuite/gcc.dg/tree-ssa/reassoc-33.c | 2 +-
 gcc/testsuite/gcc.dg/tree-ssa/reassoc-34.c | 2 +-
 gcc/testsuite/gcc.dg/tree-ssa/reassoc-35.c | 2 +-
 gcc/testsuite/gcc.dg/tree-ssa/reassoc-36.c | 2 +-
 .../gcc.dg/tree-ssa/ssa-ifcombine-ccmp-1.c | 2 +-
 .../gcc.dg/tree-ssa/ssa-ifcombine-ccmp-2.c | 2 +-
 .../gcc.dg/tree-ssa/ssa-ifcombine-ccmp-3.c | 2 +-
 .../gcc.dg/tree-ssa/ssa-ifcombine-ccmp-4.c | 2 +-
 .../gcc.dg/tree-ssa/ssa-ifcombine-ccmp-5.c | 2 +-
 .../gcc.dg/tree-ssa/ssa-ifcombine-ccmp-6.c | 2 +-
 gcc/testsuite/gcc.dg/tree-ssa/ssa-thread-11.c  | 2 +-
 gcc/testsuite/gcc.dg/tree-ssa/vrp87.c  | 2 +-
 gcc/testsuite/gcc.target/i386/bitfield1.c  | 2 +-
 gcc/testsuite/gcc.target/i386/bitfield2.c  | 2 +-
 
