date:20140319

[PATCH] Fix PR54733 Optimize endian independent load/store

2014-03-19 Thread Thomas Preud'homme

Hi everybody,

*** Motivation ***

Currently gcc is capable of replacing hand-crafted implementations of byteswap 
with a suitable instruction thanks to the bswap optimization pass. The patch 
proposed here aims at extending this pass to also optimize loads in a specific 
endianness, independent of the host endianness.

*** Methodology ***

The patch adds support for dealing with a memory source (array or structure) 
and detects whether the result of a bitwise operation happens to be equivalent 
to a big endian or little endian load, replacing it with a load, or a load and a 
byteswap, according to the host endianness. The original code used the concept 
of a symbolic number: a number where the value of each byte indicates its 
position (in terms of weight) before the bitwise manipulation. After performing 
the bit manipulation on that symbolic number, the result tells how the bytes 
were shuffled (see variable cmp in function find_bswap). Detecting an operation 
resulting in a number in the host endianness is thus pretty straightforward: 
check whether the symbolic number has *not* changed.
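
To make the symbolic-number idea concrete, here is a minimal sketch in plain C. It is only an illustration, not the code in tree-ssa-math-opts.c: the pass propagates the symbolic number through the statements it analyses, whereas this sketch simply runs a function on a marker value. The names SYMBOLIC_IDENTITY, SYMBOLIC_BSWAP, classify and the two manual_* helpers are made up for the example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Symbolic 32-bit number: byte i (counting from the least significant
   byte) holds the marker i + 1, so 0x04030201 means "nothing moved".  */
#define SYMBOLIC_IDENTITY UINT32_C(0x04030201)
#define SYMBOLIC_BSWAP    UINT32_C(0x01020304)

enum shuffle_kind { SHUFFLE_NOP, SHUFFLE_BSWAP, SHUFFLE_OTHER };

/* A hand-written byte swap built from masks, shifts and ORs.  */
static uint32_t manual_bswap32(uint32_t x)
{
    return ((x & UINT32_C(0x000000ff)) << 24)
         | ((x & UINT32_C(0x0000ff00)) << 8)
         | ((x & UINT32_C(0x00ff0000)) >> 8)
         | ((x & UINT32_C(0xff000000)) >> 24);
}

/* Bitwise manipulation that leaves every byte where it was.  */
static uint32_t manual_nop32(uint32_t x)
{
    return (x & UINT32_C(0x0000ffff)) | (x & UINT32_C(0xffff0000));
}

/* Apply OP to the symbolic number and see where the markers ended up.  */
static enum shuffle_kind classify(uint32_t (*op)(uint32_t))
{
    assert(op != NULL);
    uint32_t result = op(SYMBOLIC_IDENTITY);
    if (result == SYMBOLIC_IDENTITY)
        return SHUFFLE_NOP;     /* host endianness: symbolic number unchanged */
    if (result == SYMBOLIC_BSWAP)
        return SHUFFLE_BSWAP;   /* every byte moved to the mirrored position */
    return SHUFFLE_OTHER;
}
```

Here classify (manual_bswap32) reports a byteswap and classify (manual_nop32) reports a no-op, which is the distinction the extended pass relies on.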

As for supporting reads from arrays and structures, there is some logic to 
recognize the base of the array/structure and the offset of the entries/fields 
accessed, to check if the range of memory accessed would fit in an integer. Each 
entry is initially treated independently, and when they are ORed together the 
values in the symbolic number are updated according to the host endianness: the 
entry at the higher address would see its values incremented on a little endian 
machine.
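
As an illustration of the kind of source code this is meant to catch, the sketch below shows two endian-independent loads from a byte array. The helper names load_le32 and load_be32 are hypothetical and are not taken from the patch or its testcases. On a little endian host the first is equivalent to a plain 32-bit load and the second to a load followed by a byteswap; on a big endian host the roles are reversed.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Read four bytes as a little endian 32-bit value, whatever the host is.  */
static uint32_t load_le32(const unsigned char *p)
{
    assert(p != NULL);
    return (uint32_t)p[0]
         | ((uint32_t)p[1] << 8)
         | ((uint32_t)p[2] << 16)
         | ((uint32_t)p[3] << 24);
}

/* Read four bytes as a big endian 32-bit value, whatever the host is.  */
static uint32_t load_be32(const unsigned char *p)
{
    assert(p != NULL);
    return ((uint32_t)p[0] << 24)
         | ((uint32_t)p[1] << 16)
         | ((uint32_t)p[2] << 8)
         | (uint32_t)p[3];
}
```

For the bytes 0x12 0x34 0x56 0x78, load_le32 yields 0x78563412 and load_be32 yields 0x12345678 on any host.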

Note that as it stands the patch does not work for arrays indexed with a variable 
(such as tab[a] | (tab[a+1] << 8)) because fold_const does not fold (a + 1) - 
a. If such cases were folded, the number of cases detected would automatically 
increase, since fold_build2 is used to compare two offsets.

This patch also adds a few testcases to check both (i) that the optimization 
works as expected and (ii) that the results are correct. It also defines new 
effective targets (bswap16, bswap32 and bswap64) to centralize the information 
about which targets support byte swap instructions for the testsuite, and modifies 
existing tests to use these new effective targets.

The patch is quite big but could be split if necessary. A big part of the code 
added is for handling memory sources and would be difficult to split, but the 
variable renaming and the introduction of the bswapXX effective targets could be 
done separately to reduce the noise. The patch is too big to include inline, so 
it is only attached to this email.

The ChangeLog entries are as follows:

*** gcc/ChangeLog ***

2014-03-19  Thomas Preud'homme  thomas.preudho...@arm.com

PR tree-optimization/54733
* tree-ssa-math-opts.c (find_bswap_1): Renamed to ...
(find_bswap_or_nop_1): This. Also add support for memory source.
(find_bswap): Renamed to ...
(find_bswap_or_nop): This. Also add support for memory source and
detection of noop bitwise operations.
(execute_optimize_bswap): Likewise.

*** gcc/testsuite/ChangeLog ***

2014-03-19  Thomas Preud'homme  thomas.preudho...@arm.com

PR tree-optimization/54733
* lib/target-supports.exp: New effective targets for architectures
capable of performing byte swap.
* gcc.dg/optimize-bswapdi-1.c: Convert to new bswap target.
* gcc.dg/optimize-bswapdi-2.c: Likewise.
* gcc.dg/optimize-bswapsi-1.c: Likewise.
* gcc.dg/optimize-bswapdi-3.c: New test to check extension of bswap
optimization to support memory sources.
* gcc.dg/optimize-bswaphi-1.c: Likewise.
* gcc.dg/optimize-bswapsi-2.c: Likewise.
* gcc.c-torture/execute/bswap-2.c: Likewise.

Is this ok for stage 1?

Best regards,

Thomas

gcc32rm-84.3.diff
Description: Binary data

[committed] Fix lto build if WCONTINUED is not defined (PR lto/60571)

2014-03-19 Thread Jakub Jelinek

Hi!

WCONTINUED is (recent) Linux specific, so it doesn't have to be defined
on other hosts, or could be missing even on older Linux distros (e.g. glibc
2.3.2 doesn't have it).

Fixed thusly, committed as obvious.
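
For readers who want to try the fallback outside of GCC, here is a small stand-alone sketch of the same idea. It is not the code in lto.c: run_child_and_wait is a hypothetical helper, it waits for one specific child rather than for any child in the process group, and it assumes a POSIX host with fork and waitpid.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L
#endif
#include <assert.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hosts without WCONTINUED get a harmless zero flag instead.  */
#ifndef WCONTINUED
#define WCONTINUED 0
#endif

/* Fork a child that exits with EXIT_CODE and wait until it has really
   terminated.  Returns the exit status, or -1 on error or signal.  */
static int run_child_and_wait(int exit_code)
{
    assert(exit_code >= 0 && exit_code <= 255);

    pid_t pid = fork();
    if (pid < 0)
        return -1;
    if (pid == 0)
        _exit(exit_code);       /* child: terminate immediately */

    int status;
    do {
        /* Stop/continue notifications are reported too, so keep looping
           until the child has exited or was killed.  */
        if (waitpid(pid, &status, WUNTRACED | WCONTINUED) == -1)
            return -1;
    } while (!WIFEXITED(status) && !WIFSIGNALED(status));

    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

Defining the missing flag to 0 keeps the call site unchanged: ORing in 0 simply requests no extra notifications on hosts that lack the feature.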

2014-03-19  Jakub Jelinek  ja...@redhat.com

PR lto/60571
* lto.c (wait_for_child): Define WCONTINUED if not defined to 0.
Fix formatting.

--- gcc/lto/lto.c.jj	2014-03-03 08:24:32.0 +0100
+++ gcc/lto/lto.c	2014-03-19 08:12:39.235144361 +0100
@@ -2476,7 +2476,10 @@ wait_for_child ()
   int status;
   do
     {
-      int w = waitpid(0, &status, WUNTRACED | WCONTINUED);
+#ifndef WCONTINUED
+#define WCONTINUED 0
+#endif
+      int w = waitpid (0, &status, WUNTRACED | WCONTINUED);
       if (w == -1)
         fatal_error ("waitpid failed");
 
@@ -2485,7 +2488,7 @@ wait_for_child ()
       else if (WIFSIGNALED (status))
         fatal_error ("streaming subprocess was killed by signal");
     }
-  while (!WIFEXITED(status) && !WIFSIGNALED(status));
+  while (!WIFEXITED (status) && !WIFSIGNALED (status));
 }
 #endif
 

Jakub

[PATCH, ARM] Fix ICE due to out of bound.

2014-03-19 Thread Zhenqiang Chen

Hi,

There is an ICE in arm_dwarf_register_span when compiling
gcc.target/arm/neon-modes-3.c with -g, since parts[8] is out of bounds for
XImode. GET_MODE_SIZE (XImode) / 4 is 16, so rtx parts[8] cannot hold all the
registers.

According to arm-modes.def, 16 should be the biggest number. So the
patch updates parts to

rtx parts[16];
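
The arithmetic behind the new bound can be checked with a small stand-alone sketch. The 64-byte size for XImode is inferred from the statement above that GET_MODE_SIZE (XImode) / 4 is 16; the names PART_SIZE, XI_MODE_SIZE, parts_needed and fits are hypothetical and do not appear in arm.c.

```c
#include <assert.h>
#include <stddef.h>

enum {
    PART_SIZE = 4,      /* each part covers one 4-byte register piece */
    XI_MODE_SIZE = 64,  /* assumed: 16 parts * 4 bytes, per the mail above */
    OLD_PARTS = 8,      /* array length before the patch */
    NEW_PARTS = 16      /* array length after the patch */
};

/* Number of 4-byte parts needed to describe a value of MODE_SIZE bytes.  */
static size_t parts_needed(size_t mode_size)
{
    assert(mode_size > 0);
    return (mode_size + PART_SIZE - 1) / PART_SIZE;
}

/* Nonzero if an array of CAPACITY parts can hold a MODE_SIZE-byte value.  */
static int fits(size_t mode_size, size_t capacity)
{
    return parts_needed(mode_size) <= capacity;
}
```

With these numbers fits (XI_MODE_SIZE, OLD_PARTS) is false and fits (XI_MODE_SIZE, NEW_PARTS) is true.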

Bootstrapped with no make check regressions on an ARM Chromebook.

OK for trunk?

Thanks!
-Zhenqiang

ChangeLog:
2014-03-19  Zhenqiang Chen  zhenqiang.c...@linaro.org

* config/arm/arm.c (arm_dwarf_register_span): Update the element number
of parts.

testsuite/ChangeLog:
2014-03-19  Zhenqiang Chen  zhenqiang.c...@linaro.org

* gcc.target/arm/neon-modes-3.c: Add -g option.

diff --git a/gcc/config/arm/arm.c b/gcc/config/arm/arm.c
index a68ed8d..c4466c1 100644
--- a/gcc/config/arm/arm.c
+++ b/gcc/config/arm/arm.c
@@ -28692,7 +28692,7 @@ arm_dwarf_register_span (rtx rtl)
 {
   enum machine_mode mode;
   unsigned regno;
-  rtx parts[8];
+  rtx parts[16];
   int nregs;
   int i;

diff --git a/gcc/testsuite/gcc.target/arm/neon-modes-3.c
b/gcc/testsuite/gcc.target/arm/neon-modes-3.c
index fe81875..f3e4f33 100644
--- a/gcc/testsuite/gcc.target/arm/neon-modes-3.c
+++ b/gcc/testsuite/gcc.target/arm/neon-modes-3.c
@@ -1,6 +1,6 @@
 /* { dg-do compile } */
 /* { dg-require-effective-target arm_neon_ok } */
-/* { dg-options "-O" } */
+/* { dg-options "-O -g" } */
 /* { dg-add-options arm_neon } */

 #include arm_neon.h

[PATCH] Fix PR59543

2014-03-19 Thread Richard Biener


This fixes PR59543 (confirmed by Jakub for the testcase at least)
by not dropping debug stmts during WPA phase.

LTO profiled-bootstrapped on x86_64-unknown-linux-gnu, applied.

Honza - you can always come up with a better fix for 4.10.

Richard.

2014-03-19  Richard Biener  rguent...@suse.de

PR lto/59543
* lto-streamer-in.c (input_function): In WPA stage do not drop
debug stmts.

Index: lto-streamer-in.c
===
--- lto-streamer-in.c   (revision 208642)
+++ lto-streamer-in.c   (working copy)
@@ -988,7 +988,7 @@ input_function (tree fn_decl, struct dat
 We can't remove them earlier because this would cause uid
 mismatches in fixups, but we can do it at this point, as
 long as debug stmts don't require fixups.  */
-         if (!MAY_HAVE_DEBUG_STMTS && is_gimple_debug (stmt))
+         if (!MAY_HAVE_DEBUG_STMTS && !flag_wpa && is_gimple_debug (stmt))
            {
              gimple_stmt_iterator gsi = bsi;
              gsi_next (&bsi);

Re: [PATCH, ARM] Fix ICE due to out of bound.

2014-03-19 Thread Richard Biener

On Wed, 19 Mar 2014, Ramana Radhakrishnan wrote:

> On 03/19/14 08:42, Zhenqiang Chen wrote:
> > Hi,
> >
> > ICE when compiling gcc.target/arm/neon-modes-3.c with -g in
> > arm_dwarf_register_span since parts[8] is out of bound for XImode.
> > GET_MODE_SIZE (XImode) / 4 is 16. rtx parts[8] can not hold all the
> > registers.
> >
> > According to arm-modes.def, 16 should be the biggest number. So the
> > patch updates parts to
> >
> > rtx parts[16];
> >
> > Bootstrap and no make check regression on ARM Chrome book.
> >
> > OK for trunk?
> >
>
> It may be time in 4.10 or 5.0 (whatever we call it :)), to deal with the FIXME
> in arm_dwarf_register_span to deal with DW_OP_piece. I'm surprised that it's
> taken so long to hit this.
>
> This is OK for stage4 - it looks sane to me but this needs an RM ack before
> applying.

Ok (it can't possibly break anything).

Richard.

> regards
> Ramana
>
> > Thanks!
> > -Zhenqiang
> >
> > ChangeLog:
> > 2014-03-19  Zhenqiang Chen  zhenqiang.c...@linaro.org
> >
> >     * config/arm/arm.c (arm_dwarf_register_span): Update the element number
> >     of parts.
> >
> > testsuite/ChangeLog:
> > 2014-03-19  Zhenqiang Chen  zhenqiang.c...@linaro.org
> >
> >     * gcc.target/arm/neon-modes-3.c: Add -g option.
> >
> > diff --git a/gcc/config/arm/arm.c b/gcc/config/arm/arm.c
> > index a68ed8d..c4466c1 100644
> > --- a/gcc/config/arm/arm.c
> > +++ b/gcc/config/arm/arm.c
> > @@ -28692,7 +28692,7 @@ arm_dwarf_register_span (rtx rtl)
> >  {
> >    enum machine_mode mode;
> >    unsigned regno;
> > -  rtx parts[8];
> > +  rtx parts[16];
> >    int nregs;
> >    int i;
> >
> > diff --git a/gcc/testsuite/gcc.target/arm/neon-modes-3.c
> > b/gcc/testsuite/gcc.target/arm/neon-modes-3.c
> > index fe81875..f3e4f33 100644
> > --- a/gcc/testsuite/gcc.target/arm/neon-modes-3.c
> > +++ b/gcc/testsuite/gcc.target/arm/neon-modes-3.c
> > @@ -1,6 +1,6 @@
> >  /* { dg-do compile } */
> >  /* { dg-require-effective-target arm_neon_ok } */
> > -/* { dg-options "-O" } */
> > +/* { dg-options "-O -g" } */
> >  /* { dg-add-options arm_neon } */
> >
> >  #include arm_neon.h
> >
>

-- 
Richard Biener rguent...@suse.de
SUSE / SUSE Labs
SUSE LINUX Products GmbH - Nuernberg - AG Nuernberg - HRB 16746
GF: Jeff Hawn, Jennifer Guild, Felix Imendorffer

[ARM/AArch64][0/3] Handle bitwise/bytewise reverse operations more effectively

2014-03-19 Thread Kyrill Tkachov


Hi all,

This patch series attempts to improve code generation on arm and aarch64 for 
various bitwise operations that can be expressed with rev16 instructions in 
those architectures. In particular expressions of the form:

((x & 0x00ff00ff) << 8) | ((x & 0xff00ff00) >> 8)

This can appear in places like the Linux kernel and can be directly mapped to a 
single rev16 instruction.
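
Written out as compilable C, the expression swaps the two bytes inside each 16-bit halfword. The sketch below is only an illustration; the function names rev16_u32 and rev16_u64 are made up here, and the 64-bit variant assumes the obvious widening of the masks.

```c
#include <assert.h>
#include <stdint.h>

/* Swap the bytes within each 16-bit halfword of a 32-bit value.  */
static uint32_t rev16_u32(uint32_t x)
{
    return ((x & UINT32_C(0x00ff00ff)) << 8)
         | ((x & UINT32_C(0xff00ff00)) >> 8);
}

/* Same operation on a 64-bit value, with the masks widened accordingly.  */
static uint64_t rev16_u64(uint64_t x)
{
    return ((x & UINT64_C(0x00ff00ff00ff00ff)) << 8)
         | ((x & UINT64_C(0xff00ff00ff00ff00)) >> 8);
}
```

For example, rev16_u32 (0x12345678) gives 0x34127856: the halfword 0x1234 becomes 0x3412 and 0x5678 becomes 0x7856.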

This series has 3 parts:

[1/3] Add a new field to the rtx costs tables to represent the latency of the 
rev* group of instructions that will be used to accurately model the cost of 
these operations. Use it to properly cost existing patterns that generate rev16 
(for bswap operations).


[2/3] Add aarch64 combine patterns to recognise the above bitwise operations and 
map them to rev16. Model the cost appropriately and add helper functions that 
can be reused by the arm backend.


[3/3] Define similar combine patterns for arm and reuse the helper functions 
introduced in patch 2/3 to properly cost them.


I'm proposing these for next stage-1 of course.

Thanks,
Kyrill

[PATCH][ARM][1/3] Add rev field to rtx cost tables

2014-03-19 Thread Kyrill Tkachov


Hi all,

In order to properly cost the rev16 instruction we need a new field in the cost 
tables.

This patch adds that and specifies its value for the existing cost tables.
Since rev16 is used to implement the BSWAP operation we add handling of that in 
the rtx cost function using the new field.
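
As a rough sketch of how such a field is consumed, the fragment below mimics the costing rule on a cut-down table. It is not the code from arm.c: struct alu_costs_sketch and bswap_cost are hypothetical, and COSTS_N_INSNS is redefined locally on the assumption that it matches GCC's definition of four cost units per instruction.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Assumed to match GCC's COSTS_N_INSNS: one instruction is 4 cost units.  */
#define COSTS_N_INSNS(n) ((n) * 4)

/* Cut-down stand-in for the ALU part of a per-CPU extra cost table.  */
struct alu_costs_sketch {
    int clz;    /* Count Leading Zeros.  */
    int rev;    /* Reverse bits/bytes (the new field).  */
};

/* Cost of a byte swap done with a single rev-class instruction: one
   instruction, plus the CPU-specific extra when optimizing for speed.  */
static int bswap_cost(const struct alu_costs_sketch *extra, bool speed_p)
{
    assert(extra != NULL);
    int cost = COSTS_N_INSNS(1);
    if (speed_p)
        cost += extra->rev;
    return cost;
}
```

With a rev entry of 0 the operation costs one instruction; with an entry of COSTS_N_INSNS (1) it costs two when optimizing for speed and still one when optimizing for size.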


Tested on arm-none-eabi and bootstrapped on an arm linux target.

Does it look ok for stage1?

Thanks,
Kyrill

2014-03-19  Kyrylo Tkachov  kyrylo.tkac...@arm.com

* config/arm/aarch-common-protos.h (alu_cost_table): Add rev field.
* config/arm/aarch-cost-tables.h (generic_extra_costs): Specify
rev cost.
(cortex_a53_extra_costs): Likewise.
(cortex_a57_extra_costs): Likewise.
* config/arm/arm.c (cortexa9_extra_costs): Likewise.
(cortexa7_extra_costs): Likewise.
(cortexa12_extra_costs): Likewise.
(cortexa15_extra_costs): Likewise.
(v7m_extra_costs): Likewise.
(arm_new_rtx_costs): Handle BSWAP.

commit 13b2976a9448565beabc41055fdcbd209cde949f
Author: Kyrylo Tkachov kyrylo.tkac...@arm.com
Date:   Wed Feb 26 15:55:13 2014 +0000

Add rev field to rtx costs.

diff --git a/gcc/config/arm/aarch-common-protos.h b/gcc/config/arm/aarch-common-protos.h
index 2b33626..4ff18cd 100644
--- a/gcc/config/arm/aarch-common-protos.h
+++ b/gcc/config/arm/aarch-common-protos.h
@@ -54,6 +54,7 @@ struct alu_cost_table
   const int bfi;		/* Bit-field insert.  */
   const int bfx;		/* Bit-field extraction.  */
   const int clz;		/* Count Leading Zeros.  */
+  const int rev;		/* Reverse bits/bytes.  */
   const int non_exec;		/* Extra cost when not executing insn.  */
   const bool non_exec_costs_exec; /* True if non-execution must add the exec
  cost.  */
diff --git a/gcc/config/arm/aarch-cost-tables.h b/gcc/config/arm/aarch-cost-tables.h
index c30ea2f..adf8708 100644
--- a/gcc/config/arm/aarch-cost-tables.h
+++ b/gcc/config/arm/aarch-cost-tables.h
@@ -39,6 +39,7 @@ const struct cpu_cost_table generic_extra_costs =
 0,			/* bfi.  */
 0,			/* bfx.  */
 0,			/* clz.  */
+0,			/* rev.  */
 COSTS_N_INSNS (1),	/* non_exec.  */
 false		/* non_exec_costs_exec.  */
   },
@@ -139,6 +140,7 @@ const struct cpu_cost_table cortexa53_extra_costs =
 COSTS_N_INSNS (1),	/* bfi.  */
 COSTS_N_INSNS (1),	/* bfx.  */
 0,			/* clz.  */
+0,			/* rev.  */
 0,			/* non_exec.  */
 true		/* non_exec_costs_exec.  */
   },
@@ -239,6 +241,7 @@ const struct cpu_cost_table cortexa57_extra_costs =
 COSTS_N_INSNS (1), /* bfi.  */
 0, /* bfx.  */
 0, /* clz.  */
+0,			/* rev.  */
 0, /* non_exec.  */
 true   /* non_exec_costs_exec.  */
   },
diff --git a/gcc/config/arm/arm.c b/gcc/config/arm/arm.c
index e69911c..a72ee1e 100644
--- a/gcc/config/arm/arm.c
+++ b/gcc/config/arm/arm.c
@@ -982,6 +982,7 @@ const struct cpu_cost_table cortexa9_extra_costs =
 COSTS_N_INSNS (1),	/* bfi.  */
 COSTS_N_INSNS (1),	/* bfx.  */
 0,			/* clz.  */
+0,			/* rev.  */
 0,			/* non_exec.  */
 true		/* non_exec_costs_exec.  */
   },
@@ -1083,6 +1084,7 @@ const struct cpu_cost_table cortexa7_extra_costs =
 COSTS_N_INSNS (1),	/* bfi.  */
 COSTS_N_INSNS (1),	/* bfx.  */
 COSTS_N_INSNS (1),	/* clz.  */
+COSTS_N_INSNS (1),	/* rev.  */
 0,			/* non_exec.  */
 true		/* non_exec_costs_exec.  */
   },
@@ -1184,6 +1186,7 @@ const struct cpu_cost_table cortexa12_extra_costs =
 0,			/* bfi.  */
 COSTS_N_INSNS (1),	/* bfx.  */
 COSTS_N_INSNS (1),	/* clz.  */
+COSTS_N_INSNS (1),	/* rev.  */
 0,			/* non_exec.  */
 true		/* non_exec_costs_exec.  */
   },
@@ -1284,6 +1287,7 @@ const struct cpu_cost_table cortexa15_extra_costs =
 COSTS_N_INSNS (1),	/* bfi.  */
 0,			/* bfx.  */
 0,			/* clz.  */
+0,			/* rev.  */
 0,			/* non_exec.  */
 true		/* non_exec_costs_exec.  */
   },
@@ -1384,6 +1388,7 @@ const struct cpu_cost_table v7m_extra_costs =
 0,			/* bfi.  */
 0,			/* bfx.  */
 0,			/* clz.  */
+0,			/* rev.  */
 COSTS_N_INSNS (1),	/* non_exec.  */
 false		/* non_exec_costs_exec.  */
   },
@@ -9334,6 +9339,47 @@ arm_new_rtx_costs (rtx x, enum rtx_code code, enum rtx_code outer_code,
   *cost = LIBCALL_COST (2);
   return false;
 
+case BSWAP:
+  if (arm_arch6)
+{
+  if (mode == SImode)
+{
+  *cost = COSTS_N_INSNS (1);
+  if (speed_p)
+*cost += extra_cost->alu.rev;
+
+  return false;
+}
+}
+  else
+{
+/* No rev instruction available.  Look at arm_legacy_rev
+   and thumb_legacy_rev for the form of RTL used then.  */
+  if (TARGET_THUMB)
+{
+  *cost = COSTS_N_INSNS (10);
+
+  if (speed_p)
+{
+  *cost += 6 * extra_cost->alu.shift;
+  *cost += 3 * extra_cost->alu.logical;
+

[PATCH][AArch64][2/3] Recognise rev16 operations on SImode and DImode data

2014-03-19 Thread Kyrill Tkachov


Hi all,

This patch adds a recogniser for the bitmask, shift, orr sequence of instructions 
that can be used to reverse the bytes in 16-bit halfwords (for the sequence 
itself look at the testcase included in the patch). This can be implemented with 
a rev16 instruction.
Since the shifts can occur in any order and there are no canonicalisation rules 
for where they appear in the expression we have to have two patterns to match 
both cases.


The rtx costs function is updated to recognise the pattern and cost it 
appropriately by using the rev field of the cost tables introduced in patch 
[1/3]. The rtx costs helper functions that are used to recognise those bitwise 
operations are placed in config/arm/aarch-common.c so that they can be reused by 
both arm and aarch64.
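
The helper bodies are cut off in this archive, so the following is only a guess at their intent, expressed on plain integers rather than on RTL. The names is_shleft_mask32, is_shright_mask32, rev16_candidate32 and shift_then_mask32 are hypothetical. Note that in the patterns the shift is applied first and the mask second, so the left-shift mask is 0xff00ff00 and the right-shift mask is 0x00ff00ff.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Mask expected on the (x << 8) arm of the IOR.  */
static bool is_shleft_mask32(uint32_t mask)
{
    return mask == UINT32_C(0xff00ff00);
}

/* Mask expected on the (x >> 8) arm of the IOR.  */
static bool is_shright_mask32(uint32_t mask)
{
    return mask == UINT32_C(0x00ff00ff);
}

/* Both masks must be the expected ones for the expression to be a rev16.  */
static bool rev16_candidate32(uint32_t left_mask, uint32_t right_mask)
{
    return is_shleft_mask32(left_mask) && is_shright_mask32(right_mask);
}

/* Evaluate the shift-then-mask form that the patterns describe.  */
static uint32_t shift_then_mask32(uint32_t x, uint32_t left_mask,
                                  uint32_t right_mask)
{
    assert(rev16_candidate32(left_mask, right_mask));
    return ((x << 8) & left_mask) | ((x >> 8) & right_mask);
}
```

Since IOR is commutative, either arm may come first in the expression, which is why two patterns are needed while the mask checks stay the same.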


I've added an execute testcase but no scan-assembler tests since conceptually in 
the future the combiner might decide to not use a rev instruction due to rtx 
costs. We can at least test that the code generated is functionally correct though.


Tested aarch64-none-elf.

Ok for stage1?

[gcc/]
2014-03-19  Kyrylo Tkachov  kyrylo.tkac...@arm.com

* config/aarch64/aarch64.md (rev16<mode>2): New pattern.
(rev16<mode>2_alt): Likewise.
* config/aarch64/aarch64.c (aarch64_rtx_costs): Handle rev16 case.
* config/arm/aarch-common.c (aarch_rev16_shright_mask_imm_p): New.
(aarch_rev16_shleft_mask_imm_p): Likewise.
(aarch_rev16_p_1): Likewise.
(aarch_rev16_p): Likewise.
* config/arm/aarch-common-protos.h (aarch_rev16_p): Declare extern.
(aarch_rev16_shright_mask_imm_p): Likewise.
(aarch_rev16_shleft_mask_imm_p): Likewise.

[gcc/testsuite/]
2014-03-19  Kyrylo Tkachov  kyrylo.tkac...@arm.com

* gcc.target/aarch64/rev16_1.c: New test.

diff --git a/gcc/config/aarch64/aarch64.c b/gcc/config/aarch64/aarch64.c
index ebd58c0..41761ae 100644
--- a/gcc/config/aarch64/aarch64.c
+++ b/gcc/config/aarch64/aarch64.c
@@ -4682,6 +4682,16 @@ aarch64_rtx_costs (rtx x, int code, int outer ATTRIBUTE_UNUSED,
   return false;
 
 case IOR:
+  if (aarch_rev16_p (x))
+{
+  *cost = COSTS_N_INSNS (1);
+
+  if (speed)
+*cost += extra_cost->alu.rev;
+
+  return true;
+}
+/* Fall through.  */
 case XOR:
 case AND:
 cost_logic:
diff --git a/gcc/config/aarch64/aarch64.md b/gcc/config/aarch64/aarch64.md
index 99a6ac8..a23452b 100644
--- a/gcc/config/aarch64/aarch64.md
+++ b/gcc/config/aarch64/aarch64.md
@@ -3173,6 +3173,38 @@
   [(set_attr "type" "rev")]
 )
 
+;; There are no canonicalisation rules for the position of the lshiftrt, ashift
+;; operations within an IOR/AND RTX, therefore we have two patterns matching
+;; each valid permutation.
+
+(define_insn "rev16<mode>2"
+  [(set (match_operand:GPI 0 "register_operand" "=r")
+        (ior:GPI (and:GPI (ashift:GPI (match_operand:GPI 1 "register_operand" "r")
+                                      (const_int 8))
+                          (match_operand:GPI 3 "const_int_operand" "n"))
+                 (and:GPI (lshiftrt:GPI (match_dup 1)
+                                        (const_int 8))
+                          (match_operand:GPI 2 "const_int_operand" "n"))))]
+  "aarch_rev16_shleft_mask_imm_p (operands[3], <MODE>mode)
+   && aarch_rev16_shright_mask_imm_p (operands[2], <MODE>mode)"
+  "rev16\\t%<w>0, %<w>1"
+  [(set_attr "type" "rev")]
+)
+
+(define_insn "rev16<mode>2_alt"
+  [(set (match_operand:GPI 0 "register_operand" "=r")
+        (ior:GPI (and:GPI (lshiftrt:GPI (match_operand:GPI 1 "register_operand" "r")
+                                        (const_int 8))
+                          (match_operand:GPI 2 "const_int_operand" "n"))
+                 (and:GPI (ashift:GPI (match_dup 1)
+                                      (const_int 8))
+                          (match_operand:GPI 3 "const_int_operand" "n"))))]
+  "aarch_rev16_shleft_mask_imm_p (operands[3], <MODE>mode)
+   && aarch_rev16_shright_mask_imm_p (operands[2], <MODE>mode)"
+  "rev16\\t%<w>0, %<w>1"
+  [(set_attr "type" "rev")]
+)
+
 ;; zero_extend version of above
 (define_insn "*bswapsi2_uxtw"
   [(set (match_operand:DI 0 "register_operand" "=r")
diff --git a/gcc/config/arm/aarch-common-protos.h b/gcc/config/arm/aarch-common-protos.h
index d97ee61..08c4c7a 100644
--- a/gcc/config/arm/aarch-common-protos.h
+++ b/gcc/config/arm/aarch-common-protos.h
@@ -23,6 +23,9 @@
 #ifndef GCC_AARCH_COMMON_PROTOS_H
 #define GCC_AARCH_COMMON_PROTOS_H
 
+extern bool aarch_rev16_p (rtx);
+extern bool aarch_rev16_shleft_mask_imm_p (rtx, enum machine_mode);
+extern bool aarch_rev16_shright_mask_imm_p (rtx, enum machine_mode);
 extern int arm_early_load_addr_dep (rtx, rtx);
 extern int arm_early_store_addr_dep (rtx, rtx);
 extern int arm_mac_accumulator_is_mul_result (rtx, rtx);
diff --git a/gcc/config/arm/aarch-common.c b/gcc/config/arm/aarch-common.c
index c11f7e9..75ed3fd 100644
--- a/gcc/config/arm/aarch-common.c
+++ b/gcc/config/arm/aarch-common.c
@@ -155,6 +155,79 @@ arm_get_set_operands (rtx producer, rtx consumer,
   return 0;
 }
 
+bool

[PATCH][ARM][3/3] Recognise bitwise operations leading to SImode rev16

2014-03-19 Thread Kyrill Tkachov


Hi all,

This is the arm equivalent of patch [2/3] in the series that adds combine 
patterns for the bitwise operations leading to a rev16 instruction.
It reuses the functions that were put in aarch-common.c to properly cost these 
operations.


I tried matching a DImode rev16 (with the intent of splitting it into two rev16 
ops) like aarch64 but combine wouldn't try to match that bitwise pattern in 
DImode like aarch64 does. Instead it tries various exotic combinations with subregs.


Tested arm-none-eabi, bootstrap on arm-none-linux-gnueabihf.

Ok for stage1?

[gcc/]
2014-03-19  Kyrylo Tkachov  kyrylo.tkac...@arm.com

* config/arm/arm.md (arm_rev16si2): New pattern.
(arm_rev16si2_alt): Likewise.
* config/arm/arm.c (arm_new_rtx_costs): Handle rev16 case.


[gcc/testsuite/]
2014-03-19  Kyrylo Tkachov  kyrylo.tkac...@arm.com

* gcc.target/arm/rev16.c: New test.

commit 04e60723bd1fa2f8e2adcfeed676390643ffec0c
Author: Kyrylo Tkachov kyrylo.tkac...@arm.com
Date:   Tue Feb 25 15:26:52 2014 +0000

[ARM] Implement SImode rev16

diff --git a/gcc/config/arm/arm.c b/gcc/config/arm/arm.c
index 8d1d721..ed603f0 100644
--- a/gcc/config/arm/arm.c
+++ b/gcc/config/arm/arm.c
@@ -9716,8 +9716,17 @@ arm_new_rtx_costs (rtx x, enum rtx_code code, enum rtx_code outer_code,
   /* Vector mode?  */
   *cost = LIBCALL_COST (2);
   return false;
+case IOR:
+      if (mode == SImode && arm_arch6 && aarch_rev16_p (x))
+        {
+          *cost = COSTS_N_INSNS (1);
+          if (speed_p)
+            *cost += extra_cost->alu.rev;
 
-case AND: case XOR: case IOR:
+  return true;
+}
+/* Fall through.  */
+case AND: case XOR:
   if (mode == SImode)
 	{
 	  enum rtx_code subcode = GET_CODE (XEXP (x, 0));
diff --git a/gcc/config/arm/arm.md b/gcc/config/arm/arm.md
index 4df24a2..47bc747 100644
--- a/gcc/config/arm/arm.md
+++ b/gcc/config/arm/arm.md
@@ -12668,6 +12668,44 @@
    (set_attr "type" "rev")]
 )
 
+;; There are no canonicalisation rules for the position of the lshiftrt, ashift
+;; operations within an IOR/AND RTX, therefore we have two patterns matching
+;; each valid permutation.
+
+(define_insn "arm_rev16si2"
+  [(set (match_operand:SI 0 "register_operand" "=l,l,r")
+        (ior:SI (and:SI (ashift:SI (match_operand:SI 1 "register_operand" "l,l,r")
+                                   (const_int 8))
+                        (match_operand:SI 3 "const_int_operand" "n,n,n"))
+                (and:SI (lshiftrt:SI (match_dup 1)
+                                     (const_int 8))
+                        (match_operand:SI 2 "const_int_operand" "n,n,n"))))]
+  "arm_arch6
+   && aarch_rev16_shleft_mask_imm_p (operands[3], SImode)
+   && aarch_rev16_shright_mask_imm_p (operands[2], SImode)"
+  "rev16\\t%0, %1"
+  [(set_attr "arch" "t1,t2,32")
+   (set_attr "length" "2,2,4")
+   (set_attr "type" "rev")]
+)
+
+(define_insn "arm_rev16si2_alt"
+  [(set (match_operand:SI 0 "register_operand" "=l,l,r")
+        (ior:SI (and:SI (lshiftrt:SI (match_operand:SI 1 "register_operand" "l,l,r")
+                                     (const_int 8))
+                        (match_operand:SI 2 "const_int_operand" "n,n,n"))
+                (and:SI (ashift:SI (match_dup 1)
+                                   (const_int 8))
+                        (match_operand:SI 3 "const_int_operand" "n,n,n"))))]
+  "arm_arch6
+   && aarch_rev16_shleft_mask_imm_p (operands[3], SImode)
+   && aarch_rev16_shright_mask_imm_p (operands[2], SImode)"
+  "rev16\\t%0, %1"
+  [(set_attr "arch" "t1,t2,32")
+   (set_attr "length" "2,2,4")
+   (set_attr "type" "rev")]
+)
+
 (define_expand "bswaphi2"
   [(set (match_operand:HI 0 "s_register_operand" "=r")
         (bswap:HI (match_operand:HI 1 "s_register_operand" "r")))]
diff --git a/gcc/testsuite/gcc.target/arm/rev16.c b/gcc/testsuite/gcc.target/arm/rev16.c
new file mode 100644
index 0000000..1c869b3
--- /dev/null
+++ b/gcc/testsuite/gcc.target/arm/rev16.c
@@ -0,0 +1,35 @@
+/* { dg-options "-O2" } */
+/* { dg-do run } */
+
+extern void abort (void);
+
+typedef unsigned int __u32;
+
+__u32
+__rev16_32_alt (__u32 x)
+{
+  return (((__u32)(x) & (__u32)0xff00ff00UL) >> 8)
+         | (((__u32)(x) & (__u32)0x00ff00ffUL) << 8);
+}
+
+__u32
+__rev16_32 (__u32 x)
+{
+  return (((__u32)(x) & (__u32)0x00ff00ffUL) << 8)
+         | (((__u32)(x) & (__u32)0xff00ff00UL) >> 8);
+}
+
+int
+main (void)
+{
+  volatile __u32 in32 = 0x12345678;
+  volatile __u32 expected32 = 0x34127856;
+
+  if (__rev16_32 (in32) != expected32)
+abort ();
+
+  if (__rev16_32_alt (in32) != expected32)
+abort ();
+
+  return 0;
+}

[PATCH][AArch64] Add handling of bswap operations in rtx costs

2014-03-19 Thread Kyrill Tkachov


Hi all,

This patch depends on the series started at 
http://gcc.gnu.org/ml/gcc-patches/2014-03/msg00933.html but is not really a part 
of it. It just adds costing of the bswap operation using the new rev field in 
the rtx cost tables since we have patterns in aarch64.md that handle bswap by 
generating rev16 instructions.


Tested aarch64-none-elf.

Ok for stage1 after that series goes in?

2014-03-19  Kyrylo Tkachov  kyrylo.tkac...@arm.com

* config/aarch64/aarch64.c (aarch64_rtx_costs): Handle BSWAP.

commit b9771a71dbf62522d423e16ce03353624c1ccd5a
Author: Kyrylo Tkachov kyrylo.tkac...@arm.com
Date:   Thu Feb 27 11:55:27 2014 +0000

[AArch64] Cost bswap operations properly

diff --git a/gcc/config/aarch64/aarch64.c b/gcc/config/aarch64/aarch64.c
index 901ad3d..28c8841 100644
--- a/gcc/config/aarch64/aarch64.c
+++ b/gcc/config/aarch64/aarch64.c
@@ -4678,6 +4678,14 @@ aarch64_rtx_costs (rtx x, int code, int outer ATTRIBUTE_UNUSED,
 
   return false;
 
+case BSWAP:
+  *cost = COSTS_N_INSNS (1);
+
+  if (speed)
+*cost += extra_cost->alu.rev;
+
+  return false;
+
 case IOR:
 case XOR:
 case AND:

stray warning from gcc's cpp

2014-03-19 Thread Andriy Gapon


I observe the following minor annoyance on FreeBSD systems where cpp is GCC's
cpp.  If a DTrace script has the following shebang line:
#!/usr/sbin/dtrace -Cs
then the following warning is produced when the script is run:
cc1: warning:  is shorter than expected

Some details.  dtrace(1) first forks.  Then the child seeks on the file
descriptor associated with the script file, so that the shebang line is skipped
(because otherwise it would confuse cpp).  Then the child makes the file
descriptor its standard input and execs cpp.  cpp performs fstat(2) on its
standard input descriptor and determines that it points to a regular file.  Then
it verifies that the number of bytes it reads from the file is the same as the
size of the file.  The check makes sense if the file is opened by cpp itself, but
it does not always make sense for stdin, as described above.
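
The mismatch is easy to reproduce outside of cpp. The sketch below is a stand-alone illustration, not libcpp code: shortfall_after_skip is a hypothetical helper that writes some text to a temporary file, seeks past the first SKIP bytes the way dtrace skips the shebang line, reads to the end and compares the byte count with st_size from fstat.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L
#endif
#include <assert.h>
#include <stdio.h>
#include <string.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/* Returns st_size minus the number of bytes actually read after the
   descriptor was advanced by SKIP bytes, or -1 on error.  */
static long shortfall_after_skip(const char *contents, long skip)
{
    assert(contents != NULL && skip >= 0);

    FILE *fp = tmpfile();
    if (fp == NULL)
        return -1;

    size_t len = strlen(contents);
    if (fwrite(contents, 1, len, fp) != len || fflush(fp) != 0) {
        fclose(fp);
        return -1;
    }

    int fd = fileno(fp);
    struct stat st;
    if (fstat(fd, &st) != 0 || lseek(fd, (off_t)skip, SEEK_SET) == (off_t)-1) {
        fclose(fp);
        return -1;
    }

    /* Read from the current offset to end of file, as a reader of an
       inherited descriptor would.  */
    long total = 0;
    char buf[256];
    ssize_t n;
    while ((n = read(fd, buf, sizeof buf)) > 0)
        total += (long)n;

    fclose(fp);
    if (n < 0)
        return -1;
    return (long)st.st_size - total;
}
```

With a 23-byte shebang line skipped, the reader sees 23 bytes fewer than fstat reports, which is exactly the condition that triggers the warning.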

The following patch seems to fix the issue, but perhaps there is a better /
smarter alternative.

--- a/libcpp/files.c
+++ b/libcpp/files.c
@@ -601,7 +601,8 @@ read_file_guts (cpp_reader *pfile, _cpp_file *file)
   return false;
 }

-  if (regular && total != size && STAT_SIZE_RELIABLE (file->st))
+  if (regular && total != size && file->fd != 0
+      && STAT_SIZE_RELIABLE (file->st))
     cpp_error (pfile, CPP_DL_WARNING,
               "%s is shorter than expected", file->path);


-- 
Andriy Gapon

Re: [AArch64] 64-bit float vreinterpret implemention

2014-03-19 Thread Marcus Shawcroft

On 28 February 2014 10:30, Alex Velenko alex.vele...@arm.com wrote:

> Hi Richard,
> Thank you for your suggestion. Attached is a patch that includes
> implementation of your proposition. A testsuite was run on LE and BE
> compilers with no regressions.
>
> Here is the description of the patch:
>
> This patch introduces vreinterpret implementation for vectors with 64-bit
> float lanes and adds testcase for those intrinsics.

The aarch64_init_simd_builtins() infrastructure requires the presence
of named RTL patterns in order to construct the types of the SIMD
intrinsics even when an intrinsic is emitted as tree. This seems
rather ugly to me.  At some point we should figure out how to clean up
this aspect of aarch64_init_simd_builtins() and remove the otherwise
unused .md patterns.  This aside I think your patch is fine as it
stands and can be committed in stage-1.

Cheers
/Marcus

Re: [PATCH, ARM] Fix ICE due to out of bound.

2014-03-19 Thread Jakub Jelinek

On Wed, Mar 19, 2014 at 09:46:56AM +0000, Ramana Radhakrishnan wrote:
> On 03/19/14 08:42, Zhenqiang Chen wrote:
> > ICE when compiling gcc.target/arm/neon-modes-3.c with -g in
> > arm_dwarf_register_span since parts[8] is out of bound for XImode.
> > GET_MODE_SIZE (XImode) / 4 is 16. rtx parts[8] can not hold all the
> > registers.
> >
> > According to arm-modes.def, 16 should be the biggest number. So the
> > patch updates parts to
> >
> > rtx parts[16];
> >
> > Bootstrap and no make check regression on ARM Chrome book.
> >
> > OK for trunk?
> >
>
> It may be time in 4.10 or 5.0 (whatever we call it :)), to deal with
> the FIXME in arm_dwarf_register_span to deal with DW_OP_piece. I'm
> surprised that it's taken so long to hit this.
>
> This is OK for stage4 - it looks sane to me but this needs an RM ack
> before applying.

Ok.

Jakub

[PATCH] Avoid ggc_collect () after WPA forking

2014-03-19 Thread Richard Biener


This patch avoids calling ggc_collect after we have possibly forked
during the WPA phase, as that necessarily causes a lot of page
unsharing.  I have verified that during an LTO bootstrap we
do not allocate GC memory during (or after) lto_wpa_write_files,
thus the effect on memory use should be positive (the patch
below contains checking code making sure that we don't alloc).

LTO bootstrapped on x86_64-unknown-linux-gnu, will apply shortly
(without the checking code of course).

That should fix the WPA memory explosion Martin sees with building
Chromium.

Richard.

2014-03-19  Richard Biener  rguent...@suse.de

* lto.c (lto_wpa_write_files): Move call to
lto_promote_cross_file_statics ...
(do_whole_program_analysis): ... here, into the partitioning
block.  Do not ggc_collect after lto_wpa_write_files but
for a last time before it.

Index: gcc/ggc-page.c
===
--- gcc/ggc-page.c  (revision 208642)
+++ gcc/ggc-page.c  (working copy)
@@ -1199,6 +1199,8 @@ ggc_round_alloc_size (size_t requested_s
   return size;
 }
 
+int may_alloc = 1;
+
 /* Allocate a chunk of memory of SIZE bytes.  Its contents are undefined.  */
 
 void *
@@ -1208,6 +1210,9 @@ ggc_internal_alloc_stat (size_t size MEM
   struct page_entry *entry;
   void *result;
 
+  if (!may_alloc)
+    fatal_error ("allocating GC memory");
+
   ggc_round_alloc_size_1 (size, &order, &object_size);
 
   /* If there are non-full pages for this size allocation, they are at
Index: gcc/lto/lto.c
===
--- gcc/lto/lto.c   (revision 208642)
+++ gcc/lto/lto.c   (working copy)
@@ -2565,11 +2566,6 @@ lto_wpa_write_files (void)
   FOR_EACH_VEC_ELT (ltrans_partitions, i, part)
     lto_stats.num_output_symtab_nodes += lto_symtab_encoder_size (part->encoder);
 
-  /* Find out statics that need to be promoted
- to globals with hidden visibility because they are accessed from multiple
- partitions.  */
-  lto_promote_cross_file_statics ();
-
   timevar_pop (TV_WHOPR_WPA);
 
   timevar_push (TV_WHOPR_WPA_IO);
@@ -3281,11 +3277,25 @@ do_whole_program_analysis (void)
         node->aux = NULL;
 
   lto_stats.num_cgraph_partitions += ltrans_partitions.length ();
+
+  /* Find out statics that need to be promoted
+ to globals with hidden visibility because they are accessed from multiple
+ partitions.  */
+  lto_promote_cross_file_statics ();
   timevar_pop (TV_WHOPR_PARTITIONING);
 
   timevar_stop (TV_PHASE_OPT_GEN);
-  timevar_start (TV_PHASE_STREAM_OUT);
 
+  /* Collect a last time - in lto_wpa_write_files we may end up forking
+ with the idea that this doesn't increase memory usage.  So we
+ absoultely do not want to collect after that.  */
+  ggc_collect ();
+{
+  extern int may_alloc;
+  may_alloc = 0;
+}
+
+  timevar_start (TV_PHASE_STREAM_OUT);
   if (!quiet_flag)
 {
       fprintf (stderr, "\nStreaming out");
@@ -3294,10 +3304,8 @@ do_whole_program_analysis (void)
   lto_wpa_write_files ();
   if (!quiet_flag)
     fprintf (stderr, "\n");
-
   timevar_stop (TV_PHASE_STREAM_OUT);
 
-  ggc_collect ();
   if (post_ipa_mem_report)
 {
       fprintf (stderr, "Memory consumption after IPA\n");

[PATCH] Fix ubsan ICE (PR sanitizer/60569)

2014-03-19 Thread Marek Polacek

Apparently with LTO we can get a TYPE_NAME without a DECL_NAME,
so check that it exists before accessing it.
Note that the test has to be run; only compiling wasn't enough
to provoke the ICE.
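
The shape of the fix can be shown with a small stand-alone sketch. It does not use GCC's tree API: struct decl_sketch, struct type_sketch and type_display_name are hypothetical stand-ins for TYPE_NAME, DECL_NAME and the lookup in ubsan_type_descriptor, and the "<unknown>" fallback is an assumption made for the example.

```c
#include <assert.h>
#include <stddef.h>

/* Stand-in for a TYPE_DECL whose DECL_NAME may be missing.  */
struct decl_sketch {
    const char *name;
};

/* Stand-in for a type whose name is either a bare identifier or a decl.  */
struct type_sketch {
    const char *identifier;             /* set when named by identifier */
    const struct decl_sketch *decl;     /* set when named via a decl */
};

/* Pick a printable name, never dereferencing a missing one.  */
static const char *type_display_name(const struct type_sketch *type)
{
    assert(type != NULL);

    const char *tname = NULL;
    if (type->identifier != NULL)
        tname = type->identifier;
    else if (type->decl != NULL && type->decl->name != NULL)
        tname = type->decl->name;       /* the guarded branch */

    return tname != NULL ? tname : "<unknown>";
}
```

An anonymous struct such as the one in the testcase corresponds to a decl whose name is NULL, so the guarded branch is skipped and the fallback is used.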

Ran ubsan testsuite on x86_64-linux, ok for trunk?

2014-03-19  Marek Polacek  pola...@redhat.com

PR sanitizer/60569
* ubsan.c (ubsan_type_descriptor): Check that DECL_NAME is nonnull
before accessing it.
testsuite/
* g++.dg/ubsan/pr60569.C: New test.

diff --git gcc/testsuite/g++.dg/ubsan/pr60569.C 
gcc/testsuite/g++.dg/ubsan/pr60569.C
index e69de29..df6b7a4 100644
--- gcc/testsuite/g++.dg/ubsan/pr60569.C
+++ gcc/testsuite/g++.dg/ubsan/pr60569.C
@@ -0,0 +1,21 @@
+// PR sanitizer/60569
+// { dg-do run }
+// { dg-require-effective-target lto }
+// { dg-options "-fsanitize=undefined -flto" }
+
+struct A
+{
+  void foo ();
+  struct
+  {
+int i;
+void bar () { i = 0; }
+  } s;
+};
+
+void A::foo () { s.bar (); }
+
+int
+main ()
+{
+}
diff --git gcc/ubsan.c gcc/ubsan.c
index 7c7a893..22470da 100644
--- gcc/ubsan.c
+++ gcc/ubsan.c
@@ -318,7 +318,7 @@ ubsan_type_descriptor (tree type, bool want_pointer_type_p)
 {
   if (TREE_CODE (TYPE_NAME (type2)) == IDENTIFIER_NODE)
tname = IDENTIFIER_POINTER (TYPE_NAME (type2));
-  else
+  else if (DECL_NAME (TYPE_NAME (type2)) != NULL)
tname = IDENTIFIER_POINTER (DECL_NAME (TYPE_NAME (type2)));
 }
 

Marek

Re: [PATCH] Fix ubsan ICE (PR sanitizer/60569)

2014-03-19 Thread Jakub Jelinek

On Wed, Mar 19, 2014 at 12:13:57PM +0100, Marek Polacek wrote:
 Apparently with LTO we can get a TYPE_NAME without a DECL_NAME,
 so check that it exists before accessing it.
 Note that the test has to be run; only compiling wasn't enough
 to provoke the ICE.

??  Shouldn't // { dg-do link } be sufficient?

 --- gcc/ubsan.c
 +++ gcc/ubsan.c
 @@ -318,7 +318,7 @@ ubsan_type_descriptor (tree type, bool 
 want_pointer_type_p)
  {
if (TREE_CODE (TYPE_NAME (type2)) == IDENTIFIER_NODE)
   tname = IDENTIFIER_POINTER (TYPE_NAME (type2));
 -  else
 +  else if (DECL_NAME (TYPE_NAME (type2)) != NULL)
   tname = IDENTIFIER_POINTER (DECL_NAME (TYPE_NAME (type2)));
  }

This looks good to me.

Jakub

Re: [PATCH] Avoid ggc_collect () after WPA forking

2014-03-19 Thread Steven Bosscher

On Wed, Mar 19, 2014 at 12:10 PM, Richard Biener wrote:
 Index: gcc/ggc-page.c
 ===
 --- gcc/ggc-page.c  (revision 208642)
 +++ gcc/ggc-page.c  (working copy)
 @@ -1199,6 +1199,8 @@ ggc_round_alloc_size (size_t requested_s
return size;
  }

 +int may_alloc = 1;

bool may_alloc?

Ciao!
Steven

Re: [testsuite] Fix gcc.dg/tls/pr58595.c on Solaris 9

2014-03-19 Thread Rainer Orth

Jakub Jelinek ja...@redhat.com writes:

 On Tue, Mar 18, 2014 at 11:19:52AM +0100, Rainer Orth wrote:
 The new gcc.dg/tls/pr58595.c testcase FAILs on Solaris 9:
 
 FAIL: gcc.dg/tls/pr58595.c (test for excess errors)
 Excess errors:
 Undefined   first referenced
  symbol in file
 ___tls_get_addr /var/tmp//ccuBbAna.o
 ld: fatal: Symbol referencing errors. No output written to ./pr58595.exe
 WARNING: gcc.dg/tls/pr58595.c compilation failed to produce executable
 
 Fixed as follows, tested with the appropriate runtest invocation on
 i386-pc-solaris2.9, i386-pc-solaris2.11, and x86_64-unknown-linux-gnu,
 installed on mainline.

 Can you please also change
 /* { dg-require-effective-target tls } */
 to
 /* { dg-require-effective-target tls_runtime } */
 ?

Sure, done as follows after retesting as before:

2014-03-19  Rainer Orth  r...@cebitec.uni-bielefeld.de

* gcc.dg/tls/pr58595.c: Require tls_runtime instead of tls.

changeset:   13384:d1c2de35507e
tag: tip
user:Rainer Orth r...@cebitec.uni-bielefeld.de
date:Wed Mar 19 13:04:36 2014 +0100
summary: Require tls_runtime in gcc.dg/tls/pr58595.c

diff --git a/gcc/testsuite/gcc.dg/tls/pr58595.c b/gcc/testsuite/gcc.dg/tls/pr58595.c
--- a/gcc/testsuite/gcc.dg/tls/pr58595.c
+++ b/gcc/testsuite/gcc.dg/tls/pr58595.c
@@ -3,7 +3,7 @@
 /* { dg-options "-O2" } */
 /* { dg-additional-options "-fpic" { target fpic } } */
 /* { dg-add-options tls } */
-/* { dg-require-effective-target tls } */
+/* { dg-require-effective-target tls_runtime } */
 /* { dg-require-effective-target sync_int_long } */
 
 struct S { unsigned long a, b; };


 BTW, don't know if dg-add-options tls can come before that or not.

It can: the tls_runtime check takes care of adding the options itself.

Rainer

-- 
-
Rainer Orth, Center for Biotechnology, Bielefeld University

Re: [PATCH] Fix ubsan ICE (PR sanitizer/60569)

2014-03-19 Thread Marek Polacek

On Wed, Mar 19, 2014 at 12:17:19PM +0100, Jakub Jelinek wrote:
 On Wed, Mar 19, 2014 at 12:13:57PM +0100, Marek Polacek wrote:
  Apparently with LTO we can get a TYPE_NAME without a DECL_NAME,
  so check that it exists before accessing it.
  Note that the test has to be run; only compiling wasn't enough
  to provoke the ICE.
 
 ??  Shouldn't // { dg-do link } be sufficient?

Ah, forgot about that, it is sufficient.  Ok with dg-do link instead
of dg-do run?
 
  --- gcc/ubsan.c
  +++ gcc/ubsan.c
  @@ -318,7 +318,7 @@ ubsan_type_descriptor (tree type, bool 
  want_pointer_type_p)
   {
 if (TREE_CODE (TYPE_NAME (type2)) == IDENTIFIER_NODE)
  tname = IDENTIFIER_POINTER (TYPE_NAME (type2));
  -  else
  +  else if (DECL_NAME (TYPE_NAME (type2)) != NULL)
  tname = IDENTIFIER_POINTER (DECL_NAME (TYPE_NAME (type2)));
   }
 
 This looks good to me.

Thanks.

Marek

Re: [PATCH] Avoid ggc_collect () after WPA forking

2014-03-19 Thread Richard Biener

On Wed, 19 Mar 2014, Steven Bosscher wrote:

 On Wed, Mar 19, 2014 at 12:10 PM, Richard Biener wrote:
  Index: gcc/ggc-page.c
  ===
  --- gcc/ggc-page.c  (revision 208642)
  +++ gcc/ggc-page.c  (working copy)
  @@ -1199,6 +1199,8 @@ ggc_round_alloc_size (size_t requested_s
 return size;
   }
 
  +int may_alloc = 1;
 
 bool may_alloc?

It's only checking code I didn't commit.  We may of course alloc
but I wanted to prove we don't.

Richard.

[PATCH] Reduce GC walk recursion depth for types

2014-03-19 Thread Richard Biener


This reduces GC walk recursion depth in two ways.

First by re-ordering tree_type_common members to move 'name' last
and 'canonical' before 'next_variant'.  That makes us
first recurse downward (type, pointer_to/reference_to), then
on the same level (canonical, next_variant, main_variant)
and finally upward (context, name->decl_context).
For TS_TYPE_NON_COMMON we still walk down afterwards via values,
on the same level via minval/maxval and upwards via binfo, so
claiming that the patch helps is maybe too much handwaving?  (But it
does help a reduced testcase even without the second part.)

Second by choosing something other than TREE_CHAIN as chain_next for
types (for types TREE_CHAIN is TYPE_STUB_DECL, no chain at all).
That makes the unreduced testcase work and, apart from the issue
below, should be obvious enough (though there usually shouldn't
be so many type variants; still, if for every type we save
two or three recursions, that helps).
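
To illustrate why a chain_next pointer saves stack, here is a small sketch of a marker that loops along one designated pointer and recurses only through the others. The `struct node` layout and the function names are hypothetical; this is not the code gengtype generates, only the idea behind it.

```c
#include <stddef.h>

/* Hypothetical node: 'next' plays the role of the chain_next pointer,
   'child' stands for every other pointer member.  */
struct node
{
  int marked;
  struct node *child;
  struct node *next;
};

static unsigned max_depth;

/* Mark everything reachable from N.  The chain is walked with a loop,
   so a long list of siblings costs no extra stack; only 'child'
   pointers cause a recursive call.  */
static void
mark (struct node *n, unsigned depth)
{
  for (; n != NULL && !n->marked; n = n->next)
    {
      n->marked = 1;
      if (depth > max_depth)
        max_depth = depth;
      mark (n->child, depth + 1);
    }
}

/* Mark from ROOT and report the deepest recursion level reached.  */
unsigned
mark_and_measure (struct node *root)
{
  max_depth = 0;
  mark (root, 1);
  return max_depth;
}
```

With this scheme a list of N nodes linked only through `next` is marked at recursion depth 1, whereas following `next` recursively would need depth N.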

Martin verified this fixes PR60553.

I've changed chain_next only for the LTO frontend as

  while (ggc_test_and_set_mark (xlimit))
   xlimit = (CODE_CONTAINS_STRUCT (TREE_CODE (&(*xlimit).generic), 
TS_TYPE_COMMON) ? ((union lang_tree_node *) 
(*xlimit).generic.type_common.next_variant) : CODE_CONTAINS_STRUCT 
(TREE_CODE (&(*xlimit).generic), TS_COMMON) ? ((union lang_tree_node *) 
(*xlimit).generic.common.chain) : NULL);

likely doesn't create great code ... (note duplicate tree checks
with checking here for other frontends, fixed LTO with the patch
below).

LTO bootstrap running on x86_64-unknown-linux-gnu.

Ok for trunk?

Thanks,
Richard.

2014-03-19  Richard Biener  rguent...@suse.de

PR middle-end/60553
* tree-core.h (tree_type_common): Re-order pointer members
to reduce recursion depth during GC walks.

lto/
* lto-tree.h (lang_tree_node): For types use TYPE_NEXT_VARIANT 
instead of TREE_CHAIN as chain_next.

Index: gcc/tree-core.h
===
--- gcc/tree-core.h (revision 208642)
+++ gcc/tree-core.h (working copy)
@@ -1265,11 +1265,11 @@ struct GTY(()) tree_type_common {
 const char * GTY ((tag ("TYPE_SYMTAB_IS_POINTER"))) pointer;
 struct die_struct * GTY ((tag ("TYPE_SYMTAB_IS_DIE"))) die;
   } GTY ((desc ("debug_hooks->tree_type_symtab_field"))) symtab;
-  tree name;
+  tree canonical;
   tree next_variant;
   tree main_variant;
   tree context;
-  tree canonical;
+  tree name;
 };
 
 struct GTY(()) tree_type_with_lang_specific {
Index: gcc/lto/lto-tree.h
===
--- gcc/lto/lto-tree.h  (revision 208642)
+++ gcc/lto/lto-tree.h  (working copy)
@@ -48,7 +48,7 @@ enum lto_tree_node_structure_enum {
 };
 
 union GTY((desc ("lto_tree_node_structure (&%h)"),
- chain_next ("CODE_CONTAINS_STRUCT (TREE_CODE (&%h.generic), 
TS_COMMON) ? ((union lang_tree_node *) TREE_CHAIN (&%h.generic)) : NULL")))
+ chain_next ("CODE_CONTAINS_STRUCT (TREE_CODE (&%h.generic), 
TS_TYPE_COMMON) ? ((union lang_tree_node *) 
%h.generic.type_common.next_variant) : CODE_CONTAINS_STRUCT (TREE_CODE 
(&%h.generic), TS_COMMON) ? ((union lang_tree_node *) %h.generic.common.chain) 
: NULL")))
 lang_tree_node
 {
   union tree_node GTY ((tag ("TS_LTO_GENERIC"),

Re: [PATCH] Reduce GC walk recursion depth for types

2014-03-19 Thread Jakub Jelinek

On Wed, Mar 19, 2014 at 02:02:10PM +0100, Richard Biener wrote:
 LTO bootstrap running on x86_64-unknown-linux-gnu.
 
 Ok for trunk?
 
 Thanks,
 Richard.
 
 2014-03-19  Richard Biener  rguent...@suse.de
 
   PR middle-end/60553
   * tree-core.h (tree_type_common): Re-order pointer members
   to reduce recursion depth during GC walks.
 
   lto/
   * lto-tree.h (lang_tree_node): For types use TYPE_NEXT_VARIANT 
   instead of TREE_CHAIN as chain_next.

LGTM.

Jakub

[Fortran][PATCH][gomp4]: Transform OpenACC loop directive

2014-03-19 Thread Ilmir Usmanov


Hi Tobias!

This patch implements the transformation of the OpenACC loop directive 
from the Fortran AST to GENERIC.


Successfully bootstrapped and tested with no new regressions on 
x86_64-unknown-linux-gnu.


OK for gomp4 branch?

--
Ilmir.
From de2dd5ba0c48500e8e9084bd46cbfac2f21352fe Mon Sep 17 00:00:00 2001
From: Ilmir Usmanov i.usma...@samsung.com
Date: Wed, 19 Mar 2014 15:12:36 +0400
Subject: [PATCH] Transform OpenACC loop directive from fortran AST to GENERIC

---
	* gcc/fortran/trans-openmp.c (gfc_trans_oacc_loop): New function.
	(gfc_trans_oacc_combined_directive): Call it.
	(gfc_trans_oacc_directive): Likewise.
	* gcc/tree-pretty-print (dump_omp_clause): Fix WORKER and VECTOR.
	* gcc/testsuite/gfortran.dg/goacc/loop-tree.f95: New test.

diff --git a/gcc/fortran/trans-openmp.c b/gcc/fortran/trans-openmp.c
index 29364f4..cb7c970 100644
--- a/gcc/fortran/trans-openmp.c
+++ b/gcc/fortran/trans-openmp.c
@@ -1571,11 +1571,181 @@ typedef struct dovar_init_d {
   tree init;
 } dovar_init;
 
+
+static tree
+gfc_trans_oacc_loop (gfc_code *code, stmtblock_t *pblock,
+		 gfc_omp_clauses *loop_clauses)
+{
+  gfc_se se;
+  tree dovar, stmt, from, to, step, type, init, cond, incr;
+  tree count = NULL_TREE, cycle_label, tmp, omp_clauses;
+  stmtblock_t block;
+  stmtblock_t body;
+  gfc_omp_clauses *clauses = code->ext.omp_clauses;
+  int i, collapse = clauses->collapse;
+  vec<dovar_init> inits = vNULL;
+  dovar_init *di;
+  unsigned ix;
+
+  if (collapse <= 0)
+collapse = 1;
+
+  code = code->block->next;
+  gcc_assert (code->op == EXEC_DO || code->op == EXEC_DO_CONCURRENT);
+
+  init = make_tree_vec (collapse);
+  cond = make_tree_vec (collapse);
+  incr = make_tree_vec (collapse);
+
+  if (pblock == NULL)
+{
+  gfc_start_block (&block);
+  pblock = &block;
+}
+
+  omp_clauses = gfc_trans_omp_clauses (pblock, loop_clauses, code->loc);
+
+  for (i = 0; i < collapse; i++)
+{
+  int simple = 0;
+
+  /* Evaluate all the expressions in the iterator.  */
+  gfc_init_se (&se, NULL);
+  gfc_conv_expr_lhs (&se, code->ext.iterator->var);
+  gfc_add_block_to_block (pblock, &se.pre);
+  dovar = se.expr;
+  type = TREE_TYPE (dovar);
+  gcc_assert (TREE_CODE (type) == INTEGER_TYPE);
+
+  gfc_init_se (&se, NULL);
+  gfc_conv_expr_val (&se, code->ext.iterator->start);
+  gfc_add_block_to_block (pblock, &se.pre);
+  from = gfc_evaluate_now (se.expr, pblock);
+
+  gfc_init_se (&se, NULL);
+  gfc_conv_expr_val (&se, code->ext.iterator->end);
+  gfc_add_block_to_block (pblock, &se.pre);
+  to = gfc_evaluate_now (se.expr, pblock);
+
+  gfc_init_se (&se, NULL);
+  gfc_conv_expr_val (&se, code->ext.iterator->step);
+  gfc_add_block_to_block (pblock, &se.pre);
+  step = gfc_evaluate_now (se.expr, pblock);
+
+  /* Special case simple loops.  */
+  if (TREE_CODE (dovar) == VAR_DECL)
+	{
+	  if (integer_onep (step))
+	simple = 1;
+	  else if (tree_int_cst_equal (step, integer_minus_one_node))
+	simple = -1;
+	}
+
+  /* Loop body.  */
+  if (simple)
+	{
+	  TREE_VEC_ELT (init, i) = build2_v (MODIFY_EXPR, dovar, from);
+	  /* The condition should not be folded.  */
+	  TREE_VEC_ELT (cond, i) = build2_loc (input_location, simple > 0
+	   ? LE_EXPR : GE_EXPR,
+	   boolean_type_node, dovar, to);
+	  TREE_VEC_ELT (incr, i) = fold_build2_loc (input_location, PLUS_EXPR,
+		type, dovar, step);
+	  TREE_VEC_ELT (incr, i) = fold_build2_loc (input_location,
+		MODIFY_EXPR,
+		type, dovar,
+		TREE_VEC_ELT (incr, i));
+	}
+  else
+	{
+	  /* STEP is not 1 or -1.  Use:
+	 for (count = 0; count < (to + step - from) / step; count++)
+	   {
+		 dovar = from + count * step;
+		 body;
+	   cycle_label:;
+	   }  */
+	  tmp = fold_build2_loc (input_location, MINUS_EXPR, type, step, from);
+	  tmp = fold_build2_loc (input_location, PLUS_EXPR, type, to, tmp);
+	  tmp = fold_build2_loc (input_location, TRUNC_DIV_EXPR, type, tmp,
+ step);
+	  tmp = gfc_evaluate_now (tmp, pblock);
+	  count = gfc_create_var (type, "count");
+	  TREE_VEC_ELT (init, i) = build2_v (MODIFY_EXPR, count,
+	 build_int_cst (type, 0));
+	  /* The condition should not be folded.  */
+	  TREE_VEC_ELT (cond, i) = build2_loc (input_location, LT_EXPR,
+	   boolean_type_node,
+	   count, tmp);
+	  TREE_VEC_ELT (incr, i) = fold_build2_loc (input_location, PLUS_EXPR,
+		type, count,
+		build_int_cst (type, 1));
+	  TREE_VEC_ELT (incr, i) = fold_build2_loc (input_location,
+		MODIFY_EXPR, type, count,
+		TREE_VEC_ELT (incr, i));
+
+	  /* Initialize DOVAR.  */
+	  tmp = fold_build2_loc (input_location, MULT_EXPR, type, count, step);
+	  tmp = fold_build2_loc (input_location, PLUS_EXPR, type, from, tmp);
+	  dovar_init e = {dovar, tmp};
+	  inits.safe_push (e);
+	}
+
+  if (i + 1 < collapse)
+	code = code->block->next;
+}
+
+  if (pblock != &block)
+{
+  pushlevel ();
+
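
The normalized loop form described in the patch comment (used when STEP is neither 1 nor -1) can be checked with a small standalone sketch. The function names `trip_count` and `iteration_value` are made up for this illustration; the arithmetic is the `(to + step - from) / step` formula from the comment, with C's truncating division standing in for TRUNC_DIV_EXPR.

```c
#include <assert.h>

/* Number of iterations of "do i = from, to, step" when rewritten as
   for (count = 0; count < (to + step - from) / step; count++).
   A non-positive quotient means the loop body never runs, so it is
   clamped to zero here.  */
long
trip_count (long from, long to, long step)
{
  assert (step != 0);
  long n = (to + step - from) / step;	/* Truncating division.  */
  return n > 0 ? n : 0;
}

/* Value of the loop variable in iteration COUNT (0-based):
   dovar = from + count * step.  */
long
iteration_value (long from, long step, long count)
{
  return from + count * step;
}
```

For example, from 1 to 10 with step 3 gives (10 + 3 - 1) / 3 = 4 iterations, visiting 1, 4, 7 and 10; from 10 down to 1 with step -3 also gives 4 iterations, visiting 10, 7, 4 and 1.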

Re: [C++ PATCH] [gomp4] Initial OpenACC support to C++ front-end

2014-03-19 Thread Ilmir Usmanov


Ping.

On 13.03.2014 21:05, Ilmir Usmanov wrote:

On 07.03.2014 15:37, Ilmir Usmanov wrote:

Hi Thomas!

I prepared simple patch to add support of OpenACC data, kernels and 
parallel constructs to C++ FE.


It adds support of data clauses too.

OK to gomp4 branch?


Fixed subject: changed file extensions of tests and fixed comments.

OK to gomp4 branch?


--
Ilmir.

Re: [patch] gcc fstack-protector-explicit

2014-03-19 Thread Marcos Díaz

Well, finally I have the assignment, could you please review this patch?

On Wed, Nov 20, 2013 at 4:13 PM, Jeff Law l...@redhat.com wrote:
 On 11/19/13 07:04, Marcos Díaz wrote:

 My employer is working on the signature of the papers. Could someone
 please do the review meanwhile?

 I'd prefer to wait until the assignment process is complete.  If something
 were to happen and we can't use your code the review time would have been
 wasted (and such things have certainly happened in the past).

 Once the assignment is recorded, please ping this patch.

 Jeff




-- 
__


Marcos Díaz

Software Engineer


San Lorenzo 47, 3rd Floor, Office 5

Córdoba, Argentina


Phone: +54 351 4217888 / +54 351 4218211/ +54 351 7617452

Skype: markdiaz22

Re: [patch] gcc fstack-protector-explicit

2014-03-19 Thread Jeff Law


On 03/19/14 08:06, Marcos Díaz wrote:

Well, finally I have the assignment, could you please review this patch?
Thanks.  I'll take a look once we open up stage1 development again 
(should be soon as 4.9 is getting close to being ready).


jeff

Re: [PATCH] Avoid ggc_collect () after WPA forking

2014-03-19 Thread Richard Biener

On Wed, 19 Mar 2014, Martin Liška wrote:

 There are stats for Firefox with LTO and -O2. According to graphs it
 looks that memory consumption for parallel WPA phase is similar.
 When I disable parallel WPA, wpa footprint is ~4GB, but ltrans memory
 footprint is similar to parallel WPA that reduces libxul.so linking by ~10%.

Ok, so I suppose this tracks RSS, not virtual memory use (what is
used and what is active)?

And it is WPA plus LTRANS stages, WPA ends where memory use first goes
down to zero?

I wonder if you can identify the point where parallel streaming
starts and where it ends ... ;)

Btw, I have another patch in my local tree, limiting the
exponential growth of blocks we allocate when outputting sections.
But it shouldn't be _that_ bad ... maybe you can try if it has
any effect?

Thanks,
Richard.

Index: gcc/lto-section-out.c
===
--- gcc/lto-section-out.c   (revision 208642)
+++ gcc/lto-section-out.c   (working copy)
@@ -99,13 +99,19 @@ lto_end_section (void)
 }
 
 
+/* We exponentially grow the size of the blocks as we need to make
+   room for more data to be written.  Start with a single page and go up
+   to 2MB pages for this.  */
+#define FIRST_BLOCK_SIZE 4096
+#define MAX_BLOCK_SIZE (2 * 1024 * 1024)
+
 /* Write all of the chars in OBS to the assembler.  Recycle the blocks
in obs as this is being done.  */
 
 void
 lto_write_stream (struct lto_output_stream *obs)
 {
-  unsigned int block_size = 1024;
+  unsigned int block_size = FIRST_BLOCK_SIZE;
   struct lto_char_ptr_base *block;
   struct lto_char_ptr_base *next_block;
   if (!obs->first_block)
@@ -135,6 +141,7 @@ lto_write_stream (struct lto_output_stre
   else
lang_hooks.lto.append_data (base, num_chars, block);
   block_size *= 2;
+  block_size = MIN (MAX_BLOCK_SIZE, block_size);
 }
 }
 
@@ -152,7 +159,7 @@ lto_append_block (struct lto_output_stre
 {
   /* This is the first time the stream has been written
 into.  */
-  obs->block_size = 1024;
+  obs->block_size = FIRST_BLOCK_SIZE;
   new_block = (struct lto_char_ptr_base*) xmalloc (obs->block_size);
   obs->first_block = new_block;
 }
@@ -162,6 +169,7 @@ lto_append_block (struct lto_output_stre
   /* Get a new block that is twice as big as the last block
 and link it into the list.  */
   obs->block_size *= 2;
+  obs->block_size = MIN (MAX_BLOCK_SIZE, obs->block_size);
   new_block = (struct lto_char_ptr_base*) xmalloc (obs->block_size);
   /* The first bytes of the block are reserved as a pointer to
 the next block.  Set the chain of the full block to the
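
The capped doubling that the patch above introduces can be sketched in isolation. The two constants are taken from the patch; `next_block_size` and `blocks_needed` are hypothetical helpers for illustration, and the sketch ignores the bytes each real block reserves for the pointer to the next block.

```c
#include <stddef.h>

#define FIRST_BLOCK_SIZE 4096
#define MAX_BLOCK_SIZE (2 * 1024 * 1024)

/* Size of the block allocated after one of size CURRENT: double it,
   but never exceed MAX_BLOCK_SIZE.  */
size_t
next_block_size (size_t current)
{
  size_t doubled = current * 2;
  return doubled < MAX_BLOCK_SIZE ? doubled : MAX_BLOCK_SIZE;
}

/* Number of blocks needed to hold TOTAL bytes when the first block is
   FIRST_BLOCK_SIZE and each later one follows next_block_size.  */
unsigned
blocks_needed (size_t total)
{
  size_t size = FIRST_BLOCK_SIZE;
  size_t held = 0;
  unsigned n = 0;

  while (held < total)
    {
      held += size;
      n++;
      size = next_block_size (size);
    }
  return n;
}
```

Without the cap, the last block of a large section can be nearly as big as everything written before it; with the cap, the unused tail of the last block is bounded by 2MB.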

Re: [PATCH] Avoid ggc_collect () after WPA forking

2014-03-19 Thread Martin Liška



On 03/19/2014 03:55 PM, Richard Biener wrote:

On Wed, 19 Mar 2014, Martin Liška wrote:


There are stats for Firefox with LTO and -O2. According to graphs it
looks that memory consumption for parallel WPA phase is similar.
When I disable parallel WPA, wpa footprint is ~4GB, but ltrans memory
footprint is similar to parallel WPA that reduces libxul.so linking by ~10%.

Ok, so I suppose this tracks RSS, not virtual memory use (what is
used and what is active)?


Data are given by vmstat, according to: 
http://stackoverflow.com/questions/18529723/what-is-active-memory-and-inactive-memory


*Active memory* is memory that is being used by a particular process.
*Inactive memory* is memory that was allocated to a process that is no 
longer running.


So please follow just the 'blue' line, which displays the memory that is 
really used. According to the man page, vmstat reports virtual memory statistics.



And it is WPA plus LTRANS stages, WPA ends where memory use first goes
down to zero?
I wonder if you can identify the point where parallel streaming
starts and where it ends ... ;)


Exactly, WPA ends when it goes to zero.


Btw, I have another patch in my local tree, limiting the
exponential growth of blocks we allocate when outputting sections.
But it shouldn't be _that_ bad ... maybe you can try if it has
any effect?


I can apply it.

Martin



Thanks,
Richard.

Index: gcc/lto-section-out.c
===
--- gcc/lto-section-out.c   (revision 208642)
+++ gcc/lto-section-out.c   (working copy)
@@ -99,13 +99,19 @@ lto_end_section (void)
  }
  
  
+/* We exponentially grow the size of the blocks as we need to make

+   room for more data to be written.  Start with a single page and go up
+   to 2MB pages for this.  */
+#define FIRST_BLOCK_SIZE 4096
+#define MAX_BLOCK_SIZE (2 * 1024 * 1024)
+
  /* Write all of the chars in OBS to the assembler.  Recycle the blocks
 in obs as this is being done.  */
  
  void

  lto_write_stream (struct lto_output_stream *obs)
  {
-  unsigned int block_size = 1024;
+  unsigned int block_size = FIRST_BLOCK_SIZE;
struct lto_char_ptr_base *block;
struct lto_char_ptr_base *next_block;
if (!obs->first_block)
@@ -135,6 +141,7 @@ lto_write_stream (struct lto_output_stre
else
lang_hooks.lto.append_data (base, num_chars, block);
block_size *= 2;
+  block_size = MIN (MAX_BLOCK_SIZE, block_size);
  }
  }
  
@@ -152,7 +159,7 @@ lto_append_block (struct lto_output_stre

  {
/* This is the first time the stream has been written
 into.  */
-  obs->block_size = 1024;
+  obs->block_size = FIRST_BLOCK_SIZE;
new_block = (struct lto_char_ptr_base*) xmalloc (obs->block_size);
obs->first_block = new_block;
  }
@@ -162,6 +169,7 @@ lto_append_block (struct lto_output_stre
/* Get a new block that is twice as big as the last block
 and link it into the list.  */
obs->block_size *= 2;
+  obs->block_size = MIN (MAX_BLOCK_SIZE, obs->block_size);
new_block = (struct lto_char_ptr_base*) xmalloc (obs->block_size);
/* The first bytes of the block are reserved as a pointer to
 the next block.  Set the chain of the full block to the

Re: [ARM] [Trivial] Fix shortening of field name extend.

2014-03-19 Thread James Greenhalgh

On Mon, Feb 24, 2014 at 09:13:45AM +, James Greenhalgh wrote:
 *ping*, CCing Jakub.

*ping x2* This was OKed by ramana, but we wanted release manager approval.
I would have committed the patch as obvious if we were not in stage 4.

Thanks,
James

 On Wed, Feb 12, 2014 at 12:43:10PM +, Ramana Radhakrishnan wrote:
  On 02/12/14 12:19, James Greenhalgh wrote:
  
   Hi,
  
   In aarch-common-protos.h we define a field in alu_cost_table:
  
  extnd
  
   On its own this is an upsetting optimization of the
   English language, but this trouble is compounded by the
   comment attached to this field throughout the cost tables
   themselves:
  
  /* Extend.  */
  
   This patch fixes the spelling of extend to match that in the
   comments.
  
   I've checked that AArch64 and AArch32 build with this patch
   applied.
  
   OK for trunk/stage-1 (I don't mind which)?
  
  I am happy for this to go in now -
  
  Jakub ?
  
  
  regards
  Ramana
 

2014-03-19  James Greenhalgh  james.greenha...@arm.com

* config/arm/aarch-common-protos.h
(alu_cost_table): Fix spelling of extend.
* config/arm/arm.c (arm_new_rtx_costs): Fix spelling of extend.

diff --git a/gcc/config/arm/aarch-common-protos.h 
b/gcc/config/arm/aarch-common-protos.h
index 056fe56..a5ff6b4 100644
--- a/gcc/config/arm/aarch-common-protos.h
+++ b/gcc/config/arm/aarch-common-protos.h
@@ -48,8 +48,8 @@ struct alu_cost_table
   const int arith_shift_reg;   /* ... and when the shift is by a reg.  */
   const int log_shift; /* Additional when logic also shifts...  */
   const int log_shift_reg; /* ... and when the shift is by a reg.  */
-  const int extnd; /* Zero/sign extension.  */
-  const int extnd_arith;   /* Extend and arith.  */
+  const int extend;/* Zero/sign extension.  */
+  const int extend_arith;  /* Extend and arith.  */
   const int bfi;   /* Bit-field insert.  */
   const int bfx;   /* Bit-field extraction.  */
   const int clz;   /* Count Leading Zeros.  */
diff --git a/gcc/config/arm/arm.c b/gcc/config/arm/arm.c
index a68ed8d..31df089 100644
--- a/gcc/config/arm/arm.c
+++ b/gcc/config/arm/arm.c
@@ -9594,7 +9594,7 @@ arm_new_rtx_costs (rtx x, enum rtx_code code, enum 
rtx_code outer_code,
{
  /* UXTA[BH] or SXTA[BH].  */
  if (speed_p)
-   *cost += extra_cost->alu.extnd_arith;
+   *cost += extra_cost->alu.extend_arith;
  *cost += (rtx_cost (XEXP (XEXP (x, 0), 0), ZERO_EXTEND, 0,
  speed_p)
+ rtx_cost (XEXP (x, 1), PLUS, 0, speed_p));
@@ -10311,7 +10311,7 @@ arm_new_rtx_costs (rtx x, enum rtx_code code, enum 
rtx_code outer_code,
  *cost = COSTS_N_INSNS (1);
  *cost += rtx_cost (XEXP (x, 0), code, 0, speed_p);
  if (speed_p)
-   *cost += extra_cost->alu.extnd;
+   *cost += extra_cost->alu.extend;
}
   else if (GET_MODE (XEXP (x, 0)) != SImode)
{
@@ -10364,7 +10364,7 @@ arm_new_rtx_costs (rtx x, enum rtx_code code, enum 
rtx_code outer_code,
  *cost = COSTS_N_INSNS (1);
  *cost += rtx_cost (XEXP (x, 0), code, 0, speed_p);
  if (speed_p)
-   *cost += extra_cost->alu.extnd;
+   *cost += extra_cost->alu.extend;
}
   else if (GET_MODE (XEXP (x, 0)) != SImode)
{

Re: [ARM] [Trivial] Fix shortening of field name extend.

2014-03-19 Thread Jakub Jelinek

On Wed, Mar 19, 2014 at 03:13:40PM +, James Greenhalgh wrote:
 On Mon, Feb 24, 2014 at 09:13:45AM +, James Greenhalgh wrote:
  *ping*, CCing Jakub.
 
 *ping x2* This was OKed by ramana, but we wanted release manager approval.
 I would have committed the patch as obvious if we were not in stage 4.

This is ok even in stage4.

Jakub

Re: [Patch AArch64] Define TARGET_FLAGS_REGNUM

2014-03-19 Thread Marcus Shawcroft

On 28 February 2014 09:32, Ramana Radhakrishnan ramra...@arm.com wrote:
 Hi,

 This defines TARGET_FLAGS_REGNUM for AArch64 to be CC_REGNUM.
 Noticed this turns on the cmpelim pass after reload and in a few examples
 and a couple of benchmarks I noticed a number of comparisons getting
 deleted. A similar patch for AArch32 is being tested.

 Tested cross with aarch64-none-elf on a model with no regressions.

 Ok for stage1 ?

OK /Marcus

Re: [PATCH] Avoid ggc_collect () after WPA forking

2014-03-19 Thread Richard Biener

On Wed, 19 Mar 2014, Martin Liška wrote:

 
 On 03/19/2014 03:55 PM, Richard Biener wrote:
  On Wed, 19 Mar 2014, Martin Liška wrote:
  
   There are stats for Firefox with LTO and -O2. According to graphs it
   looks that memory consumption for parallel WPA phase is similar.
   When I disable parallel WPA, wpa footprint is ~4GB, but ltrans memory
   footprint is similar to parallel WPA that reduces libxul.so linking by
   ~10%.
  Ok, so I suppose this tracks RSS, not virtual memory use (what is
  used and what is active)?
 
 Data are given by vmstat, according to:
 http://stackoverflow.com/questions/18529723/what-is-active-memory-and-inactive-memory
 
 *Active memory* is memory that is being used by a particular process.
 *Inactive memory* is memory that was allocated to a process that is no longer
 running.

 So please follow just 'blue' line that displays really used memory. According
 to man, vmstat tracks virtual memory statistics.

But 'blue' is neither active nor inactive ... what is 'used'?  Does
it correspond to 'swpd'?

If it is virtual memory in use then this is expected to grow when 
fork()ing as the virtual memory space is obviously copied (just the pages 
are still shared).

For me, allocating a GB of memory and clearing it increases active by
1GB, and forking afterwards doesn't increase any of the metrics that
vmstat -a outputs in any significant way.
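
A rough sketch of that experiment, for anyone who wants to repeat it: allocate a buffer, touch every page, fork, and let the child only read. The function name `fork_after_touch` is made up here, and the sketch only checks that the child sees the parent's data; the memory metrics themselves still have to be watched externally (e.g. with vmstat -a) while it runs with a large size.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Allocate BYTES, write to all of it so the pages become resident,
   then fork.  The child only reads, so the pages stay shared with the
   parent instead of being copied.  Returns 0 if the child saw the
   parent's data, -1 on any error.  */
int
fork_after_touch (size_t bytes)
{
  assert (bytes > 0);

  unsigned char *buf = malloc (bytes);
  if (buf == NULL)
    return -1;
  memset (buf, 0x5a, bytes);

  pid_t pid = fork ();
  if (pid < 0)
    {
      free (buf);
      return -1;
    }
  if (pid == 0)
    /* Child: read-only access, then exit without running atexit handlers.  */
    _exit (buf[bytes - 1] == 0x5a ? 0 : 1);

  int status = 0;
  pid_t waited = waitpid (pid, &status, 0);
  free (buf);
  if (waited < 0)
    return -1;
  return (WIFEXITED (status) && WEXITSTATUS (status) == 0) ? 0 : -1;
}
```

If the child wrote to the buffer instead, each written page would be copied, which is the same reason a ggc_collect () after forking is undesirable: collection touches pages in every forked process.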

  And it is WPA plus LTRANS stages, WPA ends where memory use first goes
  down to zero?
  I wonder if you can identify the point where parallel streaming
  starts and where it ends ... ;)
 
 Exactly, WPA ends when it goes to zero.

So the difference isn't that big (8GB vs. 7.2GB), and is likely attributed
to heap memory we allocate during the stream-out.  For example
we need some for the tree-ref-encoders (I remember that can be a
significant amount of memory, but I improved that already as far as
possible...).  So yes, we _do_ allocate memory during stream-out
and that is now required N times.

  Btw, I have another patch in my local tree, limiting the
  exponential growth of blocks we allocate when outputting sections.
  But it shouldn't be _that_ bad ... maybe you can try if it has
  any effect?
 
 I can apply it.

Thanks,
Richard.

PATCH: PR testsuite/60590: Can't recreate the same executable in testsuite

2014-03-19 Thread H.J. Lu

On Wed, Mar 19, 2014 at 8:41 AM, H.J. Lu hongjiu...@intel.com wrote:
 GNU linker sets DT_RPATH from the environment variable LD_RUN_PATH.
 set_ld_library_path_env_vars sets a few environment variables including
 LD_RUN_PATH.  This patch logs all environment variables set by
 set_ld_library_path_env_vars so that one can recreate the same
 executable as make check run.  OK to install?

 Thanks.

 H.J.
 ---
 2014-03-19  H.J. Lu  hongjiu...@intel.com

 PR testsuite/60590
 * lib/target-libpath.exp (set_ld_library_path_env_vars): Log
 LD_LIBRARY_PATH, LD_RUN_PATH, SHLIB_PATH, LD_LIBRARY_PATH_32,
 LD_LIBRARY_PATH_64 and DYLD_LIBRARY_PATH.

 diff --git a/gcc/testsuite/lib/target-libpath.exp 
 b/gcc/testsuite/lib/target-libpath.exp
 index 603ed8a..1891088 100644
 --- a/gcc/testsuite/lib/target-libpath.exp
 +++ b/gcc/testsuite/lib/target-libpath.exp
 @@ -155,7 +155,12 @@ proc set_ld_library_path_env_vars { } {
  setenv DYLD_LIBRARY_PATH $ld_library_path
}

 -  verbose -log "set_ld_library_path_env_vars: 
 ld_library_path=$ld_library_path"
 +  verbose -log "LD_LIBRARY_PATH=[getenv LD_LIBRARY_PATH]"
 +  verbose -log "LD_RUN_PATH=[getenv LD_RUN_PATH]"
 +  verbose -log "SHLIB_PATH=[getenv SHLIB_PATH]"
 +  verbose -log "LD_LIBRARY_PATH_32=[getenv LD_LIBRARY_PATH_32]"
 +  verbose -log "LD_LIBRARY_PATH_64=[getenv LD_LIBRARY_PATH_64]"
 +  verbose -log "DYLD_LIBRARY_PATH=[getenv DYLD_LIBRARY_PATH]"
  }

  ###

Correction.  It is a testsuite issue.

-- 
H.J.

Re: [patch testsuite]: g++.dg/abi

2014-03-19 Thread Mike Stump

On Mar 18, 2014, at 6:16 AM, Kai Tietz ktiet...@googlemail.com wrote:
 this patch skips anon2.C and anon3.C test for mingw target.  Issue
 here is that weak under pe-coff is different to ELF-targets and
 therefore test doesn't apply for

So, what does the output look like?  There should be a trace of weak of some 
sort in the output.

Re: [C++ Patch / RFC] PR 51474

2014-03-19 Thread Jason Merrill


OK.

Jason

PATCH: PR target/60590: Can't recreate the same executable in testsuite

2014-03-19 Thread H.J. Lu

GNU linker sets DT_RPATH from the environment variable LD_RUN_PATH.
set_ld_library_path_env_vars sets a few environment variables including
LD_RUN_PATH.  This patch logs all environment variables set by
set_ld_library_path_env_vars so that one can recreate the same
executable as make check run.  OK to install?

Thanks.

H.J.
---
2014-03-19  H.J. Lu  hongjiu...@intel.com

PR target/60590
* lib/target-libpath.exp (set_ld_library_path_env_vars): Log
LD_LIBRARY_PATH, LD_RUN_PATH, SHLIB_PATH, LD_LIBRARY_PATH_32,
LD_LIBRARY_PATH_64 and DYLD_LIBRARY_PATH.

diff --git a/gcc/testsuite/lib/target-libpath.exp 
b/gcc/testsuite/lib/target-libpath.exp
index 603ed8a..1891088 100644
--- a/gcc/testsuite/lib/target-libpath.exp
+++ b/gcc/testsuite/lib/target-libpath.exp
@@ -155,7 +155,12 @@ proc set_ld_library_path_env_vars { } {
 setenv DYLD_LIBRARY_PATH $ld_library_path
   }
 
-  verbose -log "set_ld_library_path_env_vars: ld_library_path=$ld_library_path"
+  verbose -log "LD_LIBRARY_PATH=[getenv LD_LIBRARY_PATH]"
+  verbose -log "LD_RUN_PATH=[getenv LD_RUN_PATH]"
+  verbose -log "SHLIB_PATH=[getenv SHLIB_PATH]"
+  verbose -log "LD_LIBRARY_PATH_32=[getenv LD_LIBRARY_PATH_32]"
+  verbose -log "LD_LIBRARY_PATH_64=[getenv LD_LIBRARY_PATH_64]"
+  verbose -log "DYLD_LIBRARY_PATH=[getenv DYLD_LIBRARY_PATH]"
 }
 
 ###

Re: [patch testsuite]: g++.dg/abi

2014-03-19 Thread Kai Tietz

2014-03-19 17:23 GMT+01:00 Mike Stump mikest...@comcast.net:
 On Mar 18, 2014, at 6:16 AM, Kai Tietz ktiet...@googlemail.com wrote:
 this patch skips anon2.C and anon3.C test for mingw target.  Issue
 here is that weak under pe-coff is different to ELF-targets and
 therefore test doesn't apply for

 So, what does the output look like?  There should be a trace of weak of some 
 sort in the output.

No, there is none.  Output looks like:

.seh_proc   _ZN2N43._91CIiE3fn2ES2_
_ZN2N43._91CIiE3fn2ES2_:
.LFB11:
.seh_endprologue
ret
.seh_endproc
.globl  _ZN2N41qE
.data
.align 8
_ZN2N41qE:
.quad   _ZN2N43._91CIiE3fn2ES2_
.globl  _ZN2N41pE
.align 8
_ZN2N41pE:
.quad   _ZN2N43._91CIiE3fn1ENS0_1BE
.globl  _ZN2N31qE
.align 8
_ZN2N31qE:
.quad   _ZN2N31D1CIiE3fn2ES2_...

The concept of weak - as present in ELF - isn't known in COFF in
general.  There is some weak support, but it works only for static
libraries and in a limited way.  Therefore we can't (and don't) use it
for COFF targets.

Kai

PS: I have another patch with similar reasoning for g++.dg/abi/thunk5.C
on my pile too.

Re: PATCH: PR target/60590: Can't recreate the same executable in testsuite

2014-03-19 Thread Mike Stump

On Mar 19, 2014, at 8:41 AM, H.J. Lu hongjiu...@intel.com wrote:
 GNU linker sets DT_RPATH from the environment variable LD_RUN_PATH.
 set_ld_library_path_env_vars sets a few environment variables including
 LD_RUN_PATH.  This patch logs all environment variables set by
 set_ld_library_path_env_vars so that one can recreate the same
 executable as make check run.  OK to install?

Ok.  If someone complains about the log size clutter, we can consider bumping 
it up to higher verbosity.

[jit] Tighten up the distinction between pointers and arrays

2014-03-19 Thread David Malcolm

Committed to branch dmalcolm/jit:

https://github.com/davidmalcolm/pygccjit/pull/3#issuecomment-37883129
showed a problem where a parameter expecting a (char *) was passed
a char[1024] cast to a (char *) as its argument, leading to an ICE:

libgccjit.so: internal compiler error: in convert_move, at expr.c:320
0x7fffebea98ad convert_move(rtx_def*, rtx_def*, int)
../../src/gcc/expr.c:320
0x7fffebec31cb expand_expr_real_2(separate_ops*, rtx_def*, machine_mode, 
expand_modifier)
../../src/gcc/expr.c:8105
0x7fffec88d768 expand_gimple_stmt_1
../../src/gcc/cfgexpand.c:2321
0x7fffec88d9cc expand_gimple_stmt
../../src/gcc/cfgexpand.c:2381

The issue was that the recording::type::dereference method is used for
both pointers and for arrays, leading to sloppiness about where lvalues
and rvalues can be pointers vs arrays.

This commit introduces is_pointer and is_array methods, using them to
tighten up type-checking, converting the above ICE into a type-check
error when the cast is attempted:
  libgccjit.so: error: gcc_jit_context_new_cast: cannot cast buffer from type: 
char[1024] to type: char *

The correct way to use an array as a pointer in the JIT API is to use
   gcc_jit_lvalue_get_address
on the array, which gives you an rvalue representing the address of the
initial element, and then to cast that rvalue as necessary.
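
To make the pointer/array distinction concrete, here is a minimal,
self-contained sketch.  It is not the libgccjit code: the classes below
are a hypothetical, much-reduced model of a recording-type hierarchy, and
is_valid_pointer_cast and address_of_array are illustrative names only.
It assumes that is_pointer and is_array return the pointee or element
type, or a null pointer when the type is not of that kind.

```cpp
#include <cassert>

// Hypothetical, reduced model of a type hierarchy (not the real
// gcc::jit::recording classes).
class type
{
 public:
  virtual ~type () {}
  // Return the pointee type if this is a pointer, else nullptr.
  virtual type *is_pointer () { return nullptr; }
  // Return the element type if this is an array, else nullptr.
  virtual type *is_array () { return nullptr; }
};

class pointer_type : public type
{
 public:
  explicit pointer_type (type *pointee) : m_pointee (pointee) {}
  type *is_pointer () override { return m_pointee; }

 private:
  type *m_pointee;
};

class array_type : public type
{
 public:
  explicit array_type (type *element) : m_element (element) {}
  type *is_array () override { return m_element; }

 private:
  type *m_element;
};

// A pointer cast is accepted only between two pointer types; an array
// source is rejected rather than being silently treated as a pointer.
static bool is_valid_pointer_cast (type *src, type *dst)
{
  return src->is_pointer () != nullptr && dst->is_pointer () != nullptr;
}

// Model of taking the address of an array lvalue: the result has
// pointer-to-element type and can then be cast as needed.
static pointer_type address_of_array (array_type *arr)
{
  return pointer_type (arr->is_array ());
}
```

In this model a direct cast from the array type fails the check, while
the value produced by address_of_array passes it, which mirrors the
"take the address, then cast" recipe described above.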

gcc/jit
* internal-api.c (gcc::jit::recording::memento_of_get_pointer::
accepts_writes_from): Accept writes from pointers, but not arrays.

* internal-api.h (gcc::jit::recording::type::is_pointer): New.
(gcc::jit::recording::type::is_array): New.
(gcc::jit::recording::memento_of_get_type::accepts_writes_from):
Allow (void *) to accept writes of pointers, but not arrays.
(gcc::jit::recording::memento_of_get_type::is_pointer): New.
(gcc::jit::recording::memento_of_get_type::is_array): New.
(gcc::jit::recording::memento_of_get_pointer::is_pointer): New.
(gcc::jit::recording::memento_of_get_pointer::is_array): New.
(gcc::jit::recording::memento_of_get_const::is_pointer): New.
(gcc::jit::recording::memento_of_get_const::is_array): New.
(gcc::jit::recording::memento_of_get_volatile::is_pointer): New.
(gcc::jit::recording::memento_of_get_volatile::is_array): New.
(gcc::jit::recording::array_type::is_pointer): New.
(gcc::jit::recording::array_type::is_array): New.
(gcc::jit::recording::function_type::is_pointer): New.
(gcc::jit::recording::function_type::is_array): New.
(gcc::jit::recording::struct_::is_pointer): New.
(gcc::jit::recording::struct_::is_array): New.

* libgccjit.c (gcc_jit_context_new_rvalue_from_ptr): Require the
pointer_type to be a pointer, not an array.
(gcc_jit_context_null): Likewise.
(is_valid_cast): Require pointer casts to be between pointer types,
not arrays.
(gcc_jit_context_new_array_access): Update error message from "not
a pointer" to "not a pointer or array".
(gcc_jit_rvalue_dereference_field): Require the pointer arg to be
of pointer type, not an array.
(gcc_jit_rvalue_dereference): Likewise.

gcc/testsuite/
* jit.dg/test-array-as-pointer.c: New test case, verifying that
there's a way to treat arrays as pointers.
* jit.dg/test-combination.c: Add test-array-as-pointer.c...
(create_code): ...here and...
(verify_code): ...here.

* jit.dg/test-error-array-as-pointer.c: New test case, verifying
that bogus casts from array to pointer are caught by the type
system, rather than leading to ICEs seen in:
https://github.com/davidmalcolm/pygccjit/pull/3#issuecomment-37883129
---
 gcc/jit/ChangeLog.jit                              |  35 +++
 gcc/jit/internal-api.c                             |   2 +-
 gcc/jit/internal-api.h                             |  18 +++-
 gcc/jit/libgccjit.c                                |  14 +--
 gcc/testsuite/ChangeLog.jit                        |  13 +++
 gcc/testsuite/jit.dg/test-array-as-pointer.c       | 101 +
 gcc/testsuite/jit.dg/test-combination.c            |   9 ++
 gcc/testsuite/jit.dg/test-error-array-as-pointer.c |  99
 8 files changed, 282 insertions(+), 9 deletions(-)
 create mode 100644 gcc/testsuite/jit.dg/test-array-as-pointer.c
 create mode 100644 gcc/testsuite/jit.dg/test-error-array-as-pointer.c

diff --git a/gcc/jit/ChangeLog.jit b/gcc/jit/ChangeLog.jit
index 8244eba..efb1931 100644
--- a/gcc/jit/ChangeLog.jit
+++ b/gcc/jit/ChangeLog.jit
@@ -1,3 +1,38 @@
+2014-03-19  David Malcolm  dmalc...@redhat.com
+
+   * internal-api.c (gcc::jit::recording::memento_of_get_pointer::
+   accepts_writes_from): Accept writes from pointers, but not arrays.
+
+   * internal-api.h (gcc::jit::recording::type::is_pointer): New.
+

Re: [patch testsuite]: g++.dg/abi

2014-03-19 Thread Rainer Orth

Kai Tietz ktiet...@googlemail.com writes:

 2014-03-19 17:23 GMT+01:00 Mike Stump mikest...@comcast.net:
 On Mar 18, 2014, at 6:16 AM, Kai Tietz ktiet...@googlemail.com wrote:
 this patch skips anon2.C and anon3.C test for mingw target.  Issue
 here is that weak under pe-coff is different to ELF-targets and
 therefore test doesn't apply for

 So, what does the output look like?  There should be a trace of weak of
 some sort in the output.

 No, there is none.  Output looks like:

 .seh_proc   _ZN2N43._91CIiE3fn2ES2_
 _ZN2N43._91CIiE3fn2ES2_:
 .LFB11:
 .seh_endprologue
 ret
 .seh_endproc
 .globl  _ZN2N41qE
 .data
 .align 8
 _ZN2N41qE:
 .quad   _ZN2N43._91CIiE3fn2ES2_
 .globl  _ZN2N41pE
 .align 8
 _ZN2N41pE:
 .quad   _ZN2N43._91CIiE3fn1ENS0_1BE
 .globl  _ZN2N31qE
 .align 8
 _ZN2N31qE:
 .quad   _ZN2N31D1CIiE3fn2ES2_...

 The concept of weak - as present in ELF - isn't known in COFF in
 general.  There is some weak, but it works only for static library and
 in a limited way.  Therefore we can't (and don't) use it for COFF
 targets.

In that case, it seems far better to have
gcc/testsuite/lib/target-support.exp (check_weak_available) reflect that
instead of lying about weak support.

This way, everything else simply falls into place; no need to
special-case many individual testcases.

Rainer

-- 
-
Rainer Orth, Center for Biotechnology, Bielefeld University

Re: [patch testsuite]: g++.dg/abi

2014-03-19 Thread Mike Stump

On Mar 19, 2014, at 9:49 AM, Rainer Orth r...@cebitec.uni-bielefeld.de wrote:
 The concept of weak - as present in ELF - isn't known in COFF in
 general.  There is some weak, but it works only for static library and
 in a limited way.  Therefore we can't (and don't) use it for COFF
 targets.
 
 In that case, it seems far better to have
 gcc/testsuite/lib/target-support.exp (check_weak_available) reflect that
 instead of lying about weak support.

Yeah, this is the direction I was headed…  :-)

[PATCH 2/2, AARCH64] Test case changes: Re: [RFC] [PATCH, AARCH64] : Using standard patterns for stack protection.

2014-03-19 Thread Venkataramanan Kumar

Hi Marcus,

On 14 March 2014 19:42, Marcus Shawcroft marcus.shawcr...@gmail.com wrote:

 Do we need a new effective target test, why is the existing
 fstack_protector not appropriate?

 stack_protector does a run time test. It failed in cross compilation
 environment and these are compile only tests.

 This works fine in my cross environment, how does yours fail?


 Also I thought  richard suggested  me to add a new option for this.
 ref: http://gcc.gnu.org/ml/gcc-patches/2013-11/msg03358.html

 I read that comment to mean use an effective target test instead of
 matching triples. I don't see that re-using an existing effective
 target test contradicts that suggestion.

 Looking through the test suite I see that there are:

 6 tests that use dg-do compile with dg-require-effective-target 
 fstack_protector

 4 tests that use dg-do run with dg-require-effective-target fstack_protector

 2 tests that use dg-do run {target native} dg-require-effective-target
 fstack_protector

 and finally the 2 tests we are discussing that use dg-compile with a
 triple test.

 so there are already tests in the testsuite that use dg-do compile
 with the existing effective target test.

 I see no immediately obvious reason why the two tests that require
 target native require the native constraint... but I guess that is a
 different issue.


I used the existing dg-require-effective-target check,
fstack_protector, and added it on a separate line.

ChangeLog.

2014-03-19  Venkataramanan Kumar  venkataramanan.ku...@linaro.org
* g++.dg/fstack-protector-strong.C: Add effective target check for
  stack protection.
* gcc.dg/fstack-protector-strong.c: Likewise.

These two tests now pass for the aarch64-none-linux-gnu target under QEMU.

Let me know if I can upstream these two patches.

regards,
Venkat.
Index: gcc/testsuite/g++.dg/fstack-protector-strong.C
===
--- gcc/testsuite/g++.dg/fstack-protector-strong.C  (revision 208609)
+++ gcc/testsuite/g++.dg/fstack-protector-strong.C  (working copy)
@@ -1,7 +1,8 @@
 /* Test that stack protection is done on chosen functions. */
 
-/* { dg-do compile { target i?86-*-* x86_64-*-* } } */
+/* { dg-do compile } */
 /* { dg-options "-O2 -fstack-protector-strong" } */
+/* { dg-require-effective-target fstack_protector } */
 
 class A
 {
Index: gcc/testsuite/gcc.dg/fstack-protector-strong.c
===
--- gcc/testsuite/gcc.dg/fstack-protector-strong.c  (revision 208609)
+++ gcc/testsuite/gcc.dg/fstack-protector-strong.c  (working copy)
@@ -1,7 +1,8 @@
 /* Test that stack protection is done on chosen functions. */
 
-/* { dg-do compile { target i?86-*-* x86_64-*-* rs6000-*-* s390x-*-* } } */
+/* { dg-do compile } */
 /* { dg-options "-O2 -fstack-protector-strong" } */
+/* { dg-require-effective-target fstack_protector } */
 
 #include<string.h>

[C++ Patch] PR 60384

2014-03-19 Thread Paolo Carlini


Hi,

in this minor regression we ICE during error recovery, when
push_class_level_binding_1 (called by finish_member_declaration via
pushdecl_class_level) gets a TEMPLATE_ID_EXPR as the name argument. It's
a regression because, since r199779, invalid declarations get through
more often (with an error_mark_node as TREE_TYPE, like TREE_TYPE (x) in
the case at issue). Hence the additional check I'm suggesting. Tested
x86_64-linux.


Thanks,
Paolo.

//
/cp
2014-03-19  Paolo Carlini  paolo.carl...@oracle.com

PR c++/60384
* name-lookup.c (push_class_level_binding_1): Check identifier_p
on the name argument.

/testsuite
2014-03-19  Paolo Carlini  paolo.carl...@oracle.com

PR c++/60384
* g++.dg/cpp1y/pr60384.C: New.
Index: cp/name-lookup.c
===
--- cp/name-lookup.c(revision 208682)
+++ cp/name-lookup.c(working copy)
@@ -3112,7 +3112,9 @@ push_class_level_binding_1 (tree name, tree x)
   if (!class_binding_level)
 return true;
 
-  if (name == error_mark_node)
+  if (name == error_mark_node
+  /* Can happen for an erroneous declaration (c++/60384).  */
+  || !identifier_p (name))
 return false;
 
   /* Check for invalid member names.  But don't worry about a default
Index: testsuite/g++.dg/cpp1y/pr60384.C
===
--- testsuite/g++.dg/cpp1y/pr60384.C(revision 0)
+++ testsuite/g++.dg/cpp1y/pr60384.C(working copy)
@@ -0,0 +1,9 @@
+// PR c++/60384
+// { dg-do compile { target c++1y } }
+
+template<typename> int foo();
+
+struct A
+{
+  typedef auto foo();  // { dg-error "typedef declared 'auto'" }
+};

Re: [patch testsuite]: g++.dg/abi

2014-03-19 Thread Joseph S. Myers

On Wed, 19 Mar 2014, Kai Tietz wrote:

 The concept of weak - as present in ELF - isn't known in COFF in
 general.  There is some weak, but it works only for static library and
 in a limited way.  Therefore we can't (and don't) use it for COFF
 targets.

There are already two different checks (check_weak_available and 
check_weak_override_available), reflecting what different testcases need.  
Is the requirement for these tests logically different from both of those?  
If so, maybe there should be a third such check (even if in fact it does 
the same thing as check_weak_override_available).

-- 
Joseph S. Myers
jos...@codesourcery.com

[PATCH 1/2, AARCH64]: Machine descriptions: Re: [RFC] [PATCH, AARCH64] : Using standard patterns for stack protection.

2014-03-19 Thread Venkataramanan Kumar

Hi Marcus,

On 14 March 2014 19:42, Marcus Shawcroft marcus.shawcr...@gmail.com wrote:
 Hi Venkat

 On 5 February 2014 10:29, Venkataramanan Kumar
 venkataramanan.ku...@linaro.org wrote:
 Hi Marcus,

 +  "ldr\\t%x2, %1\;str\\t%x2, %0\;mov\t%x2,0"
 +  [(set_attr "length" "12")])

 This pattern emits an opaque sequence of instructions that cannot be
 scheduled, is that necessary? Can we not expand individual
 instructions or at least split ?

 Almost all the ports emit a template of assembly instructions.
 I'm not sure why they have to be generated this way.
 But the purpose of these patterns is to clear the register that holds
 the canary value immediately after its use.

 I've just read the thread Andrew pointed out, thanks, I'm happy that
 there is a good reason to do it this way.  Andrew, thanks for
 providing the background.

 +  [(set_attr "length" "12")])
 +

 These patterns should also set the type attribute,  a reasonable
 value would be multiple.


I have incorporated your review comments and split the patch into two.

The first patch attached here contains the AArch64 machine descriptions
for the stack protect patterns.

ChangeLog.

2014-03-19 Venkataramanan Kumar  venkataramanan.ku...@linaro.org
* config/aarch64/aarch64.md (stack_protect_set, stack_protect_test)
(stack_protect_set_mode, stack_protect_test_mode): Add
machine descriptions for Stack Smashing Protector.

Tested for the aarch64-none-linux-gnu target under QEMU.

regards,
Venkat.
Index: gcc/config/aarch64/aarch64.md
===
--- gcc/config/aarch64/aarch64.md   (revision 208609)
+++ gcc/config/aarch64/aarch64.md   (working copy)
@@ -102,6 +102,8 @@
 UNSPEC_TLSDESC
 UNSPEC_USHL_2S
 UNSPEC_VSTRUCTDUMMY
+UNSPEC_SP_SET
+UNSPEC_SP_TEST
 ])
 
 (define_c_enum "unspecv" [
@@ -3634,6 +3636,67 @@
   DONE;
 })
 
+;; Named patterns for stack smashing protection.
+(define_expand "stack_protect_set"
+  [(match_operand 0 "memory_operand")
+   (match_operand 1 "memory_operand")]
+  ""
+{
+  enum machine_mode mode = GET_MODE (operands[0]);
+
+  emit_insn ((mode == DImode
+              ? gen_stack_protect_set_di
+              : gen_stack_protect_set_si) (operands[0], operands[1]));
+  DONE;
+})
+
+(define_insn "stack_protect_set_<mode>"
+  [(set (match_operand:PTR 0 "memory_operand" "=m")
+        (unspec:PTR [(match_operand:PTR 1 "memory_operand" "m")]
+         UNSPEC_SP_SET))
+   (set (match_scratch:PTR 2 "=r") (const_int 0))]
+  ""
+  "ldr\\t%x2, %1\;str\\t%x2, %0\;mov\t%x2,0"
+  [(set_attr "length" "12")
+   (set_attr "type" "multiple")])
+
+(define_expand "stack_protect_test"
+  [(match_operand 0 "memory_operand")
+   (match_operand 1 "memory_operand")
+   (match_operand 2)]
+  ""
+{
+
+  rtx result = gen_reg_rtx (Pmode);
+
+  enum machine_mode mode = GET_MODE (operands[0]);
+
+  emit_insn ((mode == DImode
+              ? gen_stack_protect_test_di
+              : gen_stack_protect_test_si) (result,
+                                            operands[0],
+                                            operands[1]));
+
+  if (mode == DImode)
+    emit_jump_insn (gen_cbranchdi4 (gen_rtx_EQ (VOIDmode, result, const0_rtx),
+                                    result, const0_rtx, operands[2]));
+  else
+    emit_jump_insn (gen_cbranchsi4 (gen_rtx_EQ (VOIDmode, result, const0_rtx),
+                                    result, const0_rtx, operands[2]));
+  DONE;
+})
+
+(define_insn "stack_protect_test_<mode>"
+  [(set (match_operand:PTR 0 "register_operand")
+        (unspec:PTR [(match_operand:PTR 1 "memory_operand" "m")
+                     (match_operand:PTR 2 "memory_operand" "m")]
+         UNSPEC_SP_TEST))
+   (clobber (match_scratch:PTR 3 "=r"))]
+  ""
+  "ldr\t%x3, %x1\;ldr\t%x0, %x2\;eor\t%x0, %x3, %x0"
+  [(set_attr "length" "12")
+   (set_attr "type" "multiple")])
+
 ;; AdvSIMD Stuff
 (include "aarch64-simd.md")
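
The logic that the two insn patterns above encode can be modelled in a
few lines of C++.  This is only an illustrative sketch under stated
assumptions: the frame struct and both function names are made up for the
example, and the comments map each statement to the instruction in the
pattern's assembly template that it stands for.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical frame layout: a local buffer followed by the canary slot.
struct frame
{
  char buffer[16];
  std::uintptr_t canary;
};

// Model of the "set" pattern: copy the guard value into the canary slot
// via a scratch value, then clear the scratch so the guard does not
// linger.  (In real code only the machine pattern can guarantee that the
// register is cleared; a C++ compiler is free to drop this store.)
static void model_stack_protect_set (frame &f, std::uintptr_t guard)
{
  std::uintptr_t scratch = guard;  // ldr scratch, <guard>
  f.canary = scratch;              // str scratch, <canary slot>
  scratch = 0;                     // mov scratch, 0
  (void) scratch;
}

// Model of the "test" pattern: XOR the canary with the guard.  A zero
// result means the canary is intact, which is what the expander's
// compare-and-branch on (result == 0) checks.
static std::uintptr_t model_stack_protect_test (const frame &f,
                                                std::uintptr_t guard)
{
  return f.canary ^ guard;         // eor result, canary, guard
}
```

A zero result corresponds to taking the branch to the label passed as
operand 2 of stack_protect_test; any other value falls through to the
failure path.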

Re: [PATCH 1/2, AARCH64]: Machine descriptions: Re: [RFC] [PATCH, AARCH64] : Using standard patterns for stack protection.

2014-03-19 Thread Marcus Shawcroft

On 19 March 2014 17:11, Venkataramanan Kumar
venkataramanan.ku...@linaro.org wrote:

 I have incorporated your review comments and split the patch into two.

 The first patch attached here contains Aarch64 machine descriptions
 for the stack protect patterns.

 ChangeLog.

 2014-03-19 Venkataramanan Kumar  venkataramanan.ku...@linaro.org
 * config/aarch64/aarch64.md (stack_protect_set, stack_protect_test)
 (stack_protect_set_mode, stack_protect_test_mode): Add
 machine descriptions for Stack Smashing Protector.

 Tested  for aarch64-none-linux-gnu target under QEMU .

 regards,
 Venkat.


Hi, This is OK for stage-1.
Thanks
/Marcus

Re: [RFA jit 2/2] introduce scoped_timevar

2014-03-19 Thread Tom Tromey

 Trevor == Trevor Saunders tsaund...@mozilla.com writes:

Trevor thanks for doing this.  I wonder about naming, we already have
Trevor auto_vec and while I don't really care whether we use auto_ or
Trevor scoped_ it seems like being consistent would be nice.

Sounds reasonable to me, I've made this change for v2.

Tom

Re: [patch testsuite]: g++.dg/abi

2014-03-19 Thread Kai Tietz

2014-03-19 18:37 GMT+01:00 Joseph S. Myers jos...@codesourcery.com:
 On Wed, 19 Mar 2014, Kai Tietz wrote:

 The concept of weak - as present in ELF - isn't known in COFF in
 general.  There is some weak, but it works only for static library and
 in a limited way.  Therefore we can't (and don't) use it for COFF
 targets.

 There are already two different checks (check_weak_available and
 check_weak_override_available), reflecting what different testcases need.
 Is the requirement for these tests logically different from both of those?
 If so, maybe there should be a third such check (even if in fact it does
 the same thing as check_weak_override_available).

 --
 Joseph S. Myers
 jos...@codesourcery.com

On second thought, disabling weak-available for mingw targets
seems to be wrong.  Weak is actually present; it just has a different
meaning.
Those testcases are - AFAIU them - actually checking that weaks are
available.  Nevertheless, the check here is intended to probe whether
weak-override is possible, as otherwise weaks make no sense here
AFAICS.
I don't think that we need to add a third check here.  It might be
enough to check for weak-override-available instead for those tests.

Kai
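
For readers less familiar with the term, "weak override" in the ELF sense
can be sketched as follows.  This is a minimal example assuming the
GCC/Clang attribute syntax and an ELF-style linker; plugin_version and
query_version are made-up names, not anything from the testsuite.

```cpp
#include <cassert>

// Default definition, marked weak.  On ELF targets a non-weak
// definition of the same symbol in another object file would replace
// this one at link time without a duplicate-symbol error.
extern "C" __attribute__ ((weak)) int plugin_version ()
{
  return 1;
}

// Calls whichever definition the linker selected.
static int query_version ()
{
  return plugin_version ();
}
```

Built on its own, the program uses the weak default.  Whether a second,
strong definition elsewhere takes precedence is the property that a
weak-override check would need to establish for the target.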

Re: [4.8, PATCH 5/26] Backport Power8 and LE support: Test adjustments

2014-03-19 Thread Bill Schmidt

Oops.  Please ignore this for now.  I'm preparing a patch series and
sent this one prematurely.

Thanks,
Bill

On Wed, 2014-03-19 at 10:25 -0500, Bill Schmidt wrote:
 Hi,
 
 This patch (diff-le-tests) backports adjustments to a few tests for
 powerpc64le and the ELFv2 ABI.
 
 Thanks,
 Bill

[4.8, PATCH 5/26] Backport Power8 and LE support: Test adjustments

2014-03-19 Thread Bill Schmidt

Hi,

This patch (diff-le-tests) backports adjustments to a few tests for
powerpc64le and the ELFv2 ABI.

Thanks,
Bill


2014-03-29  Bill Schmidt  wschm...@linux.vnet.ibm.com

Backport from mainline
2013-11-27  Bill Schmidt  wschm...@linux.vnet.ibm.com

* gfortran.dg/nan_7.f90: Disable for little endian PowerPC.

Backport from mainline r205106:

2013-11-20  Ulrich Weigand  ulrich.weig...@de.ibm.com

* gcc.target/powerpc/darwin-longlong.c (msw): Make endian-safe.

Backport from mainline r205046:

2013-11-19  Ulrich Weigand  ulrich.weig...@de.ibm.com

* gcc.target/powerpc/ppc64-abi-2.c (MAKE_SLOT): New macro to
construct parameter slot value in endian-independent way.
(fcevv, fciievv, fcvevv): Use it.


Index: gcc-4_8-branch/gcc/testsuite/gcc.target/powerpc/ppc64-abi-2.c
===
--- gcc-4_8-branch.orig/gcc/testsuite/gcc.target/powerpc/ppc64-abi-2.c  
2013-12-28 17:41:32.430628909 +0100
+++ gcc-4_8-branch/gcc/testsuite/gcc.target/powerpc/ppc64-abi-2.c   
2013-12-28 17:50:39.655337721 +0100
@@ -119,6 +119,12 @@ typedef union
   vector int v;
 } vector_int_t;
 
+#ifdef __LITTLE_ENDIAN__
+#define MAKE_SLOT(x, y) ((long)x | ((long)y << 32))
+#else
+#define MAKE_SLOT(x, y) ((long)y | ((long)x << 32))
+#endif
+
 /* Paramter passing.
s : gpr 3
v : vpr 2
@@ -226,8 +232,8 @@ fcevv (char *s, ...)
   sp = __builtin_frame_address(0);
   sp = sp->backchain;
   
-  if (sp->slot[2].l != 0x100000002ULL
-      || sp->slot[4].l != 0x500000006ULL)
+  if (sp->slot[2].l != MAKE_SLOT (1, 2)
+      || sp->slot[4].l !=  MAKE_SLOT (5, 6))
 abort();
 }
 
@@ -268,8 +274,8 @@ fciievv (char *s, int i, int j, ...)
   sp = __builtin_frame_address(0);
   sp = sp->backchain;
   
-  if (sp->slot[4].l != 0x100000002ULL
-      || sp->slot[6].l != 0x500000006ULL)
+  if (sp->slot[4].l != MAKE_SLOT (1, 2)
+      || sp->slot[6].l !=  MAKE_SLOT (5, 6))
 abort();
 }
 
@@ -296,8 +302,8 @@ fcvevv (char *s, vector int x, ...)
   sp = __builtin_frame_address(0);
   sp = sp->backchain;
   
-  if (sp->slot[4].l != 0x100000002ULL
-      || sp->slot[6].l != 0x500000006ULL)
+  if (sp->slot[4].l != MAKE_SLOT (1, 2)
+      || sp->slot[6].l !=  MAKE_SLOT (5, 6))
 abort();
 }
 
Index: gcc-4_8-branch/gcc/testsuite/gcc.target/powerpc/darwin-longlong.c
===
--- gcc-4_8-branch.orig/gcc/testsuite/gcc.target/powerpc/darwin-longlong.c  
2013-12-28 17:41:32.430628909 +0100
+++ gcc-4_8-branch/gcc/testsuite/gcc.target/powerpc/darwin-longlong.c   
2013-12-28 17:50:39.659337741 +0100
@@ -11,7 +11,11 @@ int  msw(long long in)
 int  i[2];
   } ud;
   ud.ll = in;
+#ifdef __LITTLE_ENDIAN__
+  return ud.i[1];
+#else
   return ud.i[0];
+#endif
 }
 
 int main()
Index: gcc-4_8-branch/gcc/testsuite/gfortran.dg/nan_7.f90
===
--- gcc-4_8-branch.orig/gcc/testsuite/gfortran.dg/nan_7.f90 2013-12-28 
17:41:32.430628909 +0100
+++ gcc-4_8-branch/gcc/testsuite/gfortran.dg/nan_7.f90  2013-12-28 
17:50:39.662337756 +0100
@@ -2,6 +2,7 @@
 ! { dg-options "-fno-range-check" }
 ! { dg-require-effective-target fortran_real_16 }
 ! { dg-require-effective-target fortran_integer_16 }
+! { dg-skip-if "" { "powerpc*le-*-*" } { "*" } { "" } }
 ! PR47293 NAN not correctly read
 character(len=200) :: str
 real(16) :: r
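
The idea behind MAKE_SLOT in the ppc64-abi-2.c change can be tried out
with a small stand-alone sketch.  It assumes the GCC/Clang predefined
macros __BYTE_ORDER__ and __ORDER_LITTLE_ENDIAN__ (the test itself keys
off __LITTLE_ENDIAN__), and make_slot and split_slot are hypothetical
helper names used only for this illustration.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

// Build a 64-bit slot whose first four bytes in memory hold x and whose
// last four bytes hold y, regardless of the target's byte order.
static std::uint64_t make_slot (std::uint32_t x, std::uint32_t y)
{
#if __BYTE_ORDER__ == __ORDER_LITTLE_ENDIAN__
  return static_cast<std::uint64_t> (x)
         | (static_cast<std::uint64_t> (y) << 32);
#else
  return static_cast<std::uint64_t> (y)
         | (static_cast<std::uint64_t> (x) << 32);
#endif
}

// Read the two 32-bit words of a slot back in memory order.
static void split_slot (std::uint64_t slot, std::uint32_t &first,
                        std::uint32_t &second)
{
  unsigned char bytes[sizeof slot];
  std::memcpy (bytes, &slot, sizeof slot);
  std::memcpy (&first, bytes, sizeof first);
  std::memcpy (&second, bytes + sizeof first, sizeof second);
}
```

On a big-endian target make_slot (1, 2) yields the constant that the old
test hard-coded; on a little-endian target the two halves swap places in
the integer while keeping the same order in memory.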

[RFA jit v2 1/2] introduce class toplev

2014-03-19 Thread Tom Tromey

This patch introduces a new class toplev and changes toplev_main and
toplev_finalize to be methods of this class.  Additionally, now the
timevars are automatically stopped when the object is destroyed.  This
cleans up compile a bit and makes it simpler to reuse the toplev
logic in other code.
---
 gcc/ChangeLog.jit      | 14 +
 gcc/diagnostic.c       |  2 +-
 gcc/jit/ChangeLog.jit  |  5 +
 gcc/jit/internal-api.c | 25 +-
 gcc/main.c             |  9
 gcc/toplev.c           | 56 +-
 gcc/toplev.h           | 20 --
 7 files changed, 76 insertions(+), 55 deletions(-)

diff --git a/gcc/ChangeLog.jit b/gcc/ChangeLog.jit
index 77ac44c..c590ab1 100644
--- a/gcc/ChangeLog.jit
+++ b/gcc/ChangeLog.jit
@@ -1,3 +1,17 @@
+2014-03-19  Tom Tromey  tro...@redhat.com
+
+   * diagnostic.c (bt_stop): Use toplev::main.
+   * main.c (main): Update.
+   * toplev.c (do_compile): Remove argument.  Don't check
+   use_TV_TOTAL.
+   (toplev::toplev, toplev::~toplev, toplev::start_timevars): New
+   functions.
+   (toplev::main): Rename from toplev_main.  Update.
+   (toplev::finalize): Rename from toplev_finalize.  Update.
+   * toplev.h (class toplev): New.
+   (struct toplev_options): Remove.
+   (toplev_main, toplev_finalize): Don't declare.
+
 2014-03-11  David Malcolm  dmalc...@redhat.com
 
* gcse.c (gcse_c_finalize): New, to clear test_insn between
diff --git a/gcc/diagnostic.c b/gcc/diagnostic.c
index 36094a1..56dc3ac 100644
--- a/gcc/diagnostic.c
+++ b/gcc/diagnostic.c
@@ -333,7 +333,7 @@ diagnostic_show_locus (diagnostic_context * context,
 static const char * const bt_stop[] =
 {
   "main",
-  "toplev_main",
+  "toplev::main",
   "execute_one_pass",
   "compile_file",
 };
diff --git a/gcc/jit/ChangeLog.jit b/gcc/jit/ChangeLog.jit
index efb1931..e45d38c 100644
--- a/gcc/jit/ChangeLog.jit
+++ b/gcc/jit/ChangeLog.jit
@@ -1,3 +1,8 @@
+2014-03-19  Tom Tromey  tro...@redhat.com
+
+   * internal-api.c (compile): Use toplev, not toplev_options.
+   Simplify.
+
 2014-03-19  David Malcolm  dmalc...@redhat.com
 
* internal-api.c (gcc::jit::recording::memento_of_get_pointer::
diff --git a/gcc/jit/internal-api.c b/gcc/jit/internal-api.c
index e3ddc4d..95978bf 100644
--- a/gcc/jit/internal-api.c
+++ b/gcc/jit/internal-api.c
@@ -3650,7 +3650,7 @@ compile ()
 
   /* Call into the rest of gcc.
  For now, we have to assemble command-line options to pass into
- toplev_main, so that they can be parsed. */
+ toplev::main, so that they can be parsed. */
 
   /* Pass in user-provided progname, if any, so that it makes it
  into GCC's progname global, used in various diagnostics. */
@@ -3724,25 +3724,15 @@ compile ()
   ADD_ARG ("-fdump-ipa-all");
 }
 
-  toplev_options toplev_opts;
-  toplev_opts.use_TV_TOTAL = false;
+  toplev toplev (false);
 
-  if (time_report || !quiet_flag  || flag_detailed_statistics)
-    timevar_init ();
-
-  timevar_start (TV_TOTAL);
-
-  toplev_main (num_args, const_cast <char **> (fake_args), toplev_opts);
-  toplev_finalize ();
+  toplev.main (num_args, const_cast <char **> (fake_args));
+  toplev.finalize ();
 
   active_playback_ctxt = NULL;
 
   if (errors_occurred ())
-{
-  timevar_stop (TV_TOTAL);
-  timevar_print (stderr);
-  return NULL;
-}
+return NULL;
 
   if (get_bool_option (GCC_JIT_BOOL_OPTION_DUMP_GENERATED_CODE))
dump_generated_code ();
@@ -3765,8 +3755,6 @@ compile ()
 if (ret)
   {
timevar_pop (TV_ASSEMBLE);
-   timevar_stop (TV_TOTAL);
-   timevar_print (stderr);
return NULL;
   }
   }
@@ -3795,9 +3783,6 @@ compile ()
 timevar_pop (TV_LOAD);
   }
 
-  timevar_stop (TV_TOTAL);
-  timevar_print (stderr);
-
   return result_obj;
 }
 
diff --git a/gcc/main.c b/gcc/main.c
index b893308..4bba041 100644
--- a/gcc/main.c
+++ b/gcc/main.c
@@ -1,5 +1,5 @@
 /* main.c: defines main() for cc1, cc1plus, etc.
-   Copyright (C) 2007-2013 Free Software Foundation, Inc.
+   Copyright (C) 2007-2014 Free Software Foundation, Inc.
 
 This file is part of GCC.
 
@@ -26,15 +26,14 @@ along with GCC; see the file COPYING3.  If not see
 
 int main (int argc, char **argv);
 
-/* We define main() to call toplev_main(), which is defined in toplev.c.
+/* We define main() to call toplev::main(), which is defined in toplev.c.
We do this in a separate file in order to allow the language front-end
to define a different main(), if it so desires.  */
 
 int
 main (int argc, char **argv)
 {
-  toplev_options toplev_opts;
-  toplev_opts.use_TV_TOTAL = true;
+  toplev toplev (true);
 
-  return toplev_main (argc, argv, toplev_opts);
+  return toplev.main (argc, argv);
 }
diff --git a/gcc/toplev.c b/gcc/toplev.c
index f1ac560..5284621 100644
--- a/gcc/toplev.c
+++ b/gcc/toplev.c
@@ -1,5 +1,5 @@
 /* Top level of GCC compilers (cc1, cc1plus, etc.)
-   Copyright (C) 1987-2013 Free

[RFA jit v2 0/2] minor refactorings for reuse

2014-03-19 Thread Tom Tromey

Here's a second revision of my patches to the jit branch to clean up
toplev and timevar uses a bit.  The first revision was here:

http://gcc.gnu.org/ml/gcc-patches/2014-03/msg00895.html

Compared with that revision, this one hopefully includes the
ChangeLog.jit entries; and I took Trevor's suggestion and renamed the
timevar class to auto_timevar.

Tom

[RFA jit v2 2/2] introduce auto_timevar

2014-03-19 Thread Tom Tromey

This introduces a new auto_timevar class.  It pushes a given timevar
in its constructor, and pops it in the destructor, giving a much
simpler way to use timevars in the typical case where they can be
scoped.
---
 gcc/ChangeLog.jit      |  4
 gcc/jit/ChangeLog.jit  |  4
 gcc/jit/internal-api.c | 16 +---
 gcc/timevar.h          | 26 +-
 4 files changed, 38 insertions(+), 12 deletions(-)

diff --git a/gcc/ChangeLog.jit b/gcc/ChangeLog.jit
index c590ab1..ee1df88 100644
--- a/gcc/ChangeLog.jit
+++ b/gcc/ChangeLog.jit
@@ -1,5 +1,9 @@
 2014-03-19  Tom Tromey  tro...@redhat.com
 
+   * timevar.h (auto_timevar): New class.
+
+2014-03-19  Tom Tromey  tro...@redhat.com
+
* diagnostic.c (bt_stop): Use toplev::main.
* main.c (main): Update.
* toplev.c (do_compile): Remove argument.  Don't check
diff --git a/gcc/jit/ChangeLog.jit b/gcc/jit/ChangeLog.jit
index e45d38c..69f2412 100644
--- a/gcc/jit/ChangeLog.jit
+++ b/gcc/jit/ChangeLog.jit
@@ -1,5 +1,9 @@
 2014-03-19  Tom Tromey  tro...@redhat.com
 
+   * internal-api.c (compile): Use auto_timevar.
+
+2014-03-19  Tom Tromey  tro...@redhat.com
+
* internal-api.c (compile): Use toplev, not toplev_options.
Simplify.
 
diff --git a/gcc/jit/internal-api.c b/gcc/jit/internal-api.c
index 95978bf..090d351 100644
--- a/gcc/jit/internal-api.c
+++ b/gcc/jit/internal-api.c
@@ -3737,8 +3737,6 @@ compile ()
   if (get_bool_option (GCC_JIT_BOOL_OPTION_DUMP_GENERATED_CODE))
dump_generated_code ();
 
-  timevar_push (TV_ASSEMBLE);
-
   /* Gross hacks follow:
  We have a .s file; we want a .so file.
  We could reuse parts of gcc/gcc.c to do this.
@@ -3746,6 +3744,8 @@ compile ()
*/
   /* FIXME: totally faking it for now, not even using pex */
   {
+auto_timevar assemble_timevar (TV_ASSEMBLE);
+
 char cmd[1024];
 snprintf (cmd, 1024, "gcc -shared %s -o %s",
   m_path_s_file, m_path_so_file);
@@ -3753,20 +3753,16 @@ compile ()
   printf ("cmd: %s\n", cmd);
 int ret = system (cmd);
 if (ret)
-  {
-   timevar_pop (TV_ASSEMBLE);
-   return NULL;
-  }
+  return NULL;
   }
-  timevar_pop (TV_ASSEMBLE);
 
   // TODO: split out assembles vs linker
 
   /* dlopen the .so file. */
   {
-const char *error;
+auto_timevar load_timevar (TV_LOAD);
 
-timevar_push (TV_LOAD);
+const char *error;
 
 /* Clear any existing error.  */
 dlerror ();
@@ -3779,8 +3775,6 @@ compile ()
   result_obj = new result (handle);
 else
   result_obj = NULL;
-
-timevar_pop (TV_LOAD);
   }
 
   return result_obj;
diff --git a/gcc/timevar.h b/gcc/timevar.h
index dc2a8bc..f018e39 100644
--- a/gcc/timevar.h
+++ b/gcc/timevar.h
@@ -1,5 +1,5 @@
 /* Timing variables for measuring compiler performance.
-   Copyright (C) 2000-2013 Free Software Foundation, Inc.
+   Copyright (C) 2000-2014 Free Software Foundation, Inc.
Contributed by Alex Samuel sam...@codesourcery.com
 
This file is part of GCC.
@@ -110,6 +110,30 @@ timevar_pop (timevar_id_t tv)
 timevar_pop_1 (tv);
 }
 
+// This is a simple timevar wrapper class that pushes a timevar in its
+// constructor and pops the timevar in its destructor.
+class auto_timevar
+{
+ public:
+  auto_timevar (timevar_id_t tv)
+: m_tv (tv)
+  {
+timevar_push (m_tv);
+  }
+
+  ~auto_timevar ()
+  {
+timevar_pop (m_tv);
+  }
+
+ private:
+
+  // Private to disallow copies.
+  auto_timevar (const auto_timevar &);
+
+  timevar_id_t m_tv;
+};
+
 extern void print_time (const char *, long);
 
 #endif /* ! GCC_TIMEVAR_H */
-- 
1.8.5.3
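
The effect of the auto_timevar wrapper on early-return paths can be seen
in a stand-alone sketch.  Everything outside the class is a hypothetical
stand-in: timevar_push and timevar_pop here only record their calls in a
log, the enum values are placeholders, and assemble is a made-up function
with an early return.  The sketch also uses "= delete" to forbid copies,
where the patch declares a private copy constructor.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical stand-ins for the timevar machinery, so that the sketch
// builds on its own.
enum timevar_id { TV_ASSEMBLE, TV_LOAD };

static std::vector<std::string> timevar_log;

static void timevar_push (timevar_id tv)
{
  timevar_log.push_back ("push " + std::to_string (static_cast<int> (tv)));
}

static void timevar_pop (timevar_id tv)
{
  timevar_log.push_back ("pop " + std::to_string (static_cast<int> (tv)));
}

// Pushes the timevar in the constructor and pops it in the destructor.
class auto_timevar
{
 public:
  explicit auto_timevar (timevar_id tv) : m_tv (tv) { timevar_push (m_tv); }
  ~auto_timevar () { timevar_pop (m_tv); }

  // Copies are disallowed so that each push has exactly one pop.
  auto_timevar (const auto_timevar &) = delete;
  auto_timevar &operator= (const auto_timevar &) = delete;

 private:
  timevar_id m_tv;
};

// Returns early on failure; the timevar is popped on every path without
// an explicit timevar_pop call.
static bool assemble (bool fail)
{
  auto_timevar tv (TV_ASSEMBLE);
  if (fail)
    return false;
  return true;
}
```

After a call such as assemble (true), the log holds one push followed by
one pop, which is the pairing that the manual timevar_pop calls removed
by the patch used to maintain by hand.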

Re: [patch testsuite]: g++.dg/abi

2014-03-19 Thread Mike Stump

On Mar 19, 2014, at 9:38 AM, Kai Tietz ktiet...@googlemail.com wrote:
 2014-03-19 17:23 GMT+01:00 Mike Stump mikest...@comcast.net:
 On Mar 18, 2014, at 6:16 AM, Kai Tietz ktiet...@googlemail.com wrote:
 this patch skips anon2.C and anon3.C test for mingw target.  Issue
 here is that weak under pe-coff is different to ELF-targets and
 therefore test doesn't apply for
 
 So, what does the output look like?  There should be a trace of weak of some 
 sort in the output.
 
 No, there is none.

So, does the target support weak?

Re: [PATCH 2/2, AARCH64] Test case changes: Re: [RFC] [PATCH, AARCH64] : Using standard patterns for stack protection.

2014-03-19 Thread Marcus Shawcroft

On 19 March 2014 17:18, Venkataramanan Kumar
venkataramanan.ku...@linaro.org wrote:

> I used the existing dg-require-effective-target check,
> stack_protector and added it in a separate line.
>
> ChangeLog.
>
> 2014-03-19  Venkataramanan Kumar  venkataramanan.ku...@linaro.org
> * g++.dg/fstack-protector-strong.C: Add effective target check for
>   stack protection.
> * gcc.dg/fstack-protector-strong.c: Likewise.
>
> These two tests are passing now for aarch64-none-linux-gnu target under QEMU.


Venkat,

I think this change is reasonable (for stage-1) but I'd like one of
the testsuite maintainers to ACK the change.

Cheers
/Marcus

Re: [patch testsuite]: g++.dg/abi

2014-03-19 Thread Kai Tietz

2014-03-19 17:54 GMT+01:00 Mike Stump mikest...@comcast.net:
> On Mar 19, 2014, at 9:49 AM, Rainer Orth r...@cebitec.uni-bielefeld.de wrote:
>>> The concept of weak - as present in ELF - isn't known in COFF in
>>> general.  There is some weak, but it works only for static library and
>>> in a limited way.  Therefore we can't (and don't) use it for COFF
>>> targets.
>>
>> In that case, it seems far better to have
>> gcc/testsuite/lib/target-support.exp (check_weak_available) reflect that
>> instead of lying about weak support.
>
> Yeah, this is the direction I was headed...  :-)

Ok, I will send a patch for changing target-support.exp.

And yes, target supports a kind of weak, but not the expected gnu-weak.

Thanks,
Kai

[jit] Avoid shadowing progname global

2014-03-19 Thread David Malcolm

Committed to branch dmalcolm/jit:

gcc/jit/
* internal-api.c (gcc::jit::recording::context::add_error_va):
Rename local progname to ctxt_progname to avoid shadowing
the related global, for clarity.
(gcc::jit::playback::context::compile): Likewise.
---
 gcc/jit/ChangeLog.jit  |  7 +++++++
 gcc/jit/internal-api.c | 22 ++++++++++++----------
 2 files changed, 19 insertions(+), 10 deletions(-)

diff --git a/gcc/jit/ChangeLog.jit b/gcc/jit/ChangeLog.jit
index efb1931..265242e 100644
--- a/gcc/jit/ChangeLog.jit
+++ b/gcc/jit/ChangeLog.jit
@@ -1,5 +1,12 @@
 2014-03-19  David Malcolm  dmalc...@redhat.com
 
+   * internal-api.c (gcc::jit::recording::context::add_error_va):
+   Rename local progname to ctxt_progname to avoid shadowing
+   the related global, for clarity.
+   (gcc::jit::playback::context::compile): Likewise.
+
+2014-03-19  David Malcolm  dmalc...@redhat.com
+
* internal-api.c (gcc::jit::recording::memento_of_get_pointer::
accepts_writes_from): Accept writes from pointers, but not arrays.
 
diff --git a/gcc/jit/internal-api.c b/gcc/jit/internal-api.c
index e3ddc4d..819800a 100644
--- a/gcc/jit/internal-api.c
+++ b/gcc/jit/internal-api.c
@@ -610,18 +610,19 @@ recording::context::add_error_va (location *loc, const char *fmt, va_list ap)
   char buf[1024];
   vsnprintf (buf, sizeof (buf) - 1, fmt, ap);
 
-  const char *progname = get_str_option (GCC_JIT_STR_OPTION_PROGNAME);
-  if (!progname)
-    progname = "libgccjit.so";
+  const char *ctxt_progname =
+    get_str_option (GCC_JIT_STR_OPTION_PROGNAME);
+  if (!ctxt_progname)
+    ctxt_progname = "libgccjit.so";
 
   if (loc)
     fprintf (stderr, "%s: %s: error: %s\n",
-             progname,
+             ctxt_progname,
              loc->get_debug_string (),
              buf);
   else
     fprintf (stderr, "%s: error: %s\n",
-             progname,
+             ctxt_progname,
              buf);
 
   if (!m_error_count)
@@ -3629,8 +3630,8 @@ playback::context::
 compile ()
 {
   void *handle = NULL;
+  const char *ctxt_progname;
   result *result_obj = NULL;
-  const char *progname;
   const char *fake_args[20];
   unsigned int num_args;
 
@@ -3652,10 +3653,11 @@ compile ()
  For now, we have to assemble command-line options to pass into
  toplev_main, so that they can be parsed. */
 
-  /* Pass in user-provided progname, if any, so that it makes it
-     into GCC's progname global, used in various diagnostics. */
-  progname = get_str_option (GCC_JIT_STR_OPTION_PROGNAME);
-  fake_args[0] = progname ? progname : "libgccjit.so";
+  /* Pass in user-provided program name as argv0, if any, so that it
+     makes it into GCC's progname global, used in various diagnostics. */
+  ctxt_progname = get_str_option (GCC_JIT_STR_OPTION_PROGNAME);
+  fake_args[0] =
+    (ctxt_progname ? ctxt_progname : "libgccjit.so");
 
   fake_args[1] = m_path_c_file;
   num_args = 2;
-- 
1.8.5.3

Re: [RFA jit v2 1/2] introduce class toplev

2014-03-19 Thread Tom Tromey

>>>>> "Tom" == Tom Tromey tro...@redhat.com writes:

Tom> This patch introduces a new class toplev and changes toplev_main and
Tom> toplev_finalize to be methods of this class.  Additionally, now the
Tom> timevars are automatically stopped when the object is destroyed.  This
Tom> cleans up compile a bit and makes it simpler to reuse the toplev
Tom> logic in other code.

David asked me off-list to rename the field in class toplev, so here's a
new patch that does this.

Tom

commit 66f92863ef55c26f673d02dd39027f340940a3bf
Author: Tom Tromey tro...@redhat.com
Date:   Tue Mar 18 08:07:40 2014 -0600

introduce class toplev

This patch introduces a new class toplev and changes toplev_main and
toplev_finalize to be methods of this class.  Additionally, now the
timevars are automatically stopped when the object is destroyed.  This
cleans up compile a bit and makes it simpler to reuse the toplev
logic in other code.

diff --git a/gcc/ChangeLog.jit b/gcc/ChangeLog.jit
index 77ac44c..c590ab1 100644
--- a/gcc/ChangeLog.jit
+++ b/gcc/ChangeLog.jit
@@ -1,3 +1,17 @@
+2014-03-19  Tom Tromey  tro...@redhat.com
+
+   * diagnostic.c (bt_stop): Use toplev::main.
+   * main.c (main): Update.
+   * toplev.c (do_compile): Remove argument.  Don't check
+   use_TV_TOTAL.
+   (toplev::toplev, toplev::~toplev, toplev::start_timevars): New
+   functions.
+   (toplev::main): Rename from toplev_main.  Update.
+   (toplev::finalize): Rename from toplev_finalize.  Update.
+   * toplev.h (class toplev): New.
+   (struct toplev_options): Remove.
+   (toplev_main, toplev_finalize): Don't declare.
+
 2014-03-11  David Malcolm  dmalc...@redhat.com
 
* gcse.c (gcse_c_finalize): New, to clear test_insn between
diff --git a/gcc/diagnostic.c b/gcc/diagnostic.c
index 36094a1..56dc3ac 100644
--- a/gcc/diagnostic.c
+++ b/gcc/diagnostic.c
@@ -333,7 +333,7 @@ diagnostic_show_locus (diagnostic_context * context,
 static const char * const bt_stop[] =
 {
   "main",
-  "toplev_main",
+  "toplev::main",
   "execute_one_pass",
   "compile_file",
 };
diff --git a/gcc/jit/ChangeLog.jit b/gcc/jit/ChangeLog.jit
index efb1931..e45d38c 100644
--- a/gcc/jit/ChangeLog.jit
+++ b/gcc/jit/ChangeLog.jit
@@ -1,3 +1,8 @@
+2014-03-19  Tom Tromey  tro...@redhat.com
+
+   * internal-api.c (compile): Use toplev, not toplev_options.
+   Simplify.
+
 2014-03-19  David Malcolm  dmalc...@redhat.com
 
* internal-api.c (gcc::jit::recording::memento_of_get_pointer::
diff --git a/gcc/jit/internal-api.c b/gcc/jit/internal-api.c
index e3ddc4d..95978bf 100644
--- a/gcc/jit/internal-api.c
+++ b/gcc/jit/internal-api.c
@@ -3650,7 +3650,7 @@ compile ()
 
   /* Call into the rest of gcc.
  For now, we have to assemble command-line options to pass into
- toplev_main, so that they can be parsed. */
+ toplev::main, so that they can be parsed. */
 
   /* Pass in user-provided progname, if any, so that it makes it
  into GCC's progname global, used in various diagnostics. */
@@ -3724,25 +3724,15 @@ compile ()
       ADD_ARG ("-fdump-ipa-all");
 }
 
-  toplev_options toplev_opts;
-  toplev_opts.use_TV_TOTAL = false;
+  toplev toplev (false);
 
-  if (time_report || !quiet_flag  || flag_detailed_statistics)
-timevar_init ();
-
-  timevar_start (TV_TOTAL);
-
-  toplev_main (num_args, const_cast <char **> (fake_args), toplev_opts);
-  toplev_finalize ();
+  toplev.main (num_args, const_cast <char **> (fake_args));
+  toplev.finalize ();
 
   active_playback_ctxt = NULL;
 
   if (errors_occurred ())
-{
-  timevar_stop (TV_TOTAL);
-  timevar_print (stderr);
-  return NULL;
-}
+return NULL;
 
   if (get_bool_option (GCC_JIT_BOOL_OPTION_DUMP_GENERATED_CODE))
dump_generated_code ();
@@ -3765,8 +3755,6 @@ compile ()
 if (ret)
   {
timevar_pop (TV_ASSEMBLE);
-   timevar_stop (TV_TOTAL);
-   timevar_print (stderr);
return NULL;
   }
   }
@@ -3795,9 +3783,6 @@ compile ()
 timevar_pop (TV_LOAD);
   }
 
-  timevar_stop (TV_TOTAL);
-  timevar_print (stderr);
-
   return result_obj;
 }
 
diff --git a/gcc/main.c b/gcc/main.c
index b893308..4bba041 100644
--- a/gcc/main.c
+++ b/gcc/main.c
@@ -1,5 +1,5 @@
 /* main.c: defines main() for cc1, cc1plus, etc.
-   Copyright (C) 2007-2013 Free Software Foundation, Inc.
+   Copyright (C) 2007-2014 Free Software Foundation, Inc.
 
 This file is part of GCC.
 
@@ -26,15 +26,14 @@ along with GCC; see the file COPYING3.  If not see
 
 int main (int argc, char **argv);
 
-/* We define main() to call toplev_main(), which is defined in toplev.c.
+/* We define main() to call toplev::main(), which is defined in toplev.c.
We do this in a separate file in order to allow the language front-end
to define a different main(), if it so desires.  */
 
 int
 main (int argc, char **argv)
 {
-  toplev_options toplev_opts;
-  toplev_opts.use_TV_TOTAL = true;
+  toplev toplev (true);
 
-  return toplev_main (argc, argv,

[PATCH, ARM] Optimise NotDI AND/OR ZeroExtendSI for ARMv7A

2014-03-19 Thread Ian Bolton

This is a follow-on patch to one already committed:
http://gcc.gnu.org/ml/gcc-patches/2014-02/msg01128.html

It implements patterns to simplify our RTL as follows:

OR (Not:DI (A:DI), ZeroExtend:DI (B:SI))
  -->  the top half can be done with a MVN

AND (Not:DI (A:DI), ZeroExtend:DI (B:SI))
  -->  the top half becomes zero.

I've added test cases for both of these and also the existing
anddi_notdi patterns.  The tests all pass.

Full regression runs passed.

OK for stage 1?

Cheers,
Ian


2014-03-19  Ian Bolton  ian.bol...@arm.com

gcc/
* config/arm/arm.md (*anddi_notdi_zesidi): New pattern.
* config/arm/thumb2.md (*iordi_notdi_zesidi): New pattern.

testsuite/
* gcc.target/arm/anddi_notdi-1.c: New test.
* gcc.target/arm/iordi_notdi-1.c: New test case.
diff --git a/gcc/config/arm/arm.md b/gcc/config/arm/arm.md
index 2ddda02..d2d85ee 100644
--- a/gcc/config/arm/arm.md
+++ b/gcc/config/arm/arm.md
@@ -2962,6 +2962,28 @@
    (set_attr "type" "multiple")]
 )
 
+(define_insn_and_split "*anddi_notdi_zesidi"
+  [(set (match_operand:DI 0 "s_register_operand" "=r,r")
+        (and:DI (not:DI (match_operand:DI 2 "s_register_operand" "0,?r"))
+                (zero_extend:DI
+                 (match_operand:SI 1 "s_register_operand" "r,r"))))]
+  "TARGET_32BIT"
+  "#"
+  "TARGET_32BIT && reload_completed"
+  [(set (match_dup 0) (and:SI (not:SI (match_dup 2)) (match_dup 1)))
+   (set (match_dup 3) (const_int 0))]
+  "
+  {
+    operands[3] = gen_highpart (SImode, operands[0]);
+    operands[0] = gen_lowpart (SImode, operands[0]);
+    operands[2] = gen_lowpart (SImode, operands[2]);
+  }"
+  [(set_attr "length" "8")
+   (set_attr "predicable" "yes")
+   (set_attr "predicable_short_it" "no")
+   (set_attr "type" "multiple")]
+)
+
 (define_insn_and_split "*anddi_notsesidi_di"
   [(set (match_operand:DI 0 "s_register_operand" "=r,r")
         (and:DI (not:DI (sign_extend:DI
diff --git a/gcc/config/arm/thumb2.md b/gcc/config/arm/thumb2.md
index 467c619..10bc8b1 100644
--- a/gcc/config/arm/thumb2.md
+++ b/gcc/config/arm/thumb2.md
@@ -1418,6 +1418,30 @@
    (set_attr "type" "multiple")]
 )
 
+(define_insn_and_split "*iordi_notdi_zesidi"
+  [(set (match_operand:DI 0 "s_register_operand" "=r,r")
+        (ior:DI (not:DI (match_operand:DI 2 "s_register_operand" "0,?r"))
+                (zero_extend:DI
+                 (match_operand:SI 1 "s_register_operand" "r,r"))))]
+  "TARGET_THUMB2"
+  "#"
+  "TARGET_THUMB2 && reload_completed"
+  [(set (match_dup 0) (ior:SI (not:SI (match_dup 2)) (match_dup 1)))
+   (set (match_dup 3) (not:SI (match_dup 4)))]
+  "
+  {
+    operands[3] = gen_highpart (SImode, operands[0]);
+    operands[0] = gen_lowpart (SImode, operands[0]);
+    operands[1] = gen_lowpart (SImode, operands[1]);
+    operands[4] = gen_highpart (SImode, operands[2]);
+    operands[2] = gen_lowpart (SImode, operands[2]);
+  }"
+  [(set_attr "length" "8")
+   (set_attr "predicable" "yes")
+   (set_attr "predicable_short_it" "no")
+   (set_attr "type" "multiple")]
+)
+
 (define_insn_and_split "*iordi_notsesidi_di"
   [(set (match_operand:DI 0 "s_register_operand" "=r,r")
         (ior:DI (not:DI (sign_extend:DI
diff --git a/gcc/testsuite/gcc.target/arm/anddi_notdi-1.c b/gcc/testsuite/gcc.target/arm/anddi_notdi-1.c
new file mode 100644
index 0000000..cfb33fc
--- /dev/null
+++ b/gcc/testsuite/gcc.target/arm/anddi_notdi-1.c
@@ -0,0 +1,65 @@
+/* { dg-do run } */
+/* { dg-options "-O2 -fno-inline --save-temps" } */
+
+extern void abort (void);
+
+typedef long long s64int;
+typedef int s32int;
+typedef unsigned long long u64int;
+typedef unsigned int u32int;
+
+s64int
+anddi_di_notdi (s64int a, s64int b)
+{
+  return (a & ~b);
+}
+
+s64int
+anddi_di_notzesidi (s64int a, u32int b)
+{
+  return (a & ~(u64int) b);
+}
+
+s64int
+anddi_notdi_zesidi (s64int a, u32int b)
+{
+  return (~a & (u64int) b);
+}
+
+s64int
+anddi_di_notsesidi (s64int a, s32int b)
+{
+  return (a & ~(s64int) b);
+}
+
+int main ()
+{
+  s64int a64 = 0xdeadbeefll;
+  s64int b64 = 0x5f470112ll;
+  s64int c64 = 0xdeadbeef300fll;
+
+  u32int c32 = 0x01124f4f;
+  s32int d32 = 0xabbaface;
+
+  s64int z = anddi_di_notdi (c64, b64);
+  if (z != 0xdeadbeef2008ll)
+abort ();
+
+  z = anddi_di_notzesidi (a64, c32);
+  if (z != 0xdeadbeefb0b0ll)
+abort ();
+
+  z = anddi_notdi_zesidi (c64, c32);
+  if (z != 0x01104f4fll)
+abort ();
+
+  z = anddi_di_notsesidi (a64, d32);
+  if (z != 0x0531ll)
+abort ();
+
+  return 0;
+}
+
+/* { dg-final { scan-assembler-times "bic\t" 6 } } */
+
+/* { dg-final { cleanup-saved-temps } } */
diff --git a/gcc/testsuite/gcc.target/arm/iordi_notdi-1.c b/gcc/testsuite/gcc.target/arm/iordi_notdi-1.c
index cda9c0e..249f080 100644
--- a/gcc/testsuite/gcc.target/arm/iordi_notdi-1.c
+++ b/gcc/testsuite/gcc.target/arm/iordi_notdi-1.c
@@ -9,19 +9,25 @@ typedef unsigned long long u64int;
 typedef unsigned int u32int;
 
 s64int
-iordi_notdi (s64int a, s64int b)
+iordi_di_notdi (s64int a, s64int b)
 {
   return (a | ~b);
 }
 
 s64int
-iordi_notzesidi (s64int a, u32int b)

Re: [PATCH] Fix PR60505

2014-03-19 Thread Cong Hou

On Tue, Mar 18, 2014 at 4:43 AM, Richard Biener rguent...@suse.de wrote:

 On Mon, 17 Mar 2014, Cong Hou wrote:

  On Mon, Mar 17, 2014 at 6:44 AM, Richard Biener rguent...@suse.de wrote:
   On Fri, 14 Mar 2014, Cong Hou wrote:
  
   On Fri, Mar 14, 2014 at 12:58 AM, Richard Biener rguent...@suse.de 
   wrote:
On Fri, 14 Mar 2014, Jakub Jelinek wrote:
   
On Fri, Mar 14, 2014 at 08:52:07AM +0100, Richard Biener wrote:
  Consider this fact and if there are alias checks, we can safely 
  remove
  the epilogue if the maximum trip count of the loop is less than or
  equal to the calculated threshold.

 You have to consider n % vf != 0, so an argument on only maximum
 trip count or threshold cannot work.
   
Well, if you only check if maximum trip count is <= vf and you know
that for n < vf the vectorized loop + its epilogue path will not be taken,
then perhaps you could, but it is a very special case.
Now, the question is when we are guaranteed we enter the scalar versioned
loop instead for n < vf, is that in case of versioning for alias or
versioning for alignment?
   
I think neither - I have plans to do the cost model check together
with the versioning condition but didn't get around to implement that.
That would allow stronger max bounds for the epilogue loop.
  
   In vect_transform_loop(), check_profitability will be set to true if
   th >= VF-1 and the number of iterations is unknown (we only consider
   unknown trip count here), where th is calculated based on the
   parameter PARAM_MIN_VECT_LOOP_BOUND and cost model, with the minimum
   value VF-1. If the loop needs to be versioned, then
   check_profitability with true value will be passed to
   vect_loop_versioning(), in which an enhanced loop bound check
   (considering cost) will be built. So I think if the loop is versioned
   and n < VF, then we must enter the scalar version, and in this case
   removing epilogue should be safe when the maximum trip count <= th+1.
  
   You mean exactly in the case where the profitability check ensures
   that n % vf == 0?  Thus effectively if n == maximum trip count?
   That's quite a special case, no?
 
 
  Yes, it is a special case. But it is in this special case that those
  warnings are thrown out. Also, I think declaring an array with VF*N as
  length is not unusual.

 Ok, but then for the patch compute the cost model threshold once
 in vect_analyze_loop_2 and store it in a new
 LOOP_VINFO_COST_MODEL_THRESHOLD.


Done.


 Also you have to check
 the return value from max_stmt_executions_int as that may return
 -1 if the number cannot be computed (or isn't representable in
 a HOST_WIDE_INT).


It will be converted to unsigned type so that -1 means infinity.


 You also should check for
 LOOP_REQUIRES_VERSIONING_FOR_ALIGNMENT which should have the
 same effect on the cost model check.


Done.




 The existing condition is already complicated enough - adding new
 stuff warrants comments before the (sub-)checks.


OK. Comments added.

Below is the revised patch. Bootstrapped and tested on a x86-64 machine.


Cong



diff --git a/gcc/ChangeLog b/gcc/ChangeLog
index e1d8666..eceefb3 100644
--- a/gcc/ChangeLog
+++ b/gcc/ChangeLog
@@ -1,3 +1,18 @@
+2014-03-11  Cong Hou  co...@google.com
+
+ PR tree-optimization/60505
+ * tree-vectorizer.h (struct _stmt_vec_info): Add th field as the
+ threshold of number of iterations below which no vectorization will be
+ done.
+ * tree-vect-loop.c (new_loop_vec_info):
+ Initialize LOOP_VINFO_COST_MODEL_THRESHOLD.
+ * tree-vect-loop.c (vect_analyze_loop_operations):
+ Set LOOP_VINFO_COST_MODEL_THRESHOLD.
+ * tree-vect-loop.c (vect_transform_loop):
+ Use LOOP_VINFO_COST_MODEL_THRESHOLD.
+ * tree-vect-loop.c (vect_analyze_loop_2): Check the maximum number
+ of iterations of the loop and see if we should build the epilogue.
+
 2014-03-10  Jakub Jelinek  ja...@redhat.com

  PR ipa/60457
diff --git a/gcc/testsuite/ChangeLog b/gcc/testsuite/ChangeLog
index 41b6875..09ec1c0 100644
--- a/gcc/testsuite/ChangeLog
+++ b/gcc/testsuite/ChangeLog
@@ -1,3 +1,8 @@
+2014-03-11  Cong Hou  co...@google.com
+
+ PR tree-optimization/60505
+ * gcc.dg/vect/pr60505.c: New test.
+
 2014-03-10  Jakub Jelinek  ja...@redhat.com

  PR ipa/60457
diff --git a/gcc/testsuite/gcc.dg/vect/pr60505.c b/gcc/testsuite/gcc.dg/vect/pr60505.c
new file mode 100644
index 0000000..6940513
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/vect/pr60505.c
@@ -0,0 +1,12 @@
+/* { dg-do compile } */
+/* { dg-additional-options "-Wall -Werror" } */
+
+void foo(char *in, char *out, int num)
+{
+  int i;
+  char ovec[16] = {0};
+
+  for(i = 0; i < num ; ++i)
+out[i] = (ovec[i] = in[i]);
+  out[num] = ovec[num/2];
+}
diff --git a/gcc/tree-vect-loop.c b/gcc/tree-vect-loop.c
index df6ab6f..1c78e11 100644
--- a/gcc/tree-vect-loop.c
+++ b/gcc/tree-vect-loop.c
@@ -933,6 +933,7 @@ new_loop_vec_info (struct loop *loop)
   LOOP_VINFO_NITERS (res) = NULL;

[4.8, PATCH 0/26] Backport Power8 and LE support

2014-03-19 Thread Bill Schmidt

Hi,

Support for Power8 features and the new powerpc64le-linux-gnu target,
including the ELFv2 ABI, has been developed up till now on the
ibm/gcc-4_8-branch.  It was appropriate to use this separate branch
while the support was unstable, but this branch will not represent a
particularly good support mechanism for distributions going forward.
Most distros are set up to pull from the major release branches, and
having a separate branch for one target is quite inconvenient.  Also,
the ibm/gcc-4_8-branch's original purpose is to serve as the code base
for IBM's Advance Toolchain 7.0.  Over time the two purposes that the
branch currently serves will diverge and make things even more
complicated.

The code is now tested and stable enough that we are ready to backport
this support to the FSF 4.8 branch.  This patch series constitutes that
backport.

Almost all of the changes are specific to PowerPC portions of the code,
and for those patches I am only CCing David.  However, some of the
patches require changes to common code, and for these I will CC Richard
and Jakub.  Three of these are slightly unrelated but necessary patches,
one to enable decimal float ABS builtins, and two others to fix PR54537
and PR56843.  In addition there are patches that update configuration
files throughout for the new target, and some small changes in common
call support (call.c, expr.h, function.c) to support how the new ABI
handles calls.

I realize it is unusual to backport such a large amount of code, but we
have been asked by distribution partners to do this, and we feel it
makes good sense for long-term support.

I have tested the patch series by applying it to a clean FSF 4.8 branch
and comparing the test results against those from the IBM 4.8 branch on
three systems:
 * Power8, little endian (--mcpu=power8)
 * Power8, big endian (--mcpu=power8)
 * Power7, big endian (--mcpu=power7)

I also checked a recursive diff against the two source directories to
ensure that no patches were missed.

Thanks,
Bill

[ 1/26] diff-p8
[ 2/26] diff-p8-htm
[ 3/26] diff-le-config
[ 4/26] diff-le-libtool
[ 5/26] diff-le-tests
[ 6/26] diff-le-dfp
[ 7/26] diff-le-vector
[ 8/26] diff-abi-compat
[ 9/26] diff-abi-calls
[10/26] diff-abi-elfv2
[11/26] diff-abi-gotest
[12/26] diff-le-align
[13/26] diff-abi-libffi
[14/26] diff-dfp-abs
[15/26] diff-pr54537
[16/26] diff-pr56843
[17/26] diff-direct-move
[18/26] diff-le-config-2
[19/26] diff-quad-memory
[20/26] diff-lra
[21/26] diff-le-vector-api
[22/26] diff-mcall
[23/26] diff-pr60137-pr60203
[24/26] diff-reload
[25/26] diff-v1ti
[26/26] diff-trunk-missing

[4.8, PATCH 3/26] Backport Power8 and LE support: Configury bits 1

2014-03-19 Thread Bill Schmidt

Hi,

This patch (diff-le-config) backports updates to more recent
config.guess and config.sub versions to support the new powerpc64le
target.

Thanks,
Bill


2014-03-29  Bill Schmidt  wschm...@linux.vnet.ibm.com

Backport from mainline r203071:

2013-10-01  Joern Rennecke  joern.renne...@embecosm.com

Import from savannah.gnu.org:
* config.guess: Update to 2013-06-10 version.
* config.sub: Update to 2013-10-01 version.


Index: gcc-4_8-branch/config.guess
===
--- gcc-4_8-branch.orig/config.guess 2013-12-28 17:41:32.765630566 +0100
+++ gcc-4_8-branch/config.guess 2013-12-28 17:50:37.995329461 +0100
@@ -1,10 +1,8 @@
 #! /bin/sh
 # Attempt to guess a canonical system name.
-#   Copyright (C) 1992, 1993, 1994, 1995, 1996, 1997, 1998, 1999,
-#   2000, 2001, 2002, 2003, 2004, 2005, 2006, 2007, 2008, 2009, 2010,
-#   2011, 2012, 2013 Free Software Foundation, Inc.
+#   Copyright 1992-2013 Free Software Foundation, Inc.
 
-timestamp='2012-12-30'
+timestamp='2013-06-10'
 
 # This file is free software; you can redistribute it and/or modify it
 # under the terms of the GNU General Public License as published by
@@ -52,9 +50,7 @@ version=\
 GNU config.guess ($timestamp)
 
 Originally written by Per Bothner.
-Copyright (C) 1992, 1993, 1994, 1995, 1996, 1997, 1998, 1999, 2000,
-2001, 2002, 2003, 2004, 2005, 2006, 2007, 2008, 2009, 2010, 2011,
-2012, 2013 Free Software Foundation, Inc.
+Copyright 1992-2013 Free Software Foundation, Inc.
 
 This is free software; see the source for copying conditions.  There is NO
 warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.
@@ -136,6 +132,27 @@ UNAME_RELEASE=`(uname -r) 2>/dev/null` |
 UNAME_SYSTEM=`(uname -s) 2>/dev/null`  || UNAME_SYSTEM=unknown
 UNAME_VERSION=`(uname -v) 2>/dev/null` || UNAME_VERSION=unknown
 
+case ${UNAME_SYSTEM} in
+Linux|GNU|GNU/*)
+   # If the system lacks a compiler, then just pick glibc.
+   # We could probably try harder.
+   LIBC=gnu
+
+   eval $set_cc_for_build
+	cat <<-EOF > $dummy.c
+	#include <features.h>
+   #if defined(__UCLIBC__)
+   LIBC=uclibc
+   #elif defined(__dietlibc__)
+   LIBC=dietlibc
+   #else
+   LIBC=gnu
+   #endif
+   EOF
+	eval `$CC_FOR_BUILD -E $dummy.c 2>/dev/null | grep '^LIBC'`
+   ;;
+esac
+
 # Note: order is significant - the case branches are not exclusive.
 
 case ${UNAME_MACHINE}:${UNAME_SYSTEM}:${UNAME_RELEASE}:${UNAME_VERSION} in
@@ -857,21 +874,21 @@ EOF
exit ;;
 *:GNU:*:*)
# the GNU system
-   echo `echo ${UNAME_MACHINE}|sed -e 's,[-/].*$,,'`-unknown-gnu`echo ${UNAME_RELEASE}|sed -e 's,/.*$,,'`
+   echo `echo ${UNAME_MACHINE}|sed -e 's,[-/].*$,,'`-unknown-${LIBC}`echo ${UNAME_RELEASE}|sed -e 's,/.*$,,'`
    exit ;;
     *:GNU/*:*:*)
    # other systems with GNU libc and userland
-   echo ${UNAME_MACHINE}-unknown-`echo ${UNAME_SYSTEM} | sed 's,^[^/]*/,,' | tr '[A-Z]' '[a-z]'``echo ${UNAME_RELEASE}|sed -e 's/[-(].*//'`-gnu
+   echo ${UNAME_MACHINE}-unknown-`echo ${UNAME_SYSTEM} | sed 's,^[^/]*/,,' | tr '[A-Z]' '[a-z]'``echo ${UNAME_RELEASE}|sed -e 's/[-(].*//'`-${LIBC}
exit ;;
 i*86:Minix:*:*)
echo ${UNAME_MACHINE}-pc-minix
exit ;;
 aarch64:Linux:*:*)
-   echo ${UNAME_MACHINE}-unknown-linux-gnu
+   echo ${UNAME_MACHINE}-unknown-linux-${LIBC}
exit ;;
 aarch64_be:Linux:*:*)
UNAME_MACHINE=aarch64_be
-   echo ${UNAME_MACHINE}-unknown-linux-gnu
+   echo ${UNAME_MACHINE}-unknown-linux-${LIBC}
exit ;;
 alpha:Linux:*:*)
    case `sed -n '/^cpu model/s/^.*: \(.*\)/\1/p' < /proc/cpuinfo` in
@@ -884,59 +901,54 @@ EOF
  EV68*) UNAME_MACHINE=alphaev68 ;;
esac
objdump --private-headers /bin/sh | grep -q ld.so.1
-   if test $? = 0 ; then LIBC=libc1 ; else LIBC= ; fi
-   echo ${UNAME_MACHINE}-unknown-linux-gnu${LIBC}
+   if test $? = 0 ; then LIBC=gnulibc1 ; fi
+   echo ${UNAME_MACHINE}-unknown-linux-${LIBC}
+   exit ;;
+arc:Linux:*:* | arceb:Linux:*:*)
+   echo ${UNAME_MACHINE}-unknown-linux-${LIBC}
exit ;;
 arm*:Linux:*:*)
eval $set_cc_for_build
    if echo __ARM_EABI__ | $CC_FOR_BUILD -E - 2>/dev/null \
| grep -q __ARM_EABI__
then
-   echo ${UNAME_MACHINE}-unknown-linux-gnu
+   echo ${UNAME_MACHINE}-unknown-linux-${LIBC}
else
        if echo __ARM_PCS_VFP | $CC_FOR_BUILD -E - 2>/dev/null \
| grep -q __ARM_PCS_VFP
then
-   echo ${UNAME_MACHINE}-unknown-linux-gnueabi
+   echo ${UNAME_MACHINE}-unknown-linux-${LIBC}eabi
else
-   echo ${UNAME_MACHINE}-unknown-linux-gnueabihf
+   echo ${UNAME_MACHINE}-unknown-linux-${LIBC}eabihf
fi
fi
exit ;;
 avr32*:Linux:*:*)
-   echo

[4.8, PATCH 5/26] Backport Power8 and LE support: Test adjustments

2014-03-19 Thread Bill Schmidt

Hi,

This patch (diff-le-tests) backports adjustments to a few tests for
powerpc64le and the ELFv2 ABI.

Thanks,
Bill


2014-03-29  Bill Schmidt  wschm...@linux.vnet.ibm.com

Backport from mainline
2013-11-27  Bill Schmidt  wschm...@linux.vnet.ibm.com

* gfortran.dg/nan_7.f90: Disable for little endian PowerPC.

Backport from mainline r205106:

2013-11-20  Ulrich Weigand  ulrich.weig...@de.ibm.com

* gcc.target/powerpc/darwin-longlong.c (msw): Make endian-safe.

Backport from mainline r205046:

2013-11-19  Ulrich Weigand  ulrich.weig...@de.ibm.com

* gcc.target/powerpc/ppc64-abi-2.c (MAKE_SLOT): New macro to
construct parameter slot value in endian-independent way.
(fcevv, fciievv, fcvevv): Use it.


Index: gcc-4_8-branch/gcc/testsuite/gcc.target/powerpc/ppc64-abi-2.c
===
--- gcc-4_8-branch.orig/gcc/testsuite/gcc.target/powerpc/ppc64-abi-2.c 2013-12-28 17:41:32.430628909 +0100
+++ gcc-4_8-branch/gcc/testsuite/gcc.target/powerpc/ppc64-abi-2.c 2013-12-28 17:50:39.655337721 +0100
@@ -119,6 +119,12 @@ typedef union
   vector int v;
 } vector_int_t;

+#ifdef __LITTLE_ENDIAN__
+#define MAKE_SLOT(x, y) ((long)x | ((long)y << 32))
+#else
+#define MAKE_SLOT(x, y) ((long)y | ((long)x << 32))
+#endif
+
 /* Paramter passing.
s : gpr 3
v : vpr 2
@@ -226,8 +232,8 @@ fcevv (char *s, ...)
   sp = __builtin_frame_address(0);
   sp = sp->backchain;
   
-  if (sp->slot[2].l != 0x100000002ULL
-      || sp->slot[4].l != 0x500000006ULL)
+  if (sp->slot[2].l != MAKE_SLOT (1, 2)
+      || sp->slot[4].l !=  MAKE_SLOT (5, 6))
 abort();
 }

@@ -268,8 +274,8 @@ fciievv (char *s, int i, int j, ...)
   sp = __builtin_frame_address(0);
   sp = sp->backchain;
   
-  if (sp->slot[4].l != 0x100000002ULL
-      || sp->slot[6].l != 0x500000006ULL)
+  if (sp->slot[4].l != MAKE_SLOT (1, 2)
+      || sp->slot[6].l !=  MAKE_SLOT (5, 6))
 abort();
 }

@@ -296,8 +302,8 @@ fcvevv (char *s, vector int x, ...)
   sp = __builtin_frame_address(0);
   sp = sp->backchain;
   
-  if (sp->slot[4].l != 0x100000002ULL
-      || sp->slot[6].l != 0x500000006ULL)
+  if (sp->slot[4].l != MAKE_SLOT (1, 2)
+      || sp->slot[6].l !=  MAKE_SLOT (5, 6))
 abort();
 }

Index: gcc-4_8-branch/gcc/testsuite/gcc.target/powerpc/darwin-longlong.c
===
--- gcc-4_8-branch.orig/gcc/testsuite/gcc.target/powerpc/darwin-longlong.c 2013-12-28 17:41:32.430628909 +0100
+++ gcc-4_8-branch/gcc/testsuite/gcc.target/powerpc/darwin-longlong.c 2013-12-28 17:50:39.659337741 +0100
@@ -11,7 +11,11 @@ int  msw(long long in)
 int  i[2];
   } ud;
   ud.ll = in;
+#ifdef __LITTLE_ENDIAN__
+  return ud.i[1];
+#else
   return ud.i[0];
+#endif
 }

 int main()
Index: gcc-4_8-branch/gcc/testsuite/gfortran.dg/nan_7.f90
===
--- gcc-4_8-branch.orig/gcc/testsuite/gfortran.dg/nan_7.f90 2013-12-28 17:41:32.430628909 +0100
+++ gcc-4_8-branch/gcc/testsuite/gfortran.dg/nan_7.f90 2013-12-28 17:50:39.662337756 +0100
@@ -2,6 +2,7 @@
 ! { dg-options "-fno-range-check" }
 ! { dg-require-effective-target fortran_real_16 }
 ! { dg-require-effective-target fortran_integer_16 }
+! { dg-skip-if "" { powerpc*le-*-* } { "*" } { "" } }
 ! PR47293 NAN not correctly read
 character(len=200) :: str
 real(16) :: r

[4.8, PATCH 8/26] Backport Power8 and LE support: PR57949

2014-03-19 Thread Bill Schmidt

Hi,

This patch (diff-abi-compat) backports the ABI compatibility fix for
PR57949.

Thanks,
Bill


[gcc]

2014-03-29  Bill Schmidt  wschm...@linux.vnet.ibm.com

Backport from mainline r201750.
2013-11-15  Ulrich Weigand  ulrich.weig...@de.ibm.com
Note: Default setting of -mcompat-align-parm inverted!

2013-08-14  Bill Schmidt  wschm...@linux.vnet.ibm.com

PR target/57949
* doc/invoke.texi: Add documentation of mcompat-align-parm
option.
* config/rs6000/rs6000.opt: Add mcompat-align-parm option.
* config/rs6000/rs6000.c (rs6000_function_arg_boundary): For AIX
and Linux, correct BLKmode alignment when 128-bit alignment is
required and compatibility flag is not set.
(rs6000_gimplify_va_arg): For AIX and Linux, honor specified
alignment for zero-size arguments when compatibility flag is not
set.

[gcc/testsuite]

2014-03-29  Bill Schmidt  wschm...@linux.vnet.ibm.com

Backport from mainline r201750.
2013-11-15  Ulrich Weigand  ulrich.weig...@de.ibm.com
Note: Default setting of -mcompat-align-parm inverted!

2013-08-14  Bill Schmidt  wschm...@linux.vnet.ibm.com

PR target/57949
* gcc.target/powerpc/pr57949-1.c: New.
* gcc.target/powerpc/pr57949-2.c: New.


Index: gcc-4_8-test/gcc/config/rs6000/rs6000.c
===
--- gcc-4_8-test.orig/gcc/config/rs6000/rs6000.c
+++ gcc-4_8-test/gcc/config/rs6000/rs6000.c
@@ -8680,8 +8680,8 @@ rs6000_function_arg_boundary (enum machi
       || (type && TREE_CODE (type) == VECTOR_TYPE
           && int_size_in_bytes (type) >= 16))
     return 128;
-  else if (TARGET_MACHO
-           && rs6000_darwin64_abi
+  else if (((TARGET_MACHO && rs6000_darwin64_abi)
+            || (DEFAULT_ABI == ABI_AIX && !rs6000_compat_align_parm))
            && mode == BLKmode
            && type && TYPE_ALIGN (type) > 64)
     return 128;
@@ -10233,8 +10233,9 @@ rs6000_gimplify_va_arg (tree valist, tre
  We don't need to check for pass-by-reference because of the test above.
      We can return a simplifed answer, since we know there's no offset to add.  */
 
-  if (TARGET_MACHO
-      && rs6000_darwin64_abi 
+  if (((TARGET_MACHO
+        && rs6000_darwin64_abi)
+       || (DEFAULT_ABI == ABI_AIX && !rs6000_compat_align_parm))
       && integer_zerop (TYPE_SIZE (type)))
 {
   unsigned HOST_WIDE_INT align, boundary;
Index: gcc-4_8-test/gcc/config/rs6000/rs6000.opt
===
--- gcc-4_8-test.orig/gcc/config/rs6000/rs6000.opt
+++ gcc-4_8-test/gcc/config/rs6000/rs6000.opt
@@ -550,6 +550,10 @@ mquad-memory
 Target Report Mask(QUAD_MEMORY) Var(rs6000_isa_flags)
 Generate the quad word memory instructions (lq/stq/lqarx/stqcx).
 
+mcompat-align-parm
+Target Report Var(rs6000_compat_align_parm) Init(1) Save
+Generate aggregate parameter passing code with at most 64-bit alignment.
+
 mupper-regs-df
 Target Undocumented Mask(UPPER_REGS_DF) Var(rs6000_isa_flags)
 Allow double variables in upper registers with -mcpu=power7 or -mvsx
Index: gcc-4_8-test/gcc/doc/invoke.texi
===
--- gcc-4_8-test.orig/gcc/doc/invoke.texi
+++ gcc-4_8-test/gcc/doc/invoke.texi
@@ -17243,7 +17243,8 @@ following options:
 -mpopcntb -mpopcntd  -mpowerpc64 @gol
 -mpowerpc-gpopt  -mpowerpc-gfxopt  -msingle-float -mdouble-float @gol
 -msimple-fpu -mstring  -mmulhw  -mdlmzb  -mmfpgpr -mvsx @gol
--mcrypto -mdirect-move -mpower8-fusion -mpower8-vector -mquad-memory}
+-mcrypto -mdirect-move -mpower8-fusion -mpower8-vector -mquad-memory @gol
+-mcompat-align-parm -mno-compat-align-parm}
 
 The particular options set for any particular CPU varies between
 compiler versions, depending on what setting seems to produce optimal
@@ -18128,6 +18129,23 @@ stack location in the function prologue
 a pointer on AIX and 64-bit Linux systems.  If the TOC value is not
 saved in the prologue, it is saved just before the call through the
 pointer.  The @option{-mno-save-toc-indirect} option is the default.
+
+@item -mcompat-align-parm
+@itemx -mno-compat-align-parm
+@opindex mcompat-align-parm
+Generate (do not generate) code to pass structure parameters with a
+maximum alignment of 64 bits, for compatibility with older versions
+of GCC.
+
+Older versions of GCC (prior to 4.9.0) incorrectly did not align a
+structure parameter on a 128-bit boundary when that structure contained
+a member requiring 128-bit alignment.  This is corrected in more
+recent versions of GCC.  This option may be used to generate code
+that is compatible with functions compiled with older versions of
+GCC.
+
+In this version of the compiler, the @option{-mcompat-align-parm}
+is the default, except when using the Linux ELFv2 ABI.
 @end table
 
 @node RX Options
Index: gcc-4_8-test/gcc/testsuite/gcc.target/powerpc/pr57949-1.c
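
The boundary decision described in the invoke.texi text above can be sketched in a few lines of C. This is only a simplified model, not the code in rs6000.c: the helper name `aggregate_arg_boundary`, the example type `struct vec_holder`, and the fixed 64-bit fallback boundary are assumptions made for illustration.

```c
#include <assert.h>
#include <stdalign.h>
#include <stdbool.h>
#include <stddef.h>

/* Example aggregate: the payload member asks for 16-byte (128-bit)
   alignment, so the whole struct becomes 16-byte aligned.  */
struct vec_holder
{
  int tag;
  alignas (16) unsigned char payload[16];
};

/* Hypothetical, simplified model of the boundary chosen for an aggregate
   parameter.  Returns the boundary in bits.  With compat_align_parm set,
   the older behaviour (at most 64 bits) is kept.  */
static unsigned
aggregate_arg_boundary (size_t type_align_bytes, bool compat_align_parm)
{
  assert (type_align_bytes > 0);
  if (!compat_align_parm && type_align_bytes * 8 > 64)
    return 128;
  return 64;
}
```

Under this model, `struct vec_holder` would be passed on a 128-bit boundary only when the compatibility setting is off, which is the ABI difference the option exists to control.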

[4.8, PATCH 9/26] Backport Power8 and LE support: ABI call support

2014-03-19 Thread Bill Schmidt

Hi,

This patch (diff-abi-calls) backports fixes to common code to support
the new ELFv2 ABI.  Copying Richard and Jakub for these bits.

Thanks,
Bill


2014-03-29  Bill Schmidt  wschm...@linux.vnet.ibm.com

Backport from mainline r204798:

2013-11-14  Ulrich Weigand  ulrich.weig...@de.ibm.com
Alan Modra  amo...@gmail.com

* function.c (assign_parms): Use all.reg_parm_stack_space instead
of re-evaluating REG_PARM_STACK_SPACE target macro.
(locate_and_pad_parm): New parameter REG_PARM_STACK_SPACE.  Use it
instead of evaluating target macro REG_PARM_STACK_SPACE every time.
(assign_parm_find_entry_rtl): Update call.
* calls.c (initialize_argument_information): Update call.
(emit_library_call_value_1): Likewise.
* expr.h (locate_and_pad_parm): Update prototype.

Backport from mainline r204797:

2013-11-14  Ulrich Weigand  ulrich.weig...@de.ibm.com

* calls.c (store_unaligned_arguments_into_pseudos): Skip PARALLEL
arguments.

Backport from mainline r197003:

2013-03-23  Eric Botcazou  ebotca...@adacore.com

* calls.c (expand_call): Add missing guard to code handling return
of non-BLKmode structures in MSB.
* function.c (expand_function_end): Likewise.


Index: gcc-4_8-branch/gcc/calls.c
===
--- gcc-4_8-branch.orig/gcc/calls.c 2013-12-28 17:41:32.056627059 +0100
+++ gcc-4_8-branch/gcc/calls.c  2013-12-28 17:50:43.356356135 +0100
@@ -983,6 +983,7 @@ store_unaligned_arguments_into_pseudos (
 
   for (i = 0; i < num_actuals; i++)
     if (args[i].reg != 0 && ! args[i].pass_on_stack
+        && GET_CODE (args[i].reg) != PARALLEL
         && args[i].mode == BLKmode
         && MEM_P (args[i].value)
         && (MEM_ALIGN (args[i].value)
@@ -1327,6 +1328,7 @@ initialize_argument_information (int num
 #else
 args[i].reg != 0,
 #endif
+reg_parm_stack_space,
 args[i].pass_on_stack ? 0 : args[i].partial,
 fndecl, args_size, args[i].locate);
 #ifdef BLOCK_REG_PADDING
@@ -3171,7 +3173,9 @@ expand_call (tree exp, rtx target, int i
 group load/store machinery below.  */
   if (!structure_value_addr
       && !pcc_struct_value
+      && TYPE_MODE (rettype) != VOIDmode
       && TYPE_MODE (rettype) != BLKmode
+      && REG_P (valreg)
       && targetm.calls.return_in_msb (rettype))
{
  if (shift_return_value (TYPE_MODE (rettype), false, valreg))
@@ -3734,7 +3738,8 @@ emit_library_call_value_1 (int retval, r
 #else
   argvec[count].reg != 0,
 #endif
-  0, NULL_TREE, args_size, argvec[count].locate);
+  reg_parm_stack_space, 0,
+  NULL_TREE, args_size, argvec[count].locate);
 
   if (argvec[count].reg == 0 || argvec[count].partial != 0
          || reg_parm_stack_space > 0)
@@ -3821,7 +3826,7 @@ emit_library_call_value_1 (int retval, r
 #else
   argvec[count].reg != 0,
 #endif
-  argvec[count].partial,
+  reg_parm_stack_space, argvec[count].partial,
   NULL_TREE, args_size, argvec[count].locate);
  args_size.constant += argvec[count].locate.size.constant;
  gcc_assert (!argvec[count].locate.size.var);
Index: gcc-4_8-branch/gcc/function.c
===
--- gcc-4_8-branch.orig/gcc/function.c  2013-12-28 17:41:32.056627059 +0100
+++ gcc-4_8-branch/gcc/function.c   2013-12-28 17:50:43.362356165 +0100
@@ -2507,6 +2507,7 @@ assign_parm_find_entry_rtl (struct assig
 }
 
   locate_and_pad_parm (data->promoted_mode, data->passed_type, in_regs,
+                       all->reg_parm_stack_space,
                        entry_parm ? data->partial : 0, current_function_decl,
                        &all->stack_args_size, &data->locate);
 
@@ -3485,11 +3486,7 @@ assign_parms (tree fndecl)
   /* Adjust function incoming argument size for alignment and
  minimum length.  */
 
-#ifdef REG_PARM_STACK_SPACE
-  crtl->args.size = MAX (crtl->args.size,
-                         REG_PARM_STACK_SPACE (fndecl));
-#endif
-
+  crtl->args.size = MAX (crtl->args.size, all.reg_parm_stack_space);
   crtl->args.size = CEIL_ROUND (crtl->args.size,
                                 PARM_BOUNDARY / BITS_PER_UNIT);
 
@@ -3693,6 +3690,9 @@ gimplify_parameters (void)
IN_REGS is nonzero if the argument will be passed in registers.  It will
never be set if REG_PARM_STACK_SPACE is not defined.
 
+   REG_PARM_STACK_SPACE is the number of bytes of stack space reserved
+   for arguments which are passed in registers.
+
FNDECL is the function in which the argument was defined.
 
There are two types of
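
The size adjustment in the assign_parms hunk above (take the larger of the argument size and the reserved register-parameter area, then round up to the parameter boundary) can be sketched as follows. This is a minimal illustration, not GCC code: `round_up` and `incoming_args_size` are hypothetical helpers, and the rounding assumes a power-of-two boundary.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: round VALUE up to a multiple of ALIGN, which is
   assumed to be a nonzero power of two.  */
static size_t
round_up (size_t value, size_t align)
{
  assert (align != 0 && (align & (align - 1)) == 0);
  return (value + align - 1) & ~(align - 1);
}

/* Hypothetical model of the adjustment: the incoming argument area is at
   least as large as the space reserved for register parameters, and is
   then rounded up to the parameter boundary (given here in bytes).  */
static size_t
incoming_args_size (size_t args_size, size_t reg_parm_stack_space,
                    size_t parm_boundary_bytes)
{
  size_t size = args_size > reg_parm_stack_space
                ? args_size : reg_parm_stack_space;
  return round_up (size, parm_boundary_bytes);
}
```

Passing the reserved size in as a plain value mirrors the point of the patch: the caller computes it once and hands it down, rather than each site re-evaluating a target macro.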

Re: [RFA jit v2 1/2] introduce class toplev

2014-03-19 Thread David Malcolm

On Wed, 2014-03-19 at 12:10 -0600, Tom Tromey wrote:
> >>>>> "Tom" == Tom Tromey tro...@redhat.com writes:
> 
> Tom> This patch introduces a new class toplev and changes toplev_main and
> Tom> toplev_finalize to be methods of this class.  Additionally, now the
> Tom> timevars are automatically stopped when the object is destroyed.  This
> Tom> cleans up compile a bit and makes it simpler to reuse the toplev
> Tom> logic in other code.
> 
> David asked me off-list to rename the field in class toplev, so here's a
> new patch that does this.

Thanks!  (yes, I greatly prefer having member data of a class to have a
m_ prefix, and for the ctor params to have equivalent names, without
the prefix, which this patch does, for toplev).

> Tom
> 
> commit 66f92863ef55c26f673d02dd39027f340940a3bf
> Author: Tom Tromey tro...@redhat.com
> Date:   Tue Mar 18 08:07:40 2014 -0600
> 
>     introduce class toplev
>     
>     This patch introduces a new class toplev and changes toplev_main and
>     toplev_finalize to be methods of this class.  Additionally, now the
>     timevars are automatically stopped when the object is destroyed.  This
>     cleans up compile a bit and makes it simpler to reuse the toplev
>     logic in other code.

OK.  Are you able to push this to my branch, or do you need me to do
this?

[4.8, PATCH 11/26] Backport Power8 and LE support: gotest

2014-03-19 Thread Bill Schmidt

Hi,

This patch (diff-abi-gotest) backports enablement of the Go testsuite
for powerpc64le.

Thanks,
Bill


2014-03-29  Bill Schmidt  wschm...@linux.vnet.ibm.com

Backport from mainline r205000.
2013-11-19  Ulrich Weigand  ulrich.weig...@de.ibm.com

gotest: Recognize PPC ELF v2 function pointers in text section.


Index: gcc-4_8-branch/libgo/testsuite/gotest
===
--- gcc-4_8-branch.orig/libgo/testsuite/gotest  2013-12-28 17:41:31.783625708 +0100
+++ gcc-4_8-branch/libgo/testsuite/gotest   2013-12-28 17:50:45.671367653 +0100
@@ -369,7 +369,7 @@ localname() {
 {
                text="T"
                case "$GOARCH" in
-               ppc64) text="D" ;;
+               ppc64) text="[TD]" ;;
                esac
 
                symtogo='sed -e s/_test/XXXtest/ -e s/.*_\([^_]*\.\)/\1/ -e s/XXXtest/_test/'

[4.8, PATCH 12/26] Backport Power8 and LE support: Defaults

2014-03-19 Thread Bill Schmidt

Hi,

This patch (diff-le-align) sets some miscellaneous defaults for little
endian support.

Thanks,
Bill


2014-03-29  Bill Schmidt  wschm...@linux.vnet.ibm.com

Apply mainline r205060.
2013-11-20  Alan Modra  amo...@gmail.com
* config/rs6000/sysv4.h (CC1_ENDIAN_LITTLE_SPEC): Define as empty.
* config/rs6000/rs6000.c (rs6000_option_override_internal): Default
to strict alignment on older processors when little-endian.
* config/rs6000/linux64.h (PROCESSOR_DEFAULT64): Default to power8
for ELFv2.


Index: gcc-4_8-branch/gcc/config/rs6000/linux64.h
===
--- gcc-4_8-branch.orig/gcc/config/rs6000/linux64.h 2013-12-28 17:50:44.252360594 +0100
+++ gcc-4_8-branch/gcc/config/rs6000/linux64.h  2013-12-28 17:50:46.356371060 +0100
@@ -71,7 +71,11 @@ extern int dot_symbols;
 #undef  PROCESSOR_DEFAULT
 #define PROCESSOR_DEFAULT PROCESSOR_POWER7
 #undef  PROCESSOR_DEFAULT64
+#ifdef LINUX64_DEFAULT_ABI_ELFv2
+#define PROCESSOR_DEFAULT64 PROCESSOR_POWER8
+#else
 #define PROCESSOR_DEFAULT64 PROCESSOR_POWER7
+#endif
 
 /* We don't need to generate entries in .fixup, except when
-mrelocatable or -mrelocatable-lib is given.  */
Index: gcc-4_8-branch/gcc/config/rs6000/rs6000.c
===
--- gcc-4_8-branch.orig/gcc/config/rs6000/rs6000.c  2013-12-28 17:50:44.219360429 +0100
+++ gcc-4_8-branch/gcc/config/rs6000/rs6000.c   2013-12-28 17:50:46.369371125 +0100
@@ -3206,6 +3206,12 @@ rs6000_option_override_internal (bool gl
}
 }
 
+  /* If little-endian, default to -mstrict-align on older processors.
+ Testing for htm matches power8 and later.  */
+  if (!BYTES_BIG_ENDIAN
+      && !(processor_target_table[tune_index].target_enable & OPTION_MASK_HTM))
+    rs6000_isa_flags |= ~rs6000_isa_flags_explicit & OPTION_MASK_STRICT_ALIGN;
+
   /* Add some warnings for VSX.  */
   if (TARGET_VSX)
 {
Index: gcc-4_8-branch/gcc/config/rs6000/sysv4.h
===
--- gcc-4_8-branch.orig/gcc/config/rs6000/sysv4.h   2013-12-28 17:50:44.243360549 +0100
+++ gcc-4_8-branch/gcc/config/rs6000/sysv4.h    2013-12-28 17:50:46.374371150 +0100
@@ -538,12 +538,7 @@ ENDIAN_SELECT( -mbig,  -mlittle, DEF
 
 #define        CC1_ENDIAN_BIG_SPEC ""
 
-#define        CC1_ENDIAN_LITTLE_SPEC "\
-%{!mstrict-align: %{!mno-strict-align: \
-    %{!mcall-i960-old: \
-       -mstrict-align \
-    } \
-}}"
+#define        CC1_ENDIAN_LITTLE_SPEC ""
 
 #define        CC1_ENDIAN_DEFAULT_SPEC "%(cc1_endian_big)"
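
The rs6000.c hunk above uses a common mask idiom: turn a flag on by default, but only if the user did not set or clear it explicitly. A minimal sketch of that idiom, assuming a made-up flag bit `FLAG_STRICT_ALIGN` and a hypothetical helper name (neither is a GCC identifier):

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical flag bit, standing in for a target option mask.  */
#define FLAG_STRICT_ALIGN (UINT64_C (1) << 0)

/* Set the bits in MASK in FLAGS unless they appear in EXPLICIT_FLAGS,
   i.e. unless the user already chose a value for them on the command
   line.  An explicit choice, whether on or off, is left untouched.  */
static uint64_t
default_flag_unless_explicit (uint64_t flags, uint64_t explicit_flags,
                              uint64_t mask)
{
  assert (mask != 0);
  return flags | (~explicit_flags & mask);
}
```

With this shape, an explicit -mno-strict-align style request (bit clear in the flags, bit set in the explicit mask) survives, while an untouched option picks up the new default.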

[4.8, PATCH 14/26] Backport Power8 and LE support: DFP absolute value

2014-03-19 Thread Bill Schmidt

Hi,

This patch (diff-dfp-abs) backports some unrelated but necessary work to
enable the DFP absolute value builtins.  Copying Jakub who was involved
with the original patch.

Thanks,
Bill


2014-03-29  Bill Schmidt  wschm...@linux.vnet.ibm.com

Backport from mainline
2013-08-19  Peter Bergner  berg...@vnet.ibm.com
Jakub Jelinek  ja...@redhat.com

* builtins.def (BUILT_IN_FABSD32): New DFP ABS builtin.
(BUILT_IN_FABSD64): Likewise.
(BUILT_IN_FABSD128): Likewise.
* builtins.c (expand_builtin): Add support for
new DFP ABS builtins.
(fold_builtin_1): Likewise.
* config/rs6000/dfp.md
(*abstd2_fpr): Handle non-overlapping destination
and source operands.
(*nabstd2_fpr): Likewise.

2014-03-29  Bill Schmidt  wschm...@linux.vnet.ibm.com

Backport from mainline
2013-08-19  Peter Bergner  berg...@vnet.ibm.com

* gcc.target/powerpc/dfp-dd-2.c: New test.
* gcc.target/powerpc/dfp-td-2.c: Likewise.
* gcc.target/powerpc/dfp-td-3.c: Likewise.


Index: gcc-4_8-test/gcc/builtins.c
===
--- gcc-4_8-test.orig/gcc/builtins.c
+++ gcc-4_8-test/gcc/builtins.c
@@ -5861,6 +5861,9 @@ expand_builtin (tree exp, rtx target, rt
   switch (fcode)
 {
 CASE_FLT_FN (BUILT_IN_FABS):
+case BUILT_IN_FABSD32:
+case BUILT_IN_FABSD64:
+case BUILT_IN_FABSD128:
   target = expand_builtin_fabs (exp, target, subtarget);
   if (target)
return target;
@@ -10313,6 +10316,9 @@ fold_builtin_1 (location_t loc, tree fnd
   return fold_builtin_strlen (loc, type, arg0);
 
 CASE_FLT_FN (BUILT_IN_FABS):
+case BUILT_IN_FABSD32:
+case BUILT_IN_FABSD64:
+case BUILT_IN_FABSD128:
   return fold_builtin_fabs (loc, arg0, type);
 
 case BUILT_IN_ABS:
Index: gcc-4_8-test/gcc/builtins.def
===
--- gcc-4_8-test.orig/gcc/builtins.def
+++ gcc-4_8-test/gcc/builtins.def
@@ -252,6 +252,9 @@ DEF_C99_BUILTIN(BUILT_IN_EXPM1L,
 DEF_LIB_BUILTIN        (BUILT_IN_FABS, "fabs", BT_FN_DOUBLE_DOUBLE, ATTR_CONST_NOTHROW_LEAF_LIST)
 DEF_C99_C90RES_BUILTIN (BUILT_IN_FABSF, "fabsf", BT_FN_FLOAT_FLOAT, ATTR_CONST_NOTHROW_LEAF_LIST)
 DEF_C99_C90RES_BUILTIN (BUILT_IN_FABSL, "fabsl", BT_FN_LONGDOUBLE_LONGDOUBLE, ATTR_CONST_NOTHROW_LEAF_LIST)
+DEF_GCC_BUILTIN        (BUILT_IN_FABSD32, "fabsd32", BT_FN_DFLOAT32_DFLOAT32, ATTR_CONST_NOTHROW_LEAF_LIST)
+DEF_GCC_BUILTIN        (BUILT_IN_FABSD64, "fabsd64", BT_FN_DFLOAT64_DFLOAT64, ATTR_CONST_NOTHROW_LEAF_LIST)
+DEF_GCC_BUILTIN        (BUILT_IN_FABSD128, "fabsd128", BT_FN_DFLOAT128_DFLOAT128, ATTR_CONST_NOTHROW_LEAF_LIST)
 DEF_C99_BUILTIN        (BUILT_IN_FDIM, "fdim", BT_FN_DOUBLE_DOUBLE_DOUBLE, ATTR_MATHFN_FPROUNDING_ERRNO)
 DEF_C99_BUILTIN        (BUILT_IN_FDIMF, "fdimf", BT_FN_FLOAT_FLOAT_FLOAT, ATTR_MATHFN_FPROUNDING_ERRNO)
 DEF_C99_BUILTIN        (BUILT_IN_FDIML, "fdiml", BT_FN_LONGDOUBLE_LONGDOUBLE_LONGDOUBLE, ATTR_MATHFN_FPROUNDING_ERRNO)
Index: gcc-4_8-test/gcc/config/rs6000/dfp.md
===
--- gcc-4_8-test.orig/gcc/config/rs6000/dfp.md
+++ gcc-4_8-test/gcc/config/rs6000/dfp.md
@@ -148,18 +148,24 @@
   )
 
 (define_insn "*abstd2_fpr"
-  [(set (match_operand:TD 0 "gpc_reg_operand" "=d")
-       (abs:TD (match_operand:TD 1 "gpc_reg_operand" "d")))]
+  [(set (match_operand:TD 0 "gpc_reg_operand" "=d,d")
+       (abs:TD (match_operand:TD 1 "gpc_reg_operand" "0,d")))]
   "TARGET_HARD_FLOAT && TARGET_FPRS"
-  "fabs %0,%1"
-  [(set_attr "type" "fp")])
+  "@
+   fabs %0,%1
+   fabs %0,%1\;fmr %L0,%L1"
+  [(set_attr "type" "fp")
+   (set_attr "length" "4,8")])
 
 (define_insn "*nabstd2_fpr"
-  [(set (match_operand:TD 0 "gpc_reg_operand" "=d")
-       (neg:TD (abs:TD (match_operand:TD 1 "gpc_reg_operand" "d"))))]
+  [(set (match_operand:TD 0 "gpc_reg_operand" "=d,d")
+       (neg:TD (abs:TD (match_operand:TD 1 "gpc_reg_operand" "0,d"))))]
   "TARGET_HARD_FLOAT && TARGET_FPRS"
-  "fnabs %0,%1"
-  [(set_attr "type" "fp")])
+  "@
+   fnabs %0,%1
+   fnabs %0,%1\;fmr %L0,%L1"
+  [(set_attr "type" "fp")
+   (set_attr "length" "4,8")])
 
 ;; Hardware support for decimal floating point operations.
 
Index: gcc-4_8-test/gcc/testsuite/gcc.target/powerpc/dfp-dd-2.c
===
--- /dev/null
+++ gcc-4_8-test/gcc/testsuite/gcc.target/powerpc/dfp-dd-2.c
@@ -0,0 +1,26 @@
+/* Test generation of DFP instructions for POWER6.  */
+/* { dg-do compile { target { powerpc*-*-linux* && powerpc_fprs } } } */
+/* { dg-options "-std=gnu99 -O1 -mcpu=power6" } */
+
+/* { dg-final { scan-assembler-times "fneg" 1 } } */
+/* { dg-final { scan-assembler-times "fabs" 1 } } */
+/* { dg-final { scan-assembler-times "fnabs" 1 } } */
+/* { dg-final { scan-assembler-times "fmr" 0 } } */
+
+_Decimal64
+func1 (_Decimal64 a, _Decimal64 b)
+{
+  return -b;
+}
+
+_Decimal64
+func2 (_Decimal64 a, _Decimal64 b)
+{
+  return
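
The two alternatives added to the *abstd2_fpr pattern in dfp.md can be modelled in plain C. This is a sketch under assumptions: the 128-bit value is represented as a hypothetical `struct fpr_pair` of two 64-bit words, with the sign in the top bit of the high word, and `td_abs` is an illustrative name rather than anything in GCC.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for a TDmode value held in an FPR pair.  */
struct fpr_pair
{
  uint64_t hi;  /* most significant word, carries the sign bit */
  uint64_t lo;  /* least significant word */
};

/* Model of the two alternatives: when destination and source are the
   same pair, clearing the sign in the high word is enough (one
   instruction); otherwise the low word must also be copied (two
   instructions).  */
static void
td_abs (struct fpr_pair *dst, const struct fpr_pair *src)
{
  assert (dst != NULL && src != NULL);
  dst->hi = src->hi & ~(UINT64_C (1) << 63);  /* fabs %0,%1 */
  if (dst != src)
    dst->lo = src->lo;                        /* fmr %L0,%L1 */
}
```

The missing low-word copy in the non-overlapping case is the situation the ChangeLog entry "Handle non-overlapping destination and source operands" appears to address.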

[4.8, PATCH 17/26] Backport Power8 and LE support: Direct moves

2014-03-19 Thread Bill Schmidt

Hi,

This patch (diff-direct-move) backports support for the Power8 direct
move instructions for little endian.

Thanks,
Bill


2014-03-19  Bill Schmidt  wschm...@linux.vnet.ibm.com

Backport from mainline
2013-10-23  Pat Haugen  pthau...@us.ibm.com

* gcc.target/powerpc/direct-move.h: Fix header for executable tests.

Backport from mainline
2014-01-16  Michael Meissner  meiss...@linux.vnet.ibm.com

PR target/59844
* config/rs6000/rs6000.md (reload_vsx_from_gprsf): Add little
endian support, remove tests for WORDS_BIG_ENDIAN.
(p8_mfvsrd_3_mode): Likewise.
(reload_gpr_from_vsxmode): Likewise.
(reload_gpr_from_vsxsf): Likewise.
(p8_mfvsrd_4_disf): Likewise.


Index: gcc-4_8-test/gcc/config/rs6000/rs6000.md
===
--- gcc-4_8-test.orig/gcc/config/rs6000/rs6000.md
+++ gcc-4_8-test/gcc/config/rs6000/rs6000.md
@@ -9438,7 +9438,7 @@
        (unspec:SF [(match_operand:SF 1 "register_operand" "r")]
                   UNSPEC_P8V_RELOAD_FROM_GPR))
    (clobber (match_operand:DI 2 "register_operand" "=r"))]
-  "TARGET_POWERPC64 && TARGET_DIRECT_MOVE && WORDS_BIG_ENDIAN"
+  "TARGET_POWERPC64 && TARGET_DIRECT_MOVE"
   "#"
   "&& reload_completed"
   [(const_int 0)]
@@ -9465,7 +9465,7 @@
   [(set (match_operand:DF 0 "register_operand" "=r")
        (unspec:DF [(match_operand:FMOVE128_GPR 1 "register_operand" "wa")]
                   UNSPEC_P8V_RELOAD_FROM_VSX))]
-  "TARGET_POWERPC64 && TARGET_DIRECT_MOVE && WORDS_BIG_ENDIAN"
+  "TARGET_POWERPC64 && TARGET_DIRECT_MOVE"
   "mfvsrd %0,%x1"
   [(set_attr "type" "mftgpr")])
 
@@ -9475,7 +9475,7 @@
                 [(match_operand:FMOVE128_GPR 1 "register_operand" "wa")]
                 UNSPEC_P8V_RELOAD_FROM_VSX))
    (clobber (match_operand:FMOVE128_GPR 2 "register_operand" "=wa"))]
-  "TARGET_POWERPC64 && TARGET_DIRECT_MOVE && WORDS_BIG_ENDIAN"
+  "TARGET_POWERPC64 && TARGET_DIRECT_MOVE"
   "#"
   "&& reload_completed"
   [(const_int 0)]
@@ -9502,7 +9502,7 @@
        (unspec:SF [(match_operand:SF 1 "register_operand" "wa")]
                   UNSPEC_P8V_RELOAD_FROM_VSX))
    (clobber (match_operand:V4SF 2 "register_operand" "=wa"))]
-  "TARGET_POWERPC64 && TARGET_DIRECT_MOVE && WORDS_BIG_ENDIAN"
+  "TARGET_POWERPC64 && TARGET_DIRECT_MOVE"
   "#"
   "&& reload_completed"
   [(const_int 0)]
@@ -9524,7 +9524,7 @@
   [(set (match_operand:DI 0 "register_operand" "=r")
        (unspec:DI [(match_operand:V4SF 1 "register_operand" "wa")]
                   UNSPEC_P8V_RELOAD_FROM_VSX))]
-  "TARGET_POWERPC64 && TARGET_DIRECT_MOVE && WORDS_BIG_ENDIAN"
+  "TARGET_POWERPC64 && TARGET_DIRECT_MOVE"
   "mfvsrd %0,%x1"
   [(set_attr "type" "mftgpr")])
 
Index: gcc-4_8-test/gcc/testsuite/gcc.target/powerpc/direct-move.h
===
--- gcc-4_8-test.orig/gcc/testsuite/gcc.target/powerpc/direct-move.h
+++ gcc-4_8-test/gcc/testsuite/gcc.target/powerpc/direct-move.h
@@ -1,5 +1,7 @@
 /* Test functions for direct move support.  */
 
+#include <math.h>
+extern void abort (void);
 
 #ifndef VSX_REG_ATTR
 #define VSX_REG_ATTR wa
@@ -111,7 +113,7 @@ const struct test_struct test_functions[
 void __attribute__((__noinline__))
 test_value (TYPE a)
 {
-  size_t i;
+  long i;
 
   for (i = 0; i < sizeof (test_functions) / sizeof (test_functions[0]); i++)
 {
@@ -127,8 +129,7 @@ test_value (TYPE a)
 int
 main (void)
 {
-  size_t i;
-  long j;
+  long i,j;
   union {
 TYPE value;
 unsigned char bytes[sizeof (TYPE)];

[4.8, PATCH 6/26] Backport Power8 and LE support: TDmode for LE

2014-03-19 Thread Bill Schmidt

Hi,

This patch (diff-le-dfp) backports fixes for TDmode on a little endian
target.

Thanks,
Bill


2014-03-19  Bill Schmidt  wschm...@linux.vnet.ibm.com

Backport from mainline r205123:

2013-11-20  Ulrich Weigand  ulrich.weig...@de.ibm.com

* config/rs6000/rs6000.c (rs6000_cannot_change_mode_class): Do not
allow subregs of TDmode in FPRs of smaller size in little-endian.
(rs6000_split_multireg_move): When splitting an access to TDmode
in FPRs, do not use simplify_gen_subreg.

Backport from mainline r204927:

2013-11-17  Ulrich Weigand  ulrich.weig...@de.ibm.com

* config/rs6000/rs6000.c (rs6000_emit_move): Use low word of
sdmode_stack_slot also in little-endian mode.


Index: gcc-4_8-test/gcc/config/rs6000/rs6000.c
===
--- gcc-4_8-test.orig/gcc/config/rs6000/rs6000.c
+++ gcc-4_8-test/gcc/config/rs6000/rs6000.c
@@ -7963,7 +7963,9 @@ rs6000_emit_move (rtx dest, rtx source,
}
   else if (INT_REGNO_P (REGNO (operands[1])))
{
- rtx mem = adjust_address_nv (operands[0], mode, 4);
+ rtx mem = operands[0];
+ if (BYTES_BIG_ENDIAN)
+   mem = adjust_address_nv (mem, mode, 4);
  mem = eliminate_regs (mem, VOIDmode, NULL_RTX);
  emit_insn (gen_movsd_hardfloat (mem, operands[1]));
}
@@ -7986,7 +7988,9 @@ rs6000_emit_move (rtx dest, rtx source,
}
   else if (INT_REGNO_P (REGNO (operands[0])))
{
- rtx mem = adjust_address_nv (operands[1], mode, 4);
+ rtx mem = operands[1];
+ if (BYTES_BIG_ENDIAN)
+   mem = adjust_address_nv (mem, mode, 4);
  mem = eliminate_regs (mem, VOIDmode, NULL_RTX);
  emit_insn (gen_movsd_hardfloat (operands[0], mem));
}
@@ -16082,6 +16086,13 @@ rs6000_cannot_change_mode_class (enum ma
          if (TARGET_IEEEQUAD && (to == TFmode || from == TFmode))
return true;
 
+ /* TDmode in floating-mode registers must always go into a register
+pair with the most significant word in the even-numbered register
+to match ISA requirements.  In little-endian mode, this does not
+match subreg numbering, so we cannot allow subregs.  */
+ if (!BYTES_BIG_ENDIAN && (to == TDmode || from == TDmode))
+   return true;
+
          if (from_size < 8 || to_size < 8)
return true;
 
@@ -19028,6 +19039,39 @@ rs6000_split_multireg_move (rtx dst, rtx
 
   gcc_assert (reg_mode_size * nregs == GET_MODE_SIZE (mode));
 
+  /* TDmode residing in FP registers is special, since the ISA requires that
+ the lower-numbered word of a register pair is always the most significant
+ word, even in little-endian mode.  This does not match the usual subreg
+ semantics, so we cannnot use simplify_gen_subreg in those cases.  Access
+ the appropriate constituent registers by hand in little-endian mode.
+
+ Note we do not need to check for destructive overlap here since TDmode
+ can only reside in even/odd register pairs.  */
+  if (FP_REGNO_P (reg) && DECIMAL_FLOAT_MODE_P (mode) && !BYTES_BIG_ENDIAN)
+{
+  rtx p_src, p_dst;
+  int i;
+
+  for (i = 0; i < nregs; i++)
+   {
+ if (REG_P (src) && FP_REGNO_P (REGNO (src)))
+   p_src = gen_rtx_REG (reg_mode, REGNO (src) + nregs - 1 - i);
+ else
+   p_src = simplify_gen_subreg (reg_mode, src, mode,
+i * reg_mode_size);
+
+ if (REG_P (dst) && FP_REGNO_P (REGNO (dst)))
+   p_dst = gen_rtx_REG (reg_mode, REGNO (dst) + nregs - 1 - i);
+ else
+   p_dst = simplify_gen_subreg (reg_mode, dst, mode,
+i * reg_mode_size);
+
+ emit_insn (gen_rtx_SET (VOIDmode, p_dst, p_src));
+   }
+
+  return;
+}
+
   if (REG_P (src) && REG_P (dst) && (REGNO (src) < REGNO (dst)))
 {
   /* Move register range backwards, if we might have destructive
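
The register selection in the new rs6000_split_multireg_move block comes down to an index reversal: in little-endian mode, subword i of a TDmode value lives in register `REGNO + nregs - 1 - i` rather than `REGNO + i`. A small sketch of that mapping, where the function name and the example register number 32 are purely illustrative:

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical helper: which register of a pair starting at FIRST_REGNO
   holds subword I (counted in memory order).  The lower-numbered
   register always holds the most significant word, so the order is
   reversed when the target is little-endian.  */
static unsigned
fpr_for_subword (unsigned first_regno, unsigned nregs, unsigned i,
                 bool little_endian)
{
  assert (nregs > 0 && i < nregs);
  return little_endian ? first_regno + nregs - 1 - i : first_regno + i;
}
```

For a two-register pair starting at register 32, subword 0 maps to register 33 in little-endian mode and to register 32 in big-endian mode.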

[4.8, PATCH 4/26] Backport Power8 and LE support: Libtool and configure bits 2

2014-03-19 Thread Bill Schmidt

Hi,

This patch (diff-le-libtool) backports changes to use a libtool.m4 that
supports powerpc64le-*linux*.

Thanks,
Bill


2014-03-19  Bill Schmidt  wschm...@linux.vnet.ibm.com

Backport from mainline
2013-11-22  Ulrich Weigand  ulrich.weig...@de.ibm.com

* libgo/config/libtool.m4: Update to mainline version.
* libgo/configure: Regenerate.

2013-11-17  Ulrich Weigand  ulrich.weig...@de.ibm.com

* libgo/config/libtool.m4: Update to mainline version.
* libgo/configure: Regenerate.

2013-11-15  Ulrich Weigand  ulrich.weig...@de.ibm.com

* libtool.m4: Update to mainline version.
* libjava/libltdl/acinclude.m4: Likewise.

* gcc/configure: Regenerate.
* boehm-gc/configure: Regenerate.
* libatomic/configure: Regenerate.
* libbacktrace/configure: Regenerate.
* libffi/configure: Regenerate.
* libgfortran/configure: Regenerate.
* libgomp/configure: Regenerate.
* libitm/configure: Regenerate.
* libjava/configure: Regenerate.
* libjava/libltdl/configure: Regenerate.
* libjava/classpath/configure: Regenerate.
* libmudflap/configure: Regenerate.
* libobjc/configure: Regenerate.
* libquadmath/configure: Regenerate.
* libsanitizer/configure: Regenerate.
* libssp/configure: Regenerate.
* libstdc++-v3/configure: Regenerate.
* lto-plugin/configure: Regenerate.
* zlib/configure: Regenerate.

Backport from mainline
2013-09-20  Alan Modra  amo...@gmail.com

* libtool.m4 (_LT_ENABLE_LOCK ld -m flags): Remove non-canonical
ppc host match.  Support little-endian powerpc linux hosts.
* configure: Regenerate.


Index: gcc-4_8-branch/gcc/configure
===
--- gcc-4_8-branch.orig/gcc/configure   2013-12-28 17:41:32.733630408 +0100
+++ gcc-4_8-branch/gcc/configure2013-12-28 17:50:38.646332701 +0100
@@ -13589,7 +13589,7 @@ ia64-*-hpux*)
   rm -rf conftest*
   ;;
 
-x86_64-*kfreebsd*-gnu|x86_64-*linux*|ppc*-*linux*|powerpc*-*linux*| \
+x86_64-*kfreebsd*-gnu|x86_64-*linux*|powerpc*-*linux*| \
 s390*-*linux*|s390*-*tpf*|sparc*-*linux*)
   # Find out which ABI we are using.
   echo 'int i;' > conftest.$ac_ext
@@ -13614,7 +13614,10 @@ s390*-*linux*|s390*-*tpf*|sparc*-*linux*
;;
esac
;;
- ppc64-*linux*|powerpc64-*linux*)
+ powerpc64le-*linux*)
+   LD="${LD-ld} -m elf32lppclinux"
+   ;;
+ powerpc64-*linux*)
    LD="${LD-ld} -m elf32ppclinux"
;;
  s390x-*linux*)
@@ -13633,7 +13636,10 @@ s390*-*linux*|s390*-*tpf*|sparc*-*linux*
  x86_64-*linux*)
    LD="${LD-ld} -m elf_x86_64"
;;
- ppc*-*linux*|powerpc*-*linux*)
+ powerpcle-*linux*)
+   LD="${LD-ld} -m elf64lppc"
+   ;;
+ powerpc-*linux*)
    LD="${LD-ld} -m elf64ppc"
;;
  s390*-*linux*|s390*-*tpf*)
@@ -17827,7 +17833,7 @@ else
   lt_dlunknown=0; lt_dlno_uscore=1; lt_dlneed_uscore=2
   lt_status=$lt_dlunknown
   cat > conftest.$ac_ext <<_LT_EOF
-#line 17830 "configure"
+#line 17836 "configure"
 #include "confdefs.h"
 
 #if HAVE_DLFCN_H
@@ -17933,7 +17939,7 @@ else
   lt_dlunknown=0; lt_dlno_uscore=1; lt_dlneed_uscore=2
   lt_status=$lt_dlunknown
   cat > conftest.$ac_ext <<_LT_EOF
-#line 17936 "configure"
+#line 17942 "configure"
 #include "confdefs.h"
 
 #if HAVE_DLFCN_H
Index: gcc-4_8-branch/libtool.m4
===
--- gcc-4_8-branch.orig/libtool.m4  2013-12-28 17:41:32.728630383 +0100
+++ gcc-4_8-branch/libtool.m4   2013-12-28 17:50:38.652332731 +0100
@@ -1220,7 +1220,7 @@ ia64-*-hpux*)
   rm -rf conftest*
   ;;
 
-x86_64-*kfreebsd*-gnu|x86_64-*linux*|ppc*-*linux*|powerpc*-*linux*| \
+x86_64-*kfreebsd*-gnu|x86_64-*linux*|powerpc*-*linux*| \
 s390*-*linux*|s390*-*tpf*|sparc*-*linux*)
   # Find out which ABI we are using.
   echo 'int i;' > conftest.$ac_ext
@@ -1241,7 +1241,10 @@ s390*-*linux*|s390*-*tpf*|sparc*-*linux*
;;
esac
;;
- ppc64-*linux*|powerpc64-*linux*)
+ powerpc64le-*linux*)
+   LD="${LD-ld} -m elf32lppclinux"
+   ;;
+ powerpc64-*linux*)
    LD="${LD-ld} -m elf32ppclinux"
;;
  s390x-*linux*)
@@ -1260,7 +1263,10 @@ s390*-*linux*|s390*-*tpf*|sparc*-*linux*
  x86_64-*linux*)
    LD="${LD-ld} -m elf_x86_64"
;;
- ppc*-*linux*|powerpc*-*linux*)
+ powerpcle-*linux*)
+   LD="${LD-ld} -m elf64lppc"
+   ;;
+ powerpc-*linux*)
    LD="${LD-ld} -m elf64ppc"
;;
  s390*-*linux*|s390*-*tpf*)
Index: gcc-4_8-branch/boehm-gc/configure
===
--- gcc-4_8-branch.orig/boehm-gc/configure

[4.8, PATCH 16/26] Backport Power8 and LE support: PR56843

2014-03-19 Thread Bill Schmidt

Hi,

This patch (diff-pr56843) backports the fix for PR56843.

Thanks,
Bill


[gcc]

2014-03-19  Bill Schmidt  wschm...@linux.vnet.ibm.com

Backport from mainline
2013-04-05  Bill Schmidt  wschm...@linux.vnet.ibm.com

PR target/56843
* config/rs6000/rs6000.c (rs6000_emit_swdiv_high_precision): Remove.
(rs6000_emit_swdiv_low_precision): Remove.
(rs6000_emit_swdiv): Rewrite to handle between one and four
iterations of Newton-Raphson generally; modify required number of
iterations for some cases.
* config/rs6000/rs6000.h (RS6000_RECIP_HIGH_PRECISION_P): Remove.

[gcc/testsuite]

2014-03-19  Bill Schmidt  wschm...@linux.vnet.ibm.com

Backport from mainline
2013-04-05  Bill Schmidt  wschm...@linux.vnet.ibm.com

PR target/56843
* gcc.target/powerpc/recip-1.c: Modify expected output.
* gcc.target/powerpc/recip-3.c: Likewise.
* gcc.target/powerpc/recip-4.c: Likewise.
* gcc.target/powerpc/recip-5.c: Add expected output for iterations.


Index: gcc-4_8-test/gcc/config/rs6000/rs6000.c
===
--- gcc-4_8-test.orig/gcc/config/rs6000/rs6000.c
+++ gcc-4_8-test/gcc/config/rs6000/rs6000.c
@@ -29417,54 +29417,26 @@ rs6000_emit_nmsub (rtx dst, rtx m1, rtx
   emit_insn (gen_rtx_SET (VOIDmode, dst, r));
 }
 
-/* Newton-Raphson approximation of floating point divide with just 2 passes
-   (either single precision floating point, or newer machines with higher
-   accuracy estimates).  Support both scalar and vector divide.  Assumes no
-   trapping math and finite arguments.  */
+/* Newton-Raphson approximation of floating point divide DST = N/D.  If NOTE_P,
+   add a reg_note saying that this was a division.  Support both scalar and
+   vector divide.  Assumes no trapping math and finite arguments.  */
 
-static void
-rs6000_emit_swdiv_high_precision (rtx dst, rtx n, rtx d)
+void
+rs6000_emit_swdiv (rtx dst, rtx n, rtx d, bool note_p)
 {
   enum machine_mode mode = GET_MODE (dst);
-  rtx x0, e0, e1, y1, u0, v0;
-  enum insn_code code = optab_handler (smul_optab, mode);
-  insn_gen_fn gen_mul = GEN_FCN (code);
-  rtx one = rs6000_load_constant_and_splat (mode, dconst1);
-
-  gcc_assert (code != CODE_FOR_nothing);
-
-  /* x0 = 1./d estimate */
-  x0 = gen_reg_rtx (mode);
-  emit_insn (gen_rtx_SET (VOIDmode, x0,
- gen_rtx_UNSPEC (mode, gen_rtvec (1, d),
- UNSPEC_FRES)));
-
-  e0 = gen_reg_rtx (mode);
-  rs6000_emit_nmsub (e0, d, x0, one);  /* e0 = 1. - (d * x0) */
-
-  e1 = gen_reg_rtx (mode);
-  rs6000_emit_madd (e1, e0, e0, e0);   /* e1 = (e0 * e0) + e0 */
-
-  y1 = gen_reg_rtx (mode);
-  rs6000_emit_madd (y1, e1, x0, x0);   /* y1 = (e1 * x0) + x0 */
-
-  u0 = gen_reg_rtx (mode);
-  emit_insn (gen_mul (u0, n, y1)); /* u0 = n * y1 */
-
-  v0 = gen_reg_rtx (mode);
-  rs6000_emit_nmsub (v0, d, u0, n);/* v0 = n - (d * u0) */
-
-  rs6000_emit_madd (dst, v0, y1, u0);  /* dst = (v0 * y1) + u0 */
-}
+  rtx one, x0, e0, x1, xprev, eprev, xnext, enext, u, v;
+  int i;
 
-/* Newton-Raphson approximation of floating point divide that has a low
-   precision estimate.  Assumes no trapping math and finite arguments.  */
+  /* Low precision estimates guarantee 5 bits of accuracy.  High
+ precision estimates guarantee 14 bits of accuracy.  SFmode
+ requires 23 bits of accuracy.  DFmode requires 52 bits of
+ accuracy.  Each pass at least doubles the accuracy, leading
+ to the following.  */
+  int passes = (TARGET_RECIP_PRECISION) ? 1 : 3;
+  if (mode == DFmode || mode == V2DFmode)
+passes++;
 
-static void
-rs6000_emit_swdiv_low_precision (rtx dst, rtx n, rtx d)
-{
-  enum machine_mode mode = GET_MODE (dst);
-  rtx x0, e0, e1, e2, y1, y2, y3, u0, v0, one;
   enum insn_code code = optab_handler (smul_optab, mode);
   insn_gen_fn gen_mul = GEN_FCN (code);
 
@@ -29478,46 +29450,44 @@ rs6000_emit_swdiv_low_precision (rtx dst
  gen_rtx_UNSPEC (mode, gen_rtvec (1, d),
  UNSPEC_FRES)));
 
-  e0 = gen_reg_rtx (mode);
-  rs6000_emit_nmsub (e0, d, x0, one);  /* e0 = 1. - d * x0 */
-
-  y1 = gen_reg_rtx (mode);
-  rs6000_emit_madd (y1, e0, x0, x0);   /* y1 = x0 + e0 * x0 */
-
-  e1 = gen_reg_rtx (mode);
-  emit_insn (gen_mul (e1, e0, e0));/* e1 = e0 * e0 */
-
-  y2 = gen_reg_rtx (mode);
-  rs6000_emit_madd (y2, e1, y1, y1);   /* y2 = y1 + e1 * y1 */
-
-  e2 = gen_reg_rtx (mode);
-  emit_insn (gen_mul (e2, e1, e1));/* e2 = e1 * e1 */
-
-  y3 = gen_reg_rtx (mode);
-  rs6000_emit_madd (y3, e2, y2, y2);   /* y3 = y2 + e2 * y2 */
-
-  u0 = gen_reg_rtx (mode);
-  emit_insn (gen_mul (u0, n, y3)); /* u0 = n * y3 */
-
-  v0 = gen_reg_rtx (mode);
-  rs6000_emit_nmsub (v0, d, u0, n);/* v0 = n - d * u0
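
The pass counts in the rewritten rs6000_emit_swdiv follow from the accuracy figures in its comment: the estimate starts with 5 or 14 good bits, SFmode needs 23 and DFmode 52, and each Newton-Raphson pass at least doubles the accuracy. A small sketch of that arithmetic, with `passes_needed` as a hypothetical helper that is not part of the patch:

```c
#include <assert.h>

/* Hypothetical helper: number of accuracy-doubling passes needed to get
   from EST_BITS of accuracy in the initial estimate to at least
   NEED_BITS, assuming each pass exactly doubles the accuracy (the
   worst case stated in the comment).  */
static int
passes_needed (int est_bits, int need_bits)
{
  int passes = 0;

  assert (est_bits > 0);
  while (est_bits < need_bits)
    {
      est_bits *= 2;
      passes++;
    }
  return passes;
}
```

Under that worst-case assumption the helper gives 3 passes (low precision) or 1 pass (high precision) for SFmode, and one more in each case for DFmode, matching `passes = TARGET_RECIP_PRECISION ? 1 : 3` followed by `passes++` for the double-precision modes.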

[4.8, PATCH 18/26] Backport Power8 and LE support: Configure bits 2

2014-03-19 Thread Bill Schmidt

Hi,

This patch (diff-le-config-2) backports more configure changes,
particularly for multilib/multiarch targeting powerpc64le.

Thanks,
Bill


2014-03-19  Bill Schmidt  wschm...@linux.vnet.ibm.com

Apply mainline r202190, powerpc64le multilibs and multiarch dir
2013-09-03  Alan Modra  amo...@gmail.com

* config.gcc (powerpc*-*-linux*): Add support for little-endian
multilibs to big-endian target and vice versa.
* config/rs6000/t-linux64: Use := assignment on all vars.
(MULTILIB_EXTRA_OPTS): Remove fPIC.
(MULTILIB_OSDIRNAMES): Specify using mapping from multilib_options.
* config/rs6000/t-linux64le: New file.
* config/rs6000/t-linux64bele: New file.
* config/rs6000/t-linux64lebe: New file.


Index: gcc-4_8-test/gcc/config.gcc
===
--- gcc-4_8-test.orig/gcc/config.gcc
+++ gcc-4_8-test/gcc/config.gcc
@@ -2081,7 +2081,7 @@ powerpc*-*-linux*)
tmake_file=rs6000/t-fprules rs6000/t-ppcos ${tmake_file} 
rs6000/t-ppccomm
case ${target} in
powerpc*le-*-*)
-   tm_file=${tm_file} rs6000/sysv4le.h ;;
+   tm_file=${tm_file} rs6000/sysv4le.h ;;
esac
maybe_biarch=yes
case ${target} in
@@ -2104,6 +2104,19 @@ powerpc*-*-linux*)
fi
                tm_file="rs6000/biarch64.h ${tm_file} rs6000/linux64.h glibc-stdint.h"
                tmake_file="$tmake_file rs6000/t-linux64"
+               case ${target} in
+                   powerpc*le-*-*)
+                       tmake_file="$tmake_file rs6000/t-linux64le"
+                       case ${enable_targets} in
+                           all | *powerpc64-* | *powerpc-*)
+                               tmake_file="$tmake_file rs6000/t-linux64lebe" ;;
+                       esac ;;
+                   *)
+                       case ${enable_targets} in
+                           all | *powerpc64le-* | *powerpcle-*)
+                               tmake_file="$tmake_file rs6000/t-linux64bele" ;;
+                       esac ;;
+               esac
                extra_options="${extra_options} rs6000/linux64.opt"
;;
*)
Index: gcc-4_8-test/gcc/config/rs6000/t-linux64
===
--- gcc-4_8-test.orig/gcc/config/rs6000/t-linux64
+++ gcc-4_8-test/gcc/config/rs6000/t-linux64
@@ -25,8 +25,8 @@
 # it doesn't tell anything about the 32bit libraries on those systems.  Set
 # MULTILIB_OSDIRNAMES according to what is found on the target.
 
-MULTILIB_OPTIONS    = m64/m32
-MULTILIB_DIRNAMES   = 64 32
-MULTILIB_EXTRA_OPTS = fPIC
-MULTILIB_OSDIRNAMES = ../lib64$(call if_multiarch,:powerpc64-linux-gnu)
-MULTILIB_OSDIRNAMES += $(if $(wildcard $(shell echo $(SYSTEM_HEADER_DIR))/../../usr/lib32),../lib32,../lib)$(call if_multiarch,:powerpc-linux-gnu)
+MULTILIB_OPTIONS    := m64/m32
+MULTILIB_DIRNAMES   := 64 32
+MULTILIB_EXTRA_OPTS := 
+MULTILIB_OSDIRNAMES := m64=../lib64$(call if_multiarch,:powerpc64-linux-gnu)
+MULTILIB_OSDIRNAMES += m32=$(if $(wildcard $(shell echo $(SYSTEM_HEADER_DIR))/../../usr/lib32),../lib32,../lib)$(call if_multiarch,:powerpc-linux-gnu)
Index: gcc-4_8-test/gcc/config/rs6000/t-linux64bele
===
--- /dev/null
+++ gcc-4_8-test/gcc/config/rs6000/t-linux64bele
@@ -0,0 +1,7 @@
+#rs6000/t-linux64end
+
+MULTILIB_OPTIONS    += mlittle
+MULTILIB_DIRNAMES   += le
+MULTILIB_OSDIRNAMES += $(subst =,.mlittle=,$(subst lible32,lib32le,$(subst lible64,lib64le,$(subst lib,lible,$(subst -linux,le-linux,$(MULTILIB_OSDIRNAMES))))))
+MULTILIB_OSDIRNAMES += $(subst $(if $(findstring 64,$(target)),m64,m32).,,$(filter $(if $(findstring 64,$(target)),m64,m32).mlittle%,$(MULTILIB_OSDIRNAMES)))
+MULTILIB_MATCHES    := ${MULTILIB_MATCHES_ENDIAN}
Index: gcc-4_8-test/gcc/config/rs6000/t-linux64le
===
--- /dev/null
+++ gcc-4_8-test/gcc/config/rs6000/t-linux64le
@@ -0,0 +1,3 @@
+#rs6000/t-linux64le
+
+MULTILIB_OSDIRNAMES := $(subst -linux,le-linux,$(MULTILIB_OSDIRNAMES))
Index: gcc-4_8-test/gcc/config/rs6000/t-linux64lebe
===
--- /dev/null
+++ gcc-4_8-test/gcc/config/rs6000/t-linux64lebe
@@ -0,0 +1,7 @@
+#rs6000/t-linux64leend
+
+MULTILIB_OPTIONS    += mbig
+MULTILIB_DIRNAMES   += be
+MULTILIB_OSDIRNAMES += $(subst =,.mbig=,$(subst libbe32,lib32be,$(subst libbe64,lib64be,$(subst lib,libbe,$(subst le-linux,-linux,$(MULTILIB_OSDIRNAMES))))))
+MULTILIB_OSDIRNAMES += $(subst $(if $(findstring 64,$(target)),m64,m32).,,$(filter $(if $(findstring 64,$(target)),m64,m32).mbig%,$(MULTILIB_OSDIRNAMES)))
+MULTILIB_MATCHES    := ${MULTILIB_MATCHES_ENDIAN}
Index: gcc-4_8-test/libsanitizer/configure.tgt
===
---

[4.8, PATCH 15/26] Backport Power8 and LE support: PR54537

2014-03-19 Thread Bill Schmidt

Hi,

This patch (diff-pr54537) backports a fix for PR54537 which is unrelated
but necessary.  Copying Richard and Jakub for the common code.

Thanks,
Bill


[libstdc++-v3]

2014-03-29  Bill Schmidt  wschm...@linux.vnet.ibm.com

Backport from mainline
2013-08-01  Fabien Chêne  fab...@gcc.gnu.org

PR c++/54537
* include/tr1/cmath: Remove pow(double,double) overload, remove a
duplicated comment about DR 550. Add a comment to explain the issue.
* testsuite/tr1/8_c_compatibility/cmath/pow_cmath.cc: New.

[gcc/cp]

2014-03-29  Bill Schmidt  wschm...@linux.vnet.ibm.com

Back port from mainline
2013-08-01  Fabien Chêne  fab...@gcc.gnu.org

PR c++/54537
* cp-tree.h: Check OVL_USED with OVERLOAD_CHECK.
* name-lookup.c (do_nonmember_using_decl): Make sure we have an
OVERLOAD before calling OVL_USED. Call diagnose_name_conflict
instead of issuing an error without mentioning the conflicting
declaration.

[gcc/testsuite]

2014-03-29  Bill Schmidt  wschm...@linux.vnet.ibm.com

Back port from mainline
2013-08-01  Fabien Chêne  fab...@gcc.gnu.org
Peter Bergner  berg...@vnet.ibm.com

PR c++/54537
* g++.dg/overload/using3.C: New.
* g++.dg/overload/using2.C: Adjust.
* g++.dg/lookup/using9.C: Likewise.
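
As an illustration of why routing OVL_USED through OVERLOAD_CHECK matters, here is a minimal, self-contained sketch of a checked accessor. The `node` struct, the `node_code` enum and the `overload_check` helper are hypothetical stand-ins invented for this example; they are not GCC's tree machinery, and the real checking macros report an internal compiler error rather than throwing.

```cpp
#include <stdexcept>

// Hypothetical stand-ins for tree codes and tree nodes (not GCC's types).
enum class node_code { function_decl, overload };

struct node {
  node_code code;
  bool used;  // plays the role of the TREE_USED bit
};

// Analogue of OVERLOAD_CHECK: hand back the node only if it really is an
// overload, otherwise fail loudly instead of reading an unrelated bit.
inline node &overload_check(node &n) {
  if (n.code != node_code::overload)
    throw std::logic_error("expected an overload node");
  return n;
}

// Old-style accessor: silently reads the bit from any kind of node.
inline bool ovl_used_unchecked(node &n) { return n.used; }

// New-style accessor: the read goes through the check.
inline bool ovl_used(node &n) { return overload_check(n).used; }

// Caller pattern matching the name-lookup.c change: test the node kind
// first, then consult the flag.
inline bool is_using_decl(node &n) {
  return n.code == node_code::overload && ovl_used(n);
}
```

With the checked accessor, a caller that forgets the node-kind test fails immediately instead of silently treating a plain function declaration as a using declaration.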


Index: gcc-4_8-test/gcc/cp/cp-tree.h
===
--- gcc-4_8-test.orig/gcc/cp/cp-tree.h
+++ gcc-4_8-test/gcc/cp/cp-tree.h
@@ -331,7 +331,7 @@ typedef struct ptrmem_cst * ptrmem_cst_t
 /* If set, this was imported in a using declaration.
This is not to confuse with being used somewhere, which
is not important for this node.  */
-#define OVL_USED(NODE) TREE_USED (NODE)
+#define OVL_USED(NODE) TREE_USED (OVERLOAD_CHECK (NODE))
 /* If set, this OVERLOAD was created for argument-dependent lookup
and can be freed afterward.  */
 #define OVL_ARG_DEPENDENT(NODE) TREE_LANG_FLAG_0 (OVERLOAD_CHECK (NODE))
Index: gcc-4_8-test/gcc/cp/name-lookup.c
===
--- gcc-4_8-test.orig/gcc/cp/name-lookup.c
+++ gcc-4_8-test/gcc/cp/name-lookup.c
@@ -2286,8 +2286,7 @@ push_overloaded_decl_1 (tree decl, int f
   && compparms (TYPE_ARG_TYPES (TREE_TYPE (fn)),
                 TYPE_ARG_TYPES (TREE_TYPE (decl)))
   && ! decls_match (fn, decl))
-   error ("%q#D conflicts with previous using declaration %q#D",
-          decl, fn);
+   diagnose_name_conflict (decl, fn);
 
  dup = duplicate_decls (decl, fn, is_friend);
  /* If DECL was a redeclaration of FN -- even an invalid
@@ -2519,7 +2518,7 @@ do_nonmember_using_decl (tree scope, tre
  if (new_fn == old_fn)
/* The function already exists in the current namespace.  */
break;
- else if (OVL_USED (tmp1))
+ else if (TREE_CODE (tmp1) == OVERLOAD && OVL_USED (tmp1))
 continue; /* this is a using decl */
   else if (compparms (TYPE_ARG_TYPES (TREE_TYPE (new_fn)),
                       TYPE_ARG_TYPES (TREE_TYPE (old_fn))))
@@ -2534,7 +2533,7 @@ do_nonmember_using_decl (tree scope, tre
break;
  else
{
- error ("%qD is already declared in this scope", name);
+ diagnose_name_conflict (new_fn, old_fn);
  break;
}
}
Index: gcc-4_8-test/gcc/testsuite/g++.dg/lookup/using9.C
===
--- gcc-4_8-test.orig/gcc/testsuite/g++.dg/lookup/using9.C
+++ gcc-4_8-test/gcc/testsuite/g++.dg/lookup/using9.C
@@ -21,11 +21,11 @@ void h()
   f('h');
    f(1); // { dg-error "ambiguous" }
    // { dg-message "candidate" "candidate note" { target *-*-* } 22 }
-  void f(int);  // { dg-error "previous using declaration" }
+  void f(int);  // { dg-error "previous declaration" }
 }
 
 void m()
 {
   void f(int);
-  using B::f;   // { dg-error "already declared" }
+  using B::f;   // { dg-error "previous declaration" }
 }
Index: gcc-4_8-test/gcc/testsuite/g++.dg/overload/using2.C
===
--- gcc-4_8-test.orig/gcc/testsuite/g++.dg/overload/using2.C
+++ gcc-4_8-test/gcc/testsuite/g++.dg/overload/using2.C
@@ -45,7 +45,7 @@ using std::C1;
    extern "C" void exit (int) throw ();
    extern "C" void *malloc (__SIZE_TYPE__) throw () __attribute__((malloc));
 
-  void abort (void) throw ();
+  void abort (void) throw (); // { dg-message "previous" }
    void _exit (int) throw (); // { dg-error "conflicts" "conflicts" }
    // { dg-message "void _exit" "_exit" { target *-*-* } 49 }
 
@@ -54,14 +54,14 @@

[4.8, PATCH 20/26] Backport Power8 and LE support: LRA

2014-03-19 Thread Bill Schmidt

Hi,

This patch (diff-lra) backports the changes to enable -mlra for the
PowerPC back end.

Thanks,
Bill


2014-03-19  Bill Schmidt  wschm...@linux.vnet.ibm.com

Backport from mainline
2014-02-04  Michael Meissner  meiss...@linux.vnet.ibm.com

* config/rs6000/rs6000.opt (-mlra): Add switch to enable the LRA
register allocator.

* config/rs6000/rs6000.c (TARGET_LRA_P): Add support for -mlra to
enable the LRA register allocator.  Back port the changes from the
trunk to enable LRA.
(rs6000_legitimate_offset_address_p): Likewise.
(legitimate_lo_sum_address_p): Likewise.
(use_toc_relative_ref): Likewise.
(rs6000_legitimate_address_p): Likewise.
(rs6000_emit_move): Likewise.
(rs6000_secondary_memory_needed_mode): Likewise.
(rs6000_alloc_sdmode_stack_slot): Likewise.
(rs6000_lra_p): Likewise.

* config/rs6000/sync.md (load_lockedti): Copy TI/PTI variables by
64-bit parts to force the register allocator to allocate even/odd
register pairs for the quad word atomic instructions.
(store_conditionalti): Likewise.


Index: gcc-4_8-test/gcc/config/rs6000/rs6000.c
===
--- gcc-4_8-test.orig/gcc/config/rs6000/rs6000.c
+++ gcc-4_8-test/gcc/config/rs6000/rs6000.c
@@ -1,5 +1,5 @@
 /* Subroutines used for code generation on IBM RS/6000.
-   Copyright (C) 1991-2013 Free Software Foundation, Inc.
+   Copyright (C) 1991-2014 Free Software Foundation, Inc.
Contributed by Richard Kenner (ken...@vlsi1.ultra.nyu.edu)
 
This file is part of GCC.
@@ -56,6 +56,7 @@
 #include "intl.h"
 #include "params.h"
 #include "tm-constrs.h"
+#include "ira.h"
 #include "opts.h"
 #include "tree-vectorizer.h"
 #include "dumpfile.h"
@@ -1563,6 +1564,9 @@ static const struct attribute_spec rs600
 #undef TARGET_MODE_DEPENDENT_ADDRESS_P
 #define TARGET_MODE_DEPENDENT_ADDRESS_P rs6000_mode_dependent_address_p
 
+#undef TARGET_LRA_P
+#define TARGET_LRA_P rs6000_lra_p
+
 #undef TARGET_CAN_ELIMINATE
 #define TARGET_CAN_ELIMINATE rs6000_can_eliminate
 
@@ -6242,7 +6246,7 @@ rs6000_legitimate_offset_address_p (enum
 return false;
   if (!reg_offset_addressing_ok_p (mode))
 return virtual_stack_registers_memory_p (x);
-  if (legitimate_constant_pool_address_p (x, mode, strict))
+  if (legitimate_constant_pool_address_p (x, mode, strict || lra_in_progress))
 return true;
   if (GET_CODE (XEXP (x, 1)) != CONST_INT)
 return false;
@@ -6383,9 +6387,21 @@ legitimate_lo_sum_address_p (enum machin
 
   if (TARGET_ELF || TARGET_MACHO)
 {
+  bool large_toc_ok;
+
    if (DEFAULT_ABI == ABI_V4 && flag_pic)
return false;
-  if (TARGET_TOC)
+  /* LRA don't use LEGITIMIZE_RELOAD_ADDRESS as it usually calls
+push_reload from reload pass code.  LEGITIMIZE_RELOAD_ADDRESS
+recognizes some LO_SUM addresses as valid although this
+function says opposite.  In most cases, LRA through different
+transformations can generate correct code for address reloads.
+It can not manage only some LO_SUM cases.  So we need to add
+code analogous to one in rs6000_legitimize_reload_address for
+LOW_SUM here saying that some addresses are still valid.  */
+  large_toc_ok = (lra_in_progress && TARGET_CMODEL != CMODEL_SMALL
+                  && small_toc_ref (x, VOIDmode));
+  if (TARGET_TOC && ! large_toc_ok)
return false;
   if (GET_MODE_NUNITS (mode) != 1)
return false;
@@ -6395,7 +6411,7 @@ legitimate_lo_sum_address_p (enum machin
(mode == DFmode || mode == DDmode)))
return false;
 
-  return CONSTANT_P (x);
+  return CONSTANT_P (x) || large_toc_ok;
 }
 
   return false;
@@ -7106,7 +7122,6 @@ use_toc_relative_ref (rtx sym)
           && ASM_OUTPUT_SPECIAL_POOL_ENTRY_P (get_pool_constant (sym),
                                               get_pool_mode (sym)))
   || (TARGET_CMODEL == CMODEL_MEDIUM
-      && !CONSTANT_POOL_ADDRESS_P (sym)
       && SYMBOL_REF_LOCAL_P (sym)));
 }
 
@@ -7394,7 +7409,8 @@ rs6000_legitimate_address_p (enum machin
    if (reg_offset_p && legitimate_small_data_p (mode, x))
      return 1;
    if (reg_offset_p
-       && legitimate_constant_pool_address_p (x, mode, reg_ok_strict))
+       && legitimate_constant_pool_address_p (x, mode,
+                                              reg_ok_strict || lra_in_progress))
 return 1;
   /* For TImode, if we have load/store quad and TImode in VSX registers, only
  allow register indirect addresses.  This will allow the values to go in
@@ -7680,6 +7696,7 @@ rs6000_conditional_register_usage (void)
  fixed_regs[i] = call_used_regs[i] = call_really_used_regs[i] = 1;
 }
 }
+
 
 /* Try to output insns to set TARGET equal to the constant C if it can
be done in less than N insns.  Do all computations in MODE.
@@ -8112,6 +8129,68 @@

[4.8, PATCH 26/26] Backport Power8 and LE support: Missing support

2014-03-19 Thread Bill Schmidt

Hi,

This patch (diff-trunk-missing) backports some LE pieces that were found
not to have been backported from trunk to the IBM 4.8 branch until
relatively recently.

Thanks,
Bill


2014-03-19  Bill Schmidt  wschm...@linux.vnet.ibm.com

Back port from trunk
2013-04-25  Alan Modra  amo...@gmail.com

PR target/57052
* config/rs6000/rs6000.md (rotlsi3_internal7): Rename to
rotlsi3_internal7le and condition on !BYTES_BIG_ENDIAN.
(rotlsi3_internal8be): New BYTES_BIG_ENDIAN insn.
Repeat for many other rotate/shift and mask patterns using subregs.
Name lshiftrt insns.
(ashrdisi3_noppc64): Rename to ashrdisi3_noppc64be and condition
on WORDS_BIG_ENDIAN.

2013-06-07  Alan Modra  amo...@gmail.com

* config/rs6000/rs6000.c (rs6000_option_override_internal): Don't
override user -mfp-in-toc.
(offsettable_ok_by_alignment): Consider just the current access
rather than the whole object, unless BLKmode.  Handle
CONSTANT_POOL_ADDRESS_P constants that lack a decl too.
(use_toc_relative_ref): Allow CONSTANT_POOL_ADDRESS_P constants
for -mcmodel=medium.
* config/rs6000/linux64.h (SUBSUBTARGET_OVERRIDE_OPTIONS): Don't
override user -mfp-in-toc or -msum-in-toc.  Default to
-mno-fp-in-toc for -mcmodel=medium.

2013-06-18  Alan Modra  amo...@gmail.com

* config/rs6000/rs6000.h (enum data_align): New.
(LOCAL_ALIGNMENT, DATA_ALIGNMENT): Use rs6000_data_alignment.
(DATA_ABI_ALIGNMENT): Define.
(CONSTANT_ALIGNMENT): Correct comment.
* config/rs6000/rs6000-protos.h (rs6000_data_alignment): Declare.
* config/rs6000/rs6000.c (rs6000_data_alignment): New function.

2013-07-11  Ulrich Weigand  ulrich.weig...@de.ibm.com

* config/rs6000/rs6000.md (*tls_gd_low<TLSmode:tls_abi_suffix>):
Require GOT register as additional operand in UNSPEC.
(*tls_ld_low<TLSmode:tls_abi_suffix>): Likewise.
(*tls_got_dtprel_low<TLSmode:tls_abi_suffix>): Likewise.
(*tls_got_tprel_low<TLSmode:tls_abi_suffix>): Likewise.
(*tls_gd<TLSmode:tls_abi_suffix>): Update splitter.
(*tls_ld<TLSmode:tls_abi_suffix>): Likewise.
(tls_got_dtprel_<TLSmode:tls_abi_suffix>): Likewise.
(tls_got_tprel_<TLSmode:tls_abi_suffix>): Likewise.

2014-01-23  Pat Haugen  pthau...@us.ibm.com

* config/rs6000/rs6000.c (rs6000_option_override_internal): Don't
force flag_ira_loop_pressure if set via command line.

2014-02-06  Alan Modra  amo...@gmail.com

PR target/60032
* config/rs6000/rs6000.c (rs6000_secondary_memory_needed_mode): Only
change SDmode to DDmode when lra_in_progress.
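
The linux64.h part of this patch boils down to "compute a default only when the user did not set the option explicitly". A minimal sketch of that rule follows, assuming a hypothetical `toc_options` struct whose `*_set` members stand in for the `global_options_set` bits; none of these names are GCC's own.

```cpp
// Hypothetical model of the code-model choice (not GCC's enum).
enum class cmodel { small_model, medium_model, large_model };

// Option values plus "was this set on the command line?" markers, which
// stand in for global_options_set.x_TARGET_NO_*_IN_TOC.
struct toc_options {
  bool no_fp_in_toc = false;
  bool no_sum_in_toc = false;
  bool no_fp_in_toc_set = false;
  bool no_sum_in_toc_set = false;
};

// For non-small code models, fill in defaults without overriding the
// user: -mno-fp-in-toc becomes the default only for the medium model.
inline void apply_cmodel_defaults(toc_options &o, cmodel m) {
  if (m == cmodel::small_model)
    return;  // the small model leaves both options untouched
  if (!o.no_fp_in_toc_set)
    o.no_fp_in_toc = (m == cmodel::medium_model);
  if (!o.no_sum_in_toc_set)
    o.no_sum_in_toc = false;
}
```

Before the change both values were forced to 0 for every non-small code model, which discarded an explicit user choice.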


Index: gcc-4_8-test/gcc/config/rs6000/linux64.h
===
--- gcc-4_8-test.orig/gcc/config/rs6000/linux64.h
+++ gcc-4_8-test/gcc/config/rs6000/linux64.h
@@ -149,8 +149,11 @@ extern int dot_symbols;
SET_CMODEL (CMODEL_MEDIUM); \
  if (rs6000_current_cmodel != CMODEL_SMALL)\
{   \
- TARGET_NO_FP_IN_TOC = 0;  \
- TARGET_NO_SUM_IN_TOC = 0; \
+ if (!global_options_set.x_TARGET_NO_FP_IN_TOC) \
+   TARGET_NO_FP_IN_TOC \
+ = rs6000_current_cmodel == CMODEL_MEDIUM; \
+ if (!global_options_set.x_TARGET_NO_SUM_IN_TOC) \
+   TARGET_NO_SUM_IN_TOC = 0;   \
}   \
}   \
}   \
Index: gcc-4_8-test/gcc/config/rs6000/rs6000-protos.h
===
--- gcc-4_8-test.orig/gcc/config/rs6000/rs6000-protos.h
+++ gcc-4_8-test/gcc/config/rs6000/rs6000-protos.h
@@ -152,6 +152,7 @@ extern void rs6000_split_logical (rtx []
 #endif /* RTX_CODE */
 
 #ifdef TREE_CODE
+extern unsigned int rs6000_data_alignment (tree, unsigned int, enum data_align);
 extern unsigned int rs6000_special_round_type_align (tree, unsigned int,
 unsigned int);
 extern unsigned int darwin_rs6000_special_round_type_align (tree, unsigned int,
Index: gcc-4_8-test/gcc/config/rs6000/rs6000.c
===
--- gcc-4_8-test.orig/gcc/config/rs6000/rs6000.c
+++ gcc-4_8-test/gcc/config/rs6000/rs6000.c
@@ -3031,7 +3031,8 @@ rs6000_option_override_internal (bool gl
  calculation works better for RTL loop invariant motion on targets
   with enough (>= 32) registers.  It is an expensive optimization.
  So it is on only for peak performance.

[4.8, PATCH 19/26] Backport Power8 and LE support: Quad memory atomic

2014-03-19 Thread Bill Schmidt

Hi,

This patch (diff-quad-memory) backports support for quad-memory atomic
operations.

Thanks,
Bill


[gcc/testsuite]

2014-03-19  Bill Schmidt  wschm...@linux.vnet.ibm.com

Back port from mainline
2014-01-23  Michael Meissner  meiss...@linux.vnet.ibm.com

PR target/59909
* gcc.target/powerpc/quad-atomic.c: New file to test power8 quad
word atomic functions at runtime.

[gcc]

2014-03-19  Bill Schmidt  wschm...@linux.vnet.ibm.com

Back port from mainline
2014-01-23  Michael Meissner  meiss...@linux.vnet.ibm.com

PR target/59909
* doc/invoke.texi (RS/6000 and PowerPC Options): Document
-mquad-memory-atomic.  Update -mquad-memory documentation to say
it is only used for non-atomic loads/stores.

* config/rs6000/predicates.md (quad_int_reg_operand): Allow either
-mquad-memory or -mquad-memory-atomic switches.

* config/rs6000/rs6000-cpus.def (ISA_2_7_MASKS_SERVER): Add
-mquad-memory-atomic to ISA 2.07 support.

* config/rs6000/rs6000.opt (-mquad-memory-atomic): Add new switch
to separate support of normal quad word memory operations (ldq,
stq) from the atomic quad word memory operations.

* config/rs6000/rs6000.c (rs6000_option_override_internal): Add
support to separate non-atomic quad word operations from atomic
quad word operations.  Disable non-atomic quad word operations in
little endian mode so that we don't have to swap words after the
load and before the store.
(quad_load_store_p): Add comment about atomic quad word support.
(rs6000_opt_masks): Add -mquad-memory-atomic to the list of
options printed with -mdebug=reg.

* config/rs6000/rs6000.h (TARGET_SYNC_TI): Use
-mquad-memory-atomic as the test for whether we have quad word
atomic instructions.
(TARGET_SYNC_HI_QI): If either -mquad-memory-atomic,
-mquad-memory, or -mp8-vector are used, allow byte/half-word
atomic operations.

* config/rs6000/sync.md (load_lockedti): Insure that the address
is a proper indexed or indirect address for the lqarx instruction.
On little endian systems, swap the hi/lo registers after the lqarx
instruction.
(load_lockedpti): Use indexed_or_indirect_operand predicate to
insure the address is valid for the lqarx instruction.
(store_conditionalti): Insure that the address is a proper indexed
or indirect address for the stqcrx. instruction.  On little endian
systems, swap the hi/lo registers before doing the stqcrx.
instruction.
(store_conditionalpti): Use indexed_or_indirect_operand predicate to
insure the address is valid for the stqcrx. instruction.

* gcc/config/rs6000/rs6000-c.c (rs6000_target_modify_macros):
Define __QUAD_MEMORY__ and __QUAD_MEMORY_ATOMIC__ based on what
type of quad memory support is available.
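
To make the split between the two switches concrete, here is a small sketch of the flag handling described in the ChangeLog above. The mask values and the helper names (`effective_flags`, `quad_memory_macros`) are assumptions made for this example; only the two macro names come from the patch.

```cpp
#include <string>
#include <vector>

// Hypothetical bit assignments; GCC's OPTION_MASK_* values differ.
constexpr unsigned mask_quad_memory = 1u << 0;
constexpr unsigned mask_quad_memory_atomic = 1u << 1;

// Per the ChangeLog, non-atomic quad word operations are disabled in
// little endian mode; the atomic ones stay available.
inline unsigned effective_flags(unsigned flags, bool little_endian) {
  if (little_endian)
    flags &= ~mask_quad_memory;
  return flags;
}

// Map the surviving flag bits to the predefined macros named in the patch.
inline std::vector<std::string> quad_memory_macros(unsigned flags) {
  std::vector<std::string> macros;
  if ((flags & mask_quad_memory) != 0)
    macros.push_back("__QUAD_MEMORY__");
  if ((flags & mask_quad_memory_atomic) != 0)
    macros.push_back("__QUAD_MEMORY_ATOMIC__");
  return macros;
}
```

Under these assumptions a little-endian target that enables both switches would see only `__QUAD_MEMORY_ATOMIC__`.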


Index: gcc-4_8-test/gcc/config/rs6000/predicates.md
===
--- gcc-4_8-test.orig/gcc/config/rs6000/predicates.md
+++ gcc-4_8-test/gcc/config/rs6000/predicates.md
@@ -270,7 +270,7 @@
 {
   HOST_WIDE_INT r;
 
-  if (!TARGET_QUAD_MEMORY)
+  if (!TARGET_QUAD_MEMORY && !TARGET_QUAD_MEMORY_ATOMIC)
 return 0;
 
   if (GET_CODE (op) == SUBREG)
@@ -633,6 +633,7 @@
        (match_test "offsettable_nonstrict_memref_p (op)")))
 
 ;; Return 1 if the operand is suitable for load/store quad memory.
+;; This predicate only checks for non-atomic loads/stores.
 (define_predicate "quad_memory_operand"
   (match_code "mem")
 {
Index: gcc-4_8-test/gcc/config/rs6000/rs6000-c.c
===
--- gcc-4_8-test.orig/gcc/config/rs6000/rs6000-c.c
+++ gcc-4_8-test/gcc/config/rs6000/rs6000-c.c
@@ -337,6 +337,10 @@ rs6000_target_modify_macros (bool define
     rs6000_define_or_undefine_macro (define_p, "__HTM__");
   if ((flags & OPTION_MASK_P8_VECTOR) != 0)
     rs6000_define_or_undefine_macro (define_p, "__POWER8_VECTOR__");
+  if ((flags & OPTION_MASK_QUAD_MEMORY) != 0)
+    rs6000_define_or_undefine_macro (define_p, "__QUAD_MEMORY__");
+  if ((flags & OPTION_MASK_QUAD_MEMORY_ATOMIC) != 0)
+    rs6000_define_or_undefine_macro (define_p, "__QUAD_MEMORY_ATOMIC__");
   if ((flags & OPTION_MASK_CRYPTO) != 0)
     rs6000_define_or_undefine_macro (define_p, "__CRYPTO__");
 
Index: gcc-4_8-test/gcc/config/rs6000/rs6000-cpus.def
===
--- gcc-4_8-test.orig/gcc/config/rs6000/rs6000-cpus.def
+++ gcc-4_8-test/gcc/config/rs6000/rs6000-cpus.def
@@ -53,7 +53,8 @@
 | OPTION_MASK_CRYPTO   \
 | OPTION_MASK_DIRECT_MOVE  \
 | OPTION_MASK_HTM  \
-|

Re: [RFA jit v2 2/2] introduce auto_timevar

2014-03-19 Thread David Malcolm

On Wed, 2014-03-19 at 11:52 -0600, Tom Tromey wrote:
> This introduces a new auto_timevar class.  It pushes a given timevar
> in its constructor, and pops it in the destructor, giving a much
> simpler way to use timevars in the typical case where they can be
> scoped.
> ---
>  gcc/ChangeLog.jit  |  4 
>  gcc/jit/ChangeLog.jit  |  4 
>  gcc/jit/internal-api.c | 16 +---
>  gcc/timevar.h  | 26 +-
>  4 files changed, 38 insertions(+), 12 deletions(-)

OK (and it fixes a bug in the earlier version of the patch in the dtor,
which pushed rather than popped).

Are you able to push this to my branch yourself, or do you need me to do
this?
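
For readers following along, the push-in-constructor, pop-in-destructor idea can be sketched as below. This is only an illustration under assumed names: `timer_stack` and `scoped_timer` are hypothetical and are not the timevar API or the auto_timevar class from the patch.

```cpp
#include <string>
#include <vector>

// Hypothetical stand-in for the timevar stack.
struct timer_stack {
  std::vector<std::string> active;
  void push(const std::string &name) { active.push_back(name); }
  void pop() { active.pop_back(); }
};

// RAII guard: pushes on construction and pops on destruction, so the
// pop cannot be forgotten on an early return or an exception.
class scoped_timer {
 public:
  scoped_timer(timer_stack &stack, const std::string &name) : stack_(stack) {
    stack_.push(name);
  }
  ~scoped_timer() { stack_.pop(); }  // must pop here, not push

  // Copying would lead to a double pop, so forbid it.
  scoped_timer(const scoped_timer &) = delete;
  scoped_timer &operator=(const scoped_timer &) = delete;

 private:
  timer_stack &stack_;
};
```

A destructor that pushes instead of popping, as in the earlier version of the patch, would leave the stack growing with every scope.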

[4.8, PATCH 1/26 too big]

2014-03-19 Thread Bill Schmidt

Hi,

The main patch for this series was too large for the mailer to accept.
Sorry about that.  This piece is all powerpc-related and seems to have
been delivered to David ok.  If anyone else wants a copy of the patch,
please contact me privately and I'll send it your way.

Thanks,
Bill

Re: [PATCH] [gomp4] Initial support of OpenACC loop directive in C front-end.

2014-03-19 Thread Thomas Schwinge

Hi Ilmir!

On Tue, 18 Mar 2014 16:37:24 +0400, Ilmir Usmanov i.usma...@samsung.com wrote:
> This patch introduces support of OpenACC loop directive (and combined 
> directives) in C front-end up to GENERIC. Currently no clause is allowed.

> --- /dev/null
> +++ b/gcc/testsuite/c-c++-common/goacc/loop-1.c
> @@ -0,0 +1,89 @@
> +/* { dg-do compile } */
> +
> +int test1()
> +{
> +  int i, j, k, b[10];
> +  int a[30];
> +  double d;
> +  float r;
> +  i = 0;

> +  #pragma acc loop
> +  for (i = 1; i < 10; i++)
> +{
> +}

Do you intend to support loop constructs that are not nested in a
parallel or kernels construct?  As I'm reading it, the specification is
not clear on this.  (I guess I'll raise this question with the OpenACC
guys.)


Grüße,
 Thomas



[4.8, PATCH 24/26] Backport Power8 and LE support: Reload issues

2014-03-19 Thread Bill Schmidt

Hi,

This patch (diff-reload) backports fixes for a couple of problems in
PowerPC reload handling.

Thanks,
Bill


2014-03-19  Bill Schmidt  wschm...@linux.vnet.ibm.com

Apply mainline r207798
2014-02-26  Alan Modra  amo...@gmail.com
PR target/58675
PR target/57935
* config/rs6000/rs6000.c (rs6000_secondary_reload_inner): Use
find_replacement on parts of insn rtl that might be reloaded.

Backport from mainline r208287
2014-03-03  Bill Schmidt  wschm...@linux.vnet.ibm.com

* config/rs6000/rs6000.c (rs6000_preferred_reload_class): Disallow
reload of PLUS rtx's outside of GENERAL_REGS or BASE_REGS; relax
constraint on constants to permit them being loaded into
GENERAL_REGS or BASE_REGS.
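
The second change can be read as a small decision procedure for constants and PLUS expressions. The sketch below models register classes as bit sets with made-up contents; the class encodings and the helper names are assumptions for illustration and do not reflect the real rs6000 register classes. It covers only the constant/PLUS branch.

```cpp
#include <cstdint>

// Hypothetical register classes as bit sets of hard registers.
using reg_class = std::uint32_t;
constexpr reg_class no_regs = 0x00;
constexpr reg_class base_regs = 0x0e;     // general registers minus r0
constexpr reg_class general_regs = 0x0f;
constexpr reg_class float_regs = 0xf0;

// True if every register in `inner` is also in `outer`.
inline bool class_subset_p(reg_class inner, reg_class outer) {
  return (inner & ~outer) == 0;
}

// Preferred class for reloading a constant or a PLUS into `rclass`:
// narrow to GENERAL_REGS if allowed, else BASE_REGS, else refuse.
inline reg_class prefer_for_constant_or_plus(reg_class rclass) {
  if (class_subset_p(general_regs, rclass))
    return general_regs;
  if (class_subset_p(base_regs, rclass))
    return base_regs;
  return no_regs;
}
```

The old code only refused constants for classes intersecting FLOAT_REGS; the new rule also keeps PLUS expressions out of any class that does not contain the general or base registers.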


Index: gcc-4_8-test/gcc/config/rs6000/rs6000.c
===
--- gcc-4_8-test.orig/gcc/config/rs6000/rs6000.c
+++ gcc-4_8-test/gcc/config/rs6000/rs6000.c
@@ -16380,7 +16380,7 @@ rs6000_secondary_reload_inner (rtx reg,
 rs6000_secondary_reload_fail (__LINE__, reg, mem, scratch, store_p);
 
   rclass = REGNO_REG_CLASS (regno);
-  addr = XEXP (mem, 0);
+  addr = find_replacement (XEXP (mem, 0));
 
   switch (rclass)
 {
@@ -16391,19 +16391,18 @@ rs6000_secondary_reload_inner (rtx reg,
   if (GET_CODE (addr) == AND)
{
  and_op2 = XEXP (addr, 1);
- addr = XEXP (addr, 0);
+ addr = find_replacement (XEXP (addr, 0));
}
 
   if (GET_CODE (addr) == PRE_MODIFY)
{
- scratch_or_premodify = XEXP (addr, 0);
+ scratch_or_premodify = find_replacement (XEXP (addr, 0));
  if (!REG_P (scratch_or_premodify))
rs6000_secondary_reload_fail (__LINE__, reg, mem, scratch, store_p);
 
- if (GET_CODE (XEXP (addr, 1)) != PLUS)
+ addr = find_replacement (XEXP (addr, 1));
+ if (GET_CODE (addr) != PLUS)
rs6000_secondary_reload_fail (__LINE__, reg, mem, scratch, store_p);
-
- addr = XEXP (addr, 1);
}
 
   if (GET_CODE (addr) == PLUS
@@ -16411,6 +16410,8 @@ rs6000_secondary_reload_inner (rtx reg,
  || !rs6000_legitimate_offset_address_p (PTImode, addr,
  false, true)))
{
+ /* find_replacement already recurses into both operands of
+PLUS so we don't need to call it here.  */
  addr_op1 = XEXP (addr, 0);
  addr_op2 = XEXP (addr, 1);
  if (!legitimate_indirect_address_p (addr_op1, false))
@@ -16486,7 +16487,7 @@ rs6000_secondary_reload_inner (rtx reg,
  || !VECTOR_MEM_ALTIVEC_P (mode)))
{
  and_op2 = XEXP (addr, 1);
- addr = XEXP (addr, 0);
+ addr = find_replacement (XEXP (addr, 0));
}
 
   /* If we aren't using a VSX load, save the PRE_MODIFY register and use it
@@ -16498,14 +16499,13 @@ rs6000_secondary_reload_inner (rtx reg,
  || and_op2 != NULL_RTX
  || !legitimate_indexed_address_p (XEXP (addr, 1), false)))
{
- scratch_or_premodify = XEXP (addr, 0);
+ scratch_or_premodify = find_replacement (XEXP (addr, 0));
  if (!legitimate_indirect_address_p (scratch_or_premodify, false))
rs6000_secondary_reload_fail (__LINE__, reg, mem, scratch, store_p);
 
- if (GET_CODE (XEXP (addr, 1)) != PLUS)
+ addr = find_replacement (XEXP (addr, 1));
+ if (GET_CODE (addr) != PLUS)
rs6000_secondary_reload_fail (__LINE__, reg, mem, scratch, store_p);
-
- addr = XEXP (addr, 1);
}
 
   if (legitimate_indirect_address_p (addr, false)  /* reg */
@@ -16765,8 +16765,14 @@ rs6000_preferred_reload_class (rtx x, en
easy_vector_constant (x, mode))
 return ALTIVEC_REGS;
 
-  if (CONSTANT_P (x) && reg_classes_intersect_p (rclass, FLOAT_REGS))
-return NO_REGS;
+  if ((CONSTANT_P (x) || GET_CODE (x) == PLUS))
+{
+  if (reg_class_subset_p (GENERAL_REGS, rclass))
+   return GENERAL_REGS;
+  if (reg_class_subset_p (BASE_REGS, rclass))
+   return BASE_REGS;
+  return NO_REGS;
+}
 
    if (GET_MODE_CLASS (mode) == MODE_INT && rclass == NON_SPECIAL_REGS)
 return GENERAL_REGS;

[4.8, PATCH 22/26] Backport Power8 and LE support: -mcall-* endianness

2014-03-19 Thread Bill Schmidt

Hi,

This patch (diff-mcall) fixes big-endian assumptions for -mcall-aixdesc
and various others.

Thanks,
Bill


2014-03-19  Bill Schmidt  wschm...@linux.vnet.ibm.com

Backport from mainline r207658
2014-02-06  Ulrich Weigand  ulrich.weig...@de.ibm.com

* config/rs6000/sysv4.h (ENDIAN_SELECT): Do not attempt to enforce
big-endian mode for -mcall-aixdesc, -mcall-freebsd, -mcall-netbsd,
-mcall-openbsd, or -mcall-linux.
(CC1_ENDIAN_BIG_SPEC): Remove.
(CC1_ENDIAN_LITTLE_SPEC): Remove.
(CC1_ENDIAN_DEFAULT_SPEC): Remove.
(CC1_SPEC): Remove (always empty) %cc1_endian_... spec.
(SUBTARGET_EXTRA_SPECS): Remove %cc1_endian_big, %cc1_endian_little,
and %cc1_endian_default.
* config/rs6000/sysv4le.h (CC1_ENDIAN_DEFAULT_SPEC): Remove.
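
The effect of the ENDIAN_SELECT change can be sketched as an ordered list of alternatives. The function below is a hypothetical model written for this note: it takes command-line switches as strings and mimics only the order in which the alternatives are tried after the patch, not the spec language itself.

```cpp
#include <algorithm>
#include <string>
#include <vector>

// True if `opt` appears among the command-line switches.
inline bool has_switch(const std::vector<std::string> &args,
                       const std::string &opt) {
  return std::find(args.begin(), args.end(), opt) != args.end();
}

// Model of ENDIAN_SELECT after the patch: explicit endian switches win,
// -mcall-i960-old still implies little endian, and the other -mcall-*
// switches no longer force big endian but fall through to the default.
inline std::string endian_select(const std::vector<std::string> &args,
                                 const std::string &big_opt,
                                 const std::string &little_opt,
                                 const std::string &default_opt) {
  if (has_switch(args, "-mlittle") || has_switch(args, "-mlittle-endian"))
    return little_opt;
  if (has_switch(args, "-mbig") || has_switch(args, "-mbig-endian"))
    return big_opt;
  if (has_switch(args, "-mcall-i960-old"))
    return little_opt;
  return default_opt;
}
```

In this model, `-mcall-linux` on a little-endian default configuration yields the default (little-endian) option, which is the behavior the patch is after.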


Index: gcc-4_8-test/gcc/config/rs6000/sysv4.h
===
--- gcc-4_8-test.orig/gcc/config/rs6000/sysv4.h
+++ gcc-4_8-test/gcc/config/rs6000/sysv4.h
@@ -522,8 +522,6 @@ extern int fixuplabelno;
 #define ENDIAN_SELECT(BIG_OPT, LITTLE_OPT, DEFAULT_OPT)    \
 "%{mlittle|mlittle-endian:"    LITTLE_OPT ";"  \
   "mbig|mbig-endian:"          BIG_OPT    ";"  \
-  "mcall-aixdesc|mcall-freebsd|mcall-netbsd|"  \
-  "mcall-openbsd|mcall-linux:" BIG_OPT    ";"  \
   "mcall-i960-old:"            LITTLE_OPT ";"  \
   ":"                          DEFAULT_OPT "}"
 
@@ -536,20 +534,12 @@ extern int fixuplabelno;
 "%{memb|msdata=eabi: -memb}" \
 ENDIAN_SELECT(" -mbig", " -mlittle", DEFAULT_ASM_ENDIAN)
 
-#define CC1_ENDIAN_BIG_SPEC ""
-
-#define CC1_ENDIAN_LITTLE_SPEC ""
-
-#define CC1_ENDIAN_DEFAULT_SPEC "%(cc1_endian_big)"
-
 #ifndef CC1_SECURE_PLT_DEFAULT_SPEC
 #define CC1_SECURE_PLT_DEFAULT_SPEC ""
 #endif
 
-/* Pass -G xxx to the compiler and set correct endian mode.  */
+/* Pass -G xxx to the compiler.  */
 #define CC1_SPEC "%{G*} %(cc1_cpu)" \
-  ENDIAN_SELECT(" %(cc1_endian_big)", " %(cc1_endian_little)", \
-                " %(cc1_endian_default)")   \
 "%{meabi: %{!mcall-*: -mcall-sysv }} \
 %{!meabi: %{!mno-eabi: \
     %{mrelocatable: -meabi } \
@@ -903,9 +893,6 @@ ncrtn.o%s
   { "link_os_netbsd",          LINK_OS_NETBSD_SPEC },          \
   { "link_os_openbsd",         LINK_OS_OPENBSD_SPEC },         \
   { "link_os_default",         LINK_OS_DEFAULT_SPEC },         \
-  { "cc1_endian_big",          CC1_ENDIAN_BIG_SPEC },          \
-  { "cc1_endian_little",       CC1_ENDIAN_LITTLE_SPEC },       \
-  { "cc1_endian_default",      CC1_ENDIAN_DEFAULT_SPEC },      \
   { "cc1_secure_plt_default",  CC1_SECURE_PLT_DEFAULT_SPEC },  \
   { "cpp_os_ads",              CPP_OS_ADS_SPEC },              \
   { "cpp_os_yellowknife",      CPP_OS_YELLOWKNIFE_SPEC },      \
Index: gcc-4_8-test/gcc/config/rs6000/sysv4le.h
===
--- gcc-4_8-test.orig/gcc/config/rs6000/sysv4le.h
+++ gcc-4_8-test/gcc/config/rs6000/sysv4le.h
@@ -22,9 +22,6 @@
 #undef  TARGET_DEFAULT
 #define TARGET_DEFAULT MASK_LITTLE_ENDIAN
 
-#undef CC1_ENDIAN_DEFAULT_SPEC
-#define CC1_ENDIAN_DEFAULT_SPEC "%(cc1_endian_little)"
-
 #undef DEFAULT_ASM_ENDIAN
 #define DEFAULT_ASM_ENDIAN " -mlittle"

[4.8, PATCH 23/26] Backport Power8 and LE support: PR60137, PR60203

2014-03-19 Thread Bill Schmidt

Hi,

This patch (diff-pr60137-pr60203) backports fixes for two little-endian
vector mode problems.

Thanks,
Bill


[gcc]

2014-03-19  Bill Schmidt  wschm...@linux.vnet.ibm.com

Backport from mainline r207699.
2014-02-11  Michael Meissner  meiss...@linux.vnet.ibm.com

PR target/60137
* config/rs6000/rs6000.md (128-bit GPR splitter): Add a splitter
for VSX/Altivec vectors that land in GPR registers.

Backport from mainline r207808.
2014-02-15  Michael Meissner  meiss...@linux.vnet.ibm.com

PR target/60203
* config/rs6000/rs6000.md (rreg): Add TFmode, TDmode constraints.
(mov<mode>_internal, TFmode/TDmode): Split TFmode/TDmode moves
into 64-bit and 32-bit moves.  On 64-bit moves, add support for
using direct move instructions on ISA 2.07.  Also adjust
instruction length for 64-bit.
(mov<mode>_64bit, TFmode/TDmode): Likewise.
(mov<mode>_32bit, TFmode/TDmode): Likewise.

Backport from mainline r207868.
2014-02-18  Michael Meissner  meiss...@linux.vnet.ibm.com

PR target/60203
* config/rs6000/rs6000.md (mov<mode>_64bit, TF/TDmode moves):
Split 64-bit moves into 2 patterns.  Do not allow the use of
direct move for TDmode in little endian, since the decimal value
has little endian bytes within a word, but the 64-bit pieces are
ordered in a big endian fashion, and normal subreg's of TDmode are
not allowed.
(mov<mode>_64bit_dm): Likewise.
(movtd_64bit_nodm): Likewise.

[gcc/testsuite]

2014-03-19  Bill Schmidt  wschm...@linux.vnet.ibm.com

Backport from mainline r207699.
2014-02-11  Michael Meissner  meiss...@linux.vnet.ibm.com

PR target/60137
* gcc.target/powerpc/pr60137.c: New file.

Backport from mainline r207808.
2014-02-15  Michael Meissner  meiss...@linux.vnet.ibm.com

PR target/60203
* gcc.target/powerpc/pr60203.c: New testsuite.


Index: gcc-4_8-test/gcc/config/rs6000/rs6000.md
===
--- gcc-4_8-test.orig/gcc/config/rs6000/rs6000.md
+++ gcc-4_8-test/gcc/config/rs6000/rs6000.md
@@ -378,6 +378,8 @@
 
 (define_mode_attr rreg [(SF   "f")
                         (DF   "ws")
+                        (TF   "f")
+                        (TD   "f")
                         (V4SF "wf")
                         (V2DF "wd")])
 
@@ -8990,10 +8992,40 @@
 ;; It's important to list Y->r and r->Y before r->r because otherwise
 ;; reload, given m->r, will try to pick r->r and reload it, which
 ;; doesn't make progress.
-(define_insn_and_split "*mov<mode>_internal"
+
+;; We can't split little endian direct moves of TDmode, because the words are
+;; not swapped like they are for TImode or TFmode.  Subregs therefore are
+;; problematical.  Don't allow direct move for this case.
+
+(define_insn_and_split "*mov<mode>_64bit_dm"
+  [(set (match_operand:FMOVE128 0 "nonimmediate_operand" "=m,d,d,Y,r,r,r,wm")
+       (match_operand:FMOVE128 1 "input_operand" "d,m,d,r,YGHF,r,wm,r"))]
+  "TARGET_HARD_FLOAT && TARGET_FPRS && TARGET_POWERPC64
+   && (<MODE>mode != TDmode || WORDS_BIG_ENDIAN)
+   && (gpc_reg_operand (operands[0], <MODE>mode)
+       || gpc_reg_operand (operands[1], <MODE>mode))"
+  "#"
+  "&& reload_completed"
+  [(pc)]
+{ rs6000_split_multireg_move (operands[0], operands[1]); DONE; }
+  [(set_attr "length" "8,8,8,12,12,8,8,8")])
+
+(define_insn_and_split "*movtd_64bit_nodm"
+  [(set (match_operand:TD 0 "nonimmediate_operand" "=m,d,d,Y,r,r")
+       (match_operand:TD 1 "input_operand" "d,m,d,r,YGHF,r"))]
+  "TARGET_HARD_FLOAT && TARGET_FPRS && TARGET_POWERPC64 && !WORDS_BIG_ENDIAN
+   && (gpc_reg_operand (operands[0], TDmode)
+       || gpc_reg_operand (operands[1], TDmode))"
+  "#"
+  "&& reload_completed"
+  [(pc)]
+{ rs6000_split_multireg_move (operands[0], operands[1]); DONE; }
+  [(set_attr "length" "8,8,8,12,12,8")])
+
+(define_insn_and_split "*mov<mode>_32bit"
   [(set (match_operand:FMOVE128 0 "nonimmediate_operand" "=m,d,d,Y,r,r")
        (match_operand:FMOVE128 1 "input_operand" "d,m,d,r,YGHF,r"))]
-  "TARGET_HARD_FLOAT && TARGET_FPRS
+  "TARGET_HARD_FLOAT && TARGET_FPRS && !TARGET_POWERPC64
    && (gpc_reg_operand (operands[0], <MODE>mode)
        || gpc_reg_operand (operands[1], <MODE>mode))"
   "#"
@@ -9429,6 +9461,15 @@
   [(set_attr "length" "12")
    (set_attr "type" "three")])
 
+(define_split
+  [(set (match_operand:FMOVE128_GPR 0 "nonimmediate_operand" "")
+       (match_operand:FMOVE128_GPR 1 "input_operand" ""))]
+  "reload_completed
+   && (int_reg_operand (operands[0], <MODE>mode)
+       || int_reg_operand (operands[1], <MODE>mode))"
+  [(pc)]
+{ rs6000_split_multireg_move (operands[0], operands[1]); DONE; })
+
+
 ;; Move SFmode to a VSX from a GPR register.  Because scalar floating point
 ;; type is stored internally as double precision in the VSX registers, we have
 ;; to convert it from the vector format.
Index: gcc-4_8-test/gcc/testsuite/gcc.target/powerpc/pr60137.c

Re: [Fortran][PATCH][gomp4]: Transform OpenACC loop directive

2014-03-19 Thread Tobias Burnus


Hi Ilmir,

Ilmir Usmanov:
This patch implements transformation of OpenACC loop directive from 
Fortran AST to GENERIC.


If I followed correctly, with this patch the Fortran FE implementation 
of OpenACC is complete, except for:


* !$acc cache() - parsing supported, but then aborting with a 
not-implemented error

* OpenACC 2.0a additions.

Am I right?

Successfully bootstrapped and tested with no new regressions on 
x86_64-unknown-linux-gnu.

OK for gomp4 branch?


I leave the review of the gcc/tree-pretty-print.c part (looks good to me) to 
Thomas, who might also have a comment on the Fortran part.


For a DO loop, the code looks okay.


For DO CONCURRENT, it is not. I think we should seriously consider 
rejecting DO CONCURRENT with a "not permitted" error; it is currently not 
explicitly supported by OpenACC, and we can still worry about it 
when it is explicitly added to OpenACC. Otherwise, see 
gfc_trans_do_concurrent for how to handle DO CONCURRENT loops.


Issues with DO CONCURRENT:

* You use code->ext.iterator->var - that's fine with DO but not with 
DO CONCURRENT, which uses a code->ext.forall_iterator


* Do concurrent also handles multiple variables in a single statement, 
such as:


integer :: i, j, b(3,5)
DO CONCURRENT(i=1:3, j=1:5:2)
  b(i, j) = -42
END DO
end

* And do concurrent also supports masks:

logical :: my_mask(3)
integer :: i, b(3)
b = [5, 5, 2]
my_mask = [.true., .false., .true.]
do concurrent (i=1:3, b(i) == 5 .and. my_mask(i))
  b(i) = -42
end do
end

Tobias

Re: [Patch, Fortran] PRs 60283/60543: Fix two wrong-code bugs related for implicit pure

2014-03-19 Thread Tobias Burnus

Early *ping*  - I think this wrong-code GCC 4.7/4.8/4.9 issue is pretty 
severe.


Tobias Burnus wrote:
This patch fixes two issues, where gfortran claims that a function is 
implicit pure, but it is not. That will cause a wrong-code 
optimization in the middle end.


First problem, cf. PR60543, is that implicit pure was not set to 0 for 
calls to impure intrinsic subroutines. (BTW: There are no impure 
intrinsic functions.) Example:


  module m
  contains
REAL(8) FUNCTION random()
  CALL RANDOM_NUMBER(random)
END FUNCTION random
  end module m


The second problem pops up if one adds a BLOCK ... END BLOCK around 
the random_number call after applying the patch of the PR, which just 
does: gfc_current_ns->proc_name->attr.implicit_pure = 0.


The problem is that one sets only the implicit_pure of the block to 0 
and not of the function. That's the reason that the patch became much 
longer and that I added gfc_unset_implicit_pure as new function.
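
The BLOCK problem described above can be illustrated with a small model. This is only a sketch in C++ under stated assumptions: `Namespace`, `Symbol`, `unset_innermost` and `unset_enclosing` are hypothetical stand-ins and do not reproduce gfortran's actual data structures or the body of gfc_unset_implicit_pure. It shows why clearing the flag on the innermost namespace owner is not enough when the impure call sits inside a BLOCK.

```cpp
#include <cassert>

// Hypothetical, heavily simplified stand-ins for the front end's namespace
// and symbol structures; only the fields needed for the illustration exist.
struct Symbol {
  bool is_procedure;   // false for the symbol standing in for a BLOCK
  bool implicit_pure;  // the flag under discussion
};

struct Namespace {
  Namespace *parent;  // enclosing namespace, nullptr at the outermost level
  Symbol *owner;      // symbol this namespace belongs to (cf. proc_name)
};

// Incomplete approach: clears the flag only on the innermost owner, which
// is the BLOCK rather than the function when the call is inside a BLOCK.
inline void unset_innermost(Namespace *current) {
  current->owner->implicit_pure = false;
}

// Walk-up approach: clear the flag on every owner from the current
// namespace outwards, stopping after the first real procedure.
inline void unset_enclosing(Namespace *current) {
  for (Namespace *ns = current; ns != nullptr; ns = ns->parent) {
    ns->owner->implicit_pure = false;
    if (ns->owner->is_procedure)
      break;
  }
}

// Models a function containing BLOCK ... END BLOCK with an impure call
// inside, and reports whether the function is still marked implicit pure.
inline bool function_still_pure(void (*unset)(Namespace *)) {
  Symbol func = {true, true};
  Symbol block = {false, true};
  Namespace func_ns = {nullptr, &func};
  Namespace block_ns = {&func_ns, &block};
  unset(&block_ns);  // impure call detected while resolving the BLOCK
  return func.implicit_pure;
}

inline void self_check() {
  assert(function_still_pure(unset_innermost));   // flag wrongly survives
  assert(!function_still_pure(unset_enclosing));  // flag correctly cleared
}
```

In this model, only the walk-up variant leaves the enclosing function marked as not implicit pure, which is the behaviour the middle end relies on.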


Thus, the suspicion I had when reviewing the OpenACC patches turned 
out to be founded. Cf. PR60283.


Built and regtested on x86-64-gnu-linux.
OK for the trunk and for the 4.7 and 4.8 branches?

Note: I failed to create a test case.

Tobias

Re: [4.8, PATCH 0/26] Backport Power8 and LE support

2014-03-19 Thread Jakub Jelinek

On Wed, Mar 19, 2014 at 02:23:58PM -0500, Bill Schmidt wrote:
 Support for Power8 features and the new powerpc64le-linux-gnu target,
 including the ELFv2 ABI, has been developed up till now on the
 ibm/gcc-4_8-branch.  It was appropriate to use this separate branch
 while the support was unstable, but this branch will not represent a
 particularly good support mechanism for distributions going forward.
 Most distros are set up to pull from the major release branches, and
 having a separate branch for one target is quite inconvenient.  Also,
 the ibm/gcc-4_8-branch's original purpose is to serve as the code base
 for IBM's Advance Toolchain 7.0.  Over time the two purposes that the
 branch currently serves will diverge and make things even more
 complicated.
 
 The code is now tested and stable enough that we are ready to backport
 this support to the FSF 4.8 branch.  This patch series constitutes that
 backport.

I guess the most important question is what guarantees there are that it
won't affect non-powerpc* ports too much (my main concern is the 9/26 patch,
plus the C++ FE / libstdc++ changes), and how much does this affect
code generation and overall stability of the PowerPC big endian existing
targets.

Jakub

Re: [4.8, PATCH 0/26] Backport Power8 and LE support

2014-03-19 Thread David Edelsohn

On Wed, Mar 19, 2014 at 4:05 PM, Jakub Jelinek ja...@redhat.com wrote:
 On Wed, Mar 19, 2014 at 02:23:58PM -0500, Bill Schmidt wrote:
 Support for Power8 features and the new powerpc64le-linux-gnu target,
 including the ELFv2 ABI, has been developed up till now on the
 ibm/gcc-4_8-branch.  It was appropriate to use this separate branch
 while the support was unstable, but this branch will not represent a
 particularly good support mechanism for distributions going forward.
 Most distros are set up to pull from the major release branches, and
 having a separate branch for one target is quite inconvenient.  Also,
 the ibm/gcc-4_8-branch's original purpose is to serve as the code base
 for IBM's Advance Toolchain 7.0.  Over time the two purposes that the
 branch currently serves will diverge and make things even more
 complicated.

 The code is now tested and stable enough that we are ready to backport
 this support to the FSF 4.8 branch.  This patch series constitutes that
 backport.

 I guess the most important question is what guarantees there are that it
 won't affect non-powerpc* ports too much (my main concern is the 9/26 patch,
 plus the C++ FE / libstdc++ changes), and how much does this affect
 code generation and overall stability of the PowerPC big endian existing
 targets.

Before this patch is approved, we are going to thoroughly confirm that
it does not harm any other PowerPC targets (big endian PowerLinux,
eABI, or AIX). Any help with testing from the PPC eABI community is
appreciated.

- David

Re: [Patch, Fortran] PRs 60283/60543: Fix two wrong-code bugs related for implicit pure

2014-03-19 Thread Paul Richard Thomas

Dear Tobias,

The patch looks OK to me.  If nothing else, it offers a
rationalisation of all the lines of code that unset the attribute!

I am somewhat puzzled by "Note: I failed to create a test case",
whereas I find one at the end of the patch.  Can you explain what you
mean?

Cheers

Paul

On 19 March 2014 21:21, Tobias Burnus bur...@net-b.de wrote:
 Early *ping*  - I think this wrong-code GCC 4.7/4.8/4.9 issue is pretty
 severe.


 Tobias Burnus wrote:

 This patch fixes two issues, where gfortran claims that a function is
 implicit pure, but it is not. That will cause a wrong-code optimization in
 the middle end.

 First problem, cf. PR60543, is that implicit pure was not set to 0 for
 calls to impure intrinsic subroutines. (BTW: There are no impure intrinsic
 functions.) Example:

   module m
   contains
 REAL(8) FUNCTION random()
   CALL RANDOM_NUMBER(random)
 END FUNCTION random
   end module m


 The second problem pops up if one adds a BLOCK ... END BLOCK around the
 random_number call after applying the patch of the PR, which just does:
 gfc_current_ns->proc_name->attr.implicit_pure = 0.

 The problem is that one sets only the implicit_pure of the block to 0 and
 not of the function. That's the reason that the patch became much longer and
 that I added gfc_unset_implicit_pure as new function.

 Thus, the suspicion I had when reviewing the OpenACC patches turned out to
 be founded. Cf. PR60283.

 Build and regtested on x86-64-gnu-linux.
 OK for the trunk and for the 4.7 and 4.8 branches?

 Note: I failed to create a test case.

 Tobias





-- 
The knack of flying is learning how to throw yourself at the ground and miss.
   --Hitchhikers Guide to the Galaxy

[C++ PATCH] Fix ICE in build_zero_init_1 (PR c++/60572)

2014-03-19 Thread Jakub Jelinek

Hi!

On the following testcase starting with r199779 we have a FIELD_DECL with
error_mark_node type, on which we ICE.  Fixed by ignoring such FIELD_DECLs.

Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?

2014-03-19  Jakub Jelinek  ja...@redhat.com

PR c++/60572
* init.c (build_zero_init_1): Ignore fields with error_mark_node
type.

* g++.dg/init/pr60572.C: New test.

--- gcc/cp/init.c.jj  2014-03-10 10:50:14.0 +0100
+++ gcc/cp/init.c   2014-03-19 07:43:54.077795662 +0100
@@ -192,6 +192,9 @@ build_zero_init_1 (tree type, tree nelts
  if (TREE_CODE (field) != FIELD_DECL)
continue;
 
+ if (TREE_TYPE (field) == error_mark_node)
+   continue;
+
  /* Don't add virtual bases for base classes if they are beyond
 the size of the current field, that means it is present
 somewhere else in the object.  */
--- gcc/testsuite/g++.dg/init/pr60572.C.jj  2014-03-19 07:46:33.607894844 +0100
+++ gcc/testsuite/g++.dg/init/pr60572.C 2014-03-19 07:46:49.752804722 +0100
@@ -0,0 +1,13 @@
+// PR c++/60572
+// { dg-do compile }
+
+struct A
+{
+  A x; // { dg-error "incomplete type" }
+  virtual ~A () {}
+};
+
+struct B : A
+{
+  B () : A () {}
+};

Jakub

Re: [RFA jit v2 1/2] introduce class toplev

2014-03-19 Thread Tom Tromey

David> OK.  Are you able to push this to my branch, or do you need me to do
David> this?

Thanks, I was able to push them.

Tom

Re: [4.8, PATCH 0/26] Backport Power8 and LE support

2014-03-19 Thread Bill Schmidt

On Wed, 2014-03-19 at 21:05 +0100, Jakub Jelinek wrote:

 I guess the most important question is what guarantees there are that it
 won't affect non-powerpc* ports too much (my main concern is the 9/26 patch,
 plus the C++ FE / libstdc++ changes), and how much does this affect
 code generation and overall stability of the PowerPC big endian existing
 targets.
 
   Jakub
 

The three pieces that are somewhat controversial for non-powerpc targets
are 9/26, 10/26, 15/26.

 * Uli and Alan, can you speak to any concerns for 9/26?

 * 10/26 hits libstdc++, but only in a minor way for the extract_symvers
script; it adds a sed to ignore a string added for powerpc64le, so
shouldn't be a problem.

 * 15/26 might be one we can do without.  I need to check with Peter
Bergner, who originally backported Fabien's patch, but unfortunately he
is on vacation.  That patch fixed a problem that originated on an x86
platform.  I can try respinning the patch series without this one and
see what breaks, or if Peter happens to see this while he's on vacation,
perhaps he can comment.

For PowerPC targets, I have already checked out powerpc64-linux (big
endian).  As David mentioned, I need to apply the patch series on an AIX
machine and test it before this can be accepted.  We don't have any way
of testing the eabi stuff, so community help would be very much
appreciated there.

Thanks,
Bill

Re: [4.8, PATCH 0/26] Backport Power8 and LE support

2014-03-19 Thread Bill Schmidt

On Wed, 2014-03-19 at 16:03 -0500, Bill Schmidt wrote:
 On Wed, 2014-03-19 at 21:05 +0100, Jakub Jelinek wrote:
 
  I guess the most important question is what guarantees there are that it
  won't affect non-powerpc* ports too much (my main concern is the 9/26 patch,
  plus the C++ FE / libstdc++ changes), and how much does this affect
  code generation and overall stability of the PowerPC big endian existing
  targets.
  
  Jakub
  
 
 The three pieces that are somewhat controversial for non-powerpc targets
 are 9/26, 10/26, 15/26.

I forgot to mention that these bits have all been upstream in trunk
since last autumn, so there's been quite a bit of burn-in at that level.
Obviously that is not the same as being burned in on 4.8, but it does
help provide a bit of confidence.

Bill

 
  * Uli and Alan, can you speak to any concerns for 9/26?
 
  * 10/26 hits libstdc++, but only in a minor way for the extract_symvers
 script; it adds a sed to ignore a string added for powerpc64le, so
 shouldn't be a problem.
 
  * 15/26 might be one we can do without.  I need to check with Peter
 Bergner, who originally backported Fabien's patch, but unfortunately he
 is on vacation.  That patch fixed a problem that originated on an x86
 platform.  I can try respinning the patch series without this one and
 see what breaks, or if Peter happens to see this while he's on vacation,
 perhaps he can comment.
 
 For PowerPC targets, I have already checked out powerpc64-linux (big
 endian).  As David mentioned, I need to apply the patch series on an AIX
 machine and test it before this can be accepted.  We don't have any way
 of testing the eabi stuff, so community help would be very much
 appreciated there.
 
 Thanks,
 Bill

Re: [4.8, PATCH 0/26] Backport Power8 and LE support

2014-03-19 Thread Jeff Law


On 03/19/14 15:03, Bill Schmidt wrote:

On Wed, 2014-03-19 at 21:05 +0100, Jakub Jelinek wrote:


I guess the most important question is what guarantees there are that it
won't affect non-powerpc* ports too much (my main concern is the 9/26 patch,
plus the C++ FE / libstdc++ changes), and how much does this affect
code generation and overall stability of the PowerPC big endian existing
targets.

Jakub



The three pieces that are somewhat controversial for non-powerpc targets
are 9/26, 10/26, 15/26.

  * Uli and Alan, can you speak to any concerns for 9/26?
I've got no concerns about 9/26.  Uli, Alan and myself worked through 
this pretty thoroughly.  I've had those in the back of my mind as 
something we're going to want to make sure to pull in.


Jeff

PR libstdc++/60587

2014-03-19 Thread Jonathan Wakely


I'm debugging http://gcc.gnu.org/bugzilla/show_bug.cgi?id=60587 and
have found a number of problems.

Firstly, the bug report is correct, this overload dereferences the
__other argument without checking if that is OK:

  template<typename _Iterator, typename _Sequence, typename _InputIterator>
    inline bool
    __foreign_iterator_aux3(const _Safe_iterator<_Iterator, _Sequence>& __it,
                            _InputIterator __other,
                            std::true_type)

Secondly, in this testcase we should never even have reached that
overload, because we should have gone to this overload of _aux2:

  template<typename _Iterator, typename _Sequence, typename _OtherIterator>
    inline bool
    __foreign_iterator_aux2(const _Safe_iterator<_Iterator, _Sequence>& __it,
                const _Safe_iterator<_OtherIterator, _Sequence>& __other,
                std::input_iterator_tag)
    { return __it._M_get_sequence() != __other._M_get_sequence(); }

However that is not chosen by overload resolution because this is a better
match when __other is non-const:

  template<typename _Iterator, typename _Sequence, typename _InputIterator>
    inline bool
    __foreign_iterator_aux2(const _Safe_iterator<_Iterator, _Sequence>& __it,
                            _InputIterator __other,
                            std::random_access_iterator_tag)
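
To illustrate how a generic overload can beat a more specific one only for non-const arguments, here is a minimal sketch. It does not reproduce the libstdc++ signatures: it assumes the generic overload takes its argument by non-const lvalue reference, which the text above does not confirm, and `Safe`, `pick` and `pick_fixed` are hypothetical names.

```cpp
#include <cassert>

template <typename T> struct Safe {};  // hypothetical wrapper type

// Specific overload: only accepts Safe<U>, by const reference.
template <typename U>
int pick(const Safe<U>&) { return 1; }

// Generic overload: accepts any lvalue.  For a non-const Safe<int> lvalue
// T is deduced as Safe<int>, so the parameter is Safe<int>&, which is a
// better reference binding than const Safe<int>&.
template <typename T>
int pick(T&) { return 2; }

// One possible fix: take the generic argument by const reference too, so
// both bindings rank equally and partial ordering prefers the overload
// that is more specialized.
template <typename U>
int pick_fixed(const Safe<U>&) { return 1; }

template <typename T>
int pick_fixed(const T&) { return 2; }

inline void self_check() {
  Safe<int> s;
  const Safe<int> cs = Safe<int>();
  assert(pick(s) == 2);        // non-const: generic overload wins
  assert(pick(cs) == 1);       // const: specific overload wins
  assert(pick_fixed(s) == 1);  // specific overload wins either way
  assert(pick_fixed(cs) == 1);
}
```

The non-const case is decided by the rule that prefers binding to the less cv-qualified reference; the other cases fall through to partial ordering of the two templates.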

Fixing the overload resolution bug makes the testcase in the PR pass,
but the underlying problem of dereferencing an invalid iterator still
exists and can be shown by changing the testcase slightly:

#define _GLIBCXX_DEBUG
#include <vector>
int main() {
    std::vector<int> a;
    std::vector<long> b;
    a.push_back(1);
    a.insert(a.end(), b.begin(), b.end());
}

That still dereferences b.begin(). That too can be fixed (either
as suggested in the PR or by passing the begin and end iterators into
the __foreign_iter function), but I think there's still another
problem.

I'm looking again at the code that attempts to check if we have
contiguous storage:

  if (std::addressof(*(__it._M_get_sequence()->_M_base().end() - 1))
      - std::addressof(*(__it._M_get_sequence()->_M_base().begin()))
      == __it._M_get_sequence()->size() - 1)

Are we really sure that ensures contiguous iterators? What if we have
a deque with three blocks laid out in memory like this:
 
 1XXX3XXx2XXX

 ^  ^
 begin()end()

1 is the start of the first block, 2 is the start of the second block
and 3 is the start of the third block.
X is an element, x is reserved but uninitialized capacity
. is unallocated memory (or memory not used by the deque)

Here we have end() - begin() == size() but non-contiguous memory.
If the __other iterator happens to point to the unallocated memory
between 1 and 3 then it will appear to be part of the deque, but
isn't.

I think the safe thing to do is (as I suggested at the time) to have a
trait saying which iterator types refer to contiguous memory. Our
debug mode only supports our own containers, so the ones which are
contiguous are known.  For 4.9.0 I think the right option is simply
to remove __foreign_iterator_aux3 and __foreign_iterator_aux4
completely. The fixed version of __foreign_iterator_aux2() can detect
when we have iterators referring to the same sequence, which is what
we really want to detect. That's what the attached patch does and what
I'm going to test.


--- debug/functions.h.orig  2014-03-19 21:34:43.038647394 +0000
+++ debug/functions.h   2014-03-19 21:35:53.502617461 +0000
@@ -175,62 +175,6 @@
   return __first;
 }
 
-#if __cplusplus >= 201103L
-  // Default implementation.
-  template<typename _Iterator, typename _Sequence>
-    inline bool
-    __foreign_iterator_aux4(const _Safe_iterator<_Iterator, _Sequence>& __it,
-                            typename _Sequence::const_pointer __begin,
-                            typename _Sequence::const_pointer __other)
-    {
-      typedef typename _Sequence::const_pointer _PointerType;
-      constexpr std::less<_PointerType> __l{};
-
-      return (__l(__other, __begin)
-              || __l(std::addressof(*(__it._M_get_sequence()->_M_base().end()
-                                      - 1)), __other));
-    }
-
-  // Fallback when address type cannot be implicitely casted to sequence
-  // const_pointer.
-  template<typename _Iterator, typename _Sequence,
-           typename _InputIterator>
-    inline bool
-    __foreign_iterator_aux4(const _Safe_iterator<_Iterator, _Sequence>&,
-                            _InputIterator, ...)
-    { return true; }
-
-  template<typename _Iterator, typename _Sequence, typename _InputIterator>
-    inline bool
-    __foreign_iterator_aux3(const _Safe_iterator<_Iterator, _Sequence>& __it,
-                            _InputIterator __other,
-                            std::true_type)
-    {
-      // Only containers with all elements in contiguous memory can have their
-      // elements passed through pointers.
-      // Arithmetics is here just to make sure we are not dereferencing
-      // past-the-end iterator.
-      if

Re: [Patch, Fortran] PRs 60283/60543: Fix two wrong-code bugs related for implicit pure

2014-03-19 Thread Tobias Burnus


Paul Richard Thomas wrote:

The patch looks OK to me.  If nothing else, it offers a
rationalisation of all the lines of code that unset the attribute!

I am somewhat puzzled by "Note: I failed to create a test case",
whereas I find one at the end of the patch.  Can you explain what you
mean?


What I meant was that I failed to create a run-time test case that 
fails without the patch. However, after I wrote that, I saw that there 
is a dg-* directive that allows checking the .mod file for a string. 
That's why I could include a test case.


Committed to the trunk as Rev. 208687.

While looking at the patch again for backporting, I saw that I have 
missed the following parts. I will commit them tomorrow as obvious, 
unless someone protests.


Tobias
2014-03-19  Tobias Burnus  burnus@net-b.

	PR fortran/60543
	* io.c (check_io_constraints): Use gfc_unset_implicit_pure.
	* resolve.c (resolve_ordinary_assign): Ditto.

Index: gcc/fortran/io.c
===
--- gcc/fortran/io.c	(Revision 208687)
+++ gcc/fortran/io.c	(Arbeitskopie)
@@ -3259,9 +3259,8 @@ if (condition) \
 		 "an internal file in a PURE procedure",
 		 io_kind_name (k));
 
-      if (gfc_implicit_pure (NULL) && (k == M_READ || k == M_WRITE))
-	gfc_current_ns->proc_name->attr.implicit_pure = 0;
-
+  if (k == M_READ || k == M_WRITE)
+	gfc_unset_implicit_pure (NULL);
 }
 
   if (k != M_READ)
Index: gcc/fortran/resolve.c
===
--- gcc/fortran/resolve.c	(Revision 208687)
+++ gcc/fortran/resolve.c	(Arbeitskopie)
@@ -9165,7 +9165,7 @@ resolve_ordinary_assign (gfc_code *code, gfc_names
       if (lhs->expr_type == EXPR_VARIABLE
 	  && lhs->symtree->n.sym != gfc_current_ns->proc_name
 	  && lhs->symtree->n.sym->ns != gfc_current_ns)
-	gfc_current_ns->proc_name->attr.implicit_pure = 0;
+	gfc_unset_implicit_pure (NULL);
 
       if (lhs->ts.type == BT_DERIVED
 	  && lhs->expr_type == EXPR_VARIABLE
@@ -9173,11 +9173,11 @@ resolve_ordinary_assign (gfc_code *code, gfc_names
 	  && rhs->expr_type == EXPR_VARIABLE
 	  && (gfc_impure_variable (rhs->symtree->n.sym)
 	      || gfc_is_coindexed (rhs)))
-	gfc_current_ns->proc_name->attr.implicit_pure = 0;
+	gfc_unset_implicit_pure (NULL);
 
       /* Fortran 2008, C1283.  */
       if (gfc_is_coindexed (lhs))
-	gfc_current_ns->proc_name->attr.implicit_pure = 0;
+	gfc_unset_implicit_pure (NULL);
 }
 
   /* F2008, 7.2.1.2.  */

Re: PR libstdc++/60587

2014-03-19 Thread Jonathan Wakely


On 19/03/14 21:39 +0000, Jonathan Wakely wrote:

I think the safe thing to do is (as I suggested at the time) to have a
trait saying which iterator types refer to contiguous memory. Our
debug mode only supports our own containers, so the ones which are
contiguous are known.  For 4.9.0 I think the right option is simply
to remove __foreign_iterator_aux3 and __foreign_iterator_aux4
completely. The fixed version of __foreign_iterator_aux2() can detect
when we have iterators referring to the same sequence, which is what
we really want to detect. That's what the attached patch does and what
I'm going to test.


With my suggested change we get an XPASS for
testsuite/23_containers/vector/debug/57779_neg.cc

An __is_contiguous trait would solve that.
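
A trait of that kind could look roughly like the sketch below. The name `is_contiguous_container`, the choice to key it on container types rather than iterator types, and the set of specializations are all assumptions made for illustration; this is not an existing libstdc++ facility.

```cpp
#include <array>
#include <cstddef>
#include <deque>
#include <list>
#include <string>
#include <type_traits>
#include <vector>

// Hypothetical trait: true only for containers known to store their
// elements in one contiguous block.  Everything else defaults to false.
template <typename Container>
struct is_contiguous_container : std::false_type {};

template <typename T, typename Alloc>
struct is_contiguous_container<std::vector<T, Alloc>> : std::true_type {};

// vector<bool> is bit-packed, so its elements are not addressable objects.
template <typename Alloc>
struct is_contiguous_container<std::vector<bool, Alloc>> : std::false_type {};

template <typename Char, typename Traits, typename Alloc>
struct is_contiguous_container<std::basic_string<Char, Traits, Alloc>>
    : std::true_type {};

template <typename T, std::size_t N>
struct is_contiguous_container<std::array<T, N>> : std::true_type {};

static_assert(is_contiguous_container<std::vector<int>>::value,
              "vector<int> is contiguous");
static_assert(!is_contiguous_container<std::vector<bool>>::value,
              "vector<bool> is not");
static_assert(!is_contiguous_container<std::deque<int>>::value,
              "deque is block-based, not contiguous");
static_assert(!is_contiguous_container<std::list<int>>::value,
              "list is node-based");
```

Because the debug mode only has to cover the library's own containers, an explicit allow-list like this avoids inferring contiguity from pointer arithmetic at run time.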

Re: PR libstdc++/60587

2014-03-19 Thread Paolo Carlini

Hi

 On 19/mar/2014, at 23:28, Jonathan Wakely jwak...@redhat.com wrote:
 
 On 19/03/14 21:39 +0000, Jonathan Wakely wrote:
 I think the safe thing to do is (as I suggested at the time) to have a
 trait saying which iterator types refer to contiguous memory. Our
 debug mode only supports our own containers, so the ones which are
 contiguous are known.  For 4.9.0 I think the right option is simply
 to remove __foreign_iterator_aux3 and __foreign_iterator_aux4
 completely. The fixed version of __foreign_iterator_aux2() can detect
 when we have iterators referring to the same sequence, which is what
 we really want to detect. That's what the attached patch does and what
 I'm going to test.
 
 With my suggested change we get an XPASS for
 testsuite/23_containers/vector/debug/57779_neg.cc
 
 An __is_contiguous trait would solve that.

Funny, I thought we already had it...

Paolo

[patch committed SH] Fix target/60039

2014-03-19 Thread Kaz Kojima

I've committed the attached patch to fix PR target/60039
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=60039
which is a regression from 4.5 for some sh3 users.
Tested on sh4-unknown-linux-gnu with -mdiv=call-div1.
I'd like to backport it to 4.8 in a week or two as usual.

Regards,
kaz
--
2014-03-19  Kaz Kojima  kkoj...@gcc.gnu.org

PR target/60039
* config/sh/sh.md (udivsi3_i1): Clobber R1 register.

--- ORIG/trunk/gcc/config/sh/sh.md  2014-03-02 09:49:58.0 +0900
+++ trunk/gcc/config/sh/sh.md   2014-03-18 14:43:26.515319735 +0900
@@ -2314,6 +2314,7 @@
(udiv:SI (reg:SI R4_REG) (reg:SI R5_REG)))
(clobber (reg:SI T_REG))
(clobber (reg:SI PR_REG))
+   (clobber (reg:SI R1_REG))
(clobber (reg:SI R4_REG))
    (use (match_operand:SI 1 "arith_reg_operand" "r"))]
   "TARGET_SH1 && TARGET_DIVIDE_CALL_DIV1"

Re: [C++ PATCH] [gomp4] Initial OpenACC support to C++ front-end

2014-03-19 Thread Joseph S. Myers

On Thu, 13 Mar 2014, Ilmir Usmanov wrote:

   * gcc/testsuite/c-c++-common/goacc/deviceptr-1.c: Move to ...
   * gcc/testsuite/gcc.dg/goacc/deviceptr-1.c ... here.
   * gcc/testsuite/g++.dg/goacc/goacc.exp: New test directory.
   * gcc/testsuite/g++.dg/goacc-gomp/goacc-gomp.exp: Likewise.

The ChangeLog file is in gcc/testsuite/, so paths should be given
relative to that directory (i.e. without the gcc/testsuite/ part).

   gcc/testsuite/g++.dg/goacc/
   * deviceptr-1.cpp: New test.
   * sb-1.cpp: Likewise.
   * sb-2.cpp: Likewise. 

Here, each entry should contain the g++.dg/goacc/ part.  And the
ChangeLog entry should be updated for the change in filenames to *.C.

 +  for (t = vars; t && t; t = TREE_CHAIN (t))

This use of t && t seems odd.

 +  c_parser_omp_var_list_parens() should construct a list of

No use of () when referring to a function in a comment.

 +static tree
 +cp_parser_oacc_all_clauses (cp_parser *parser, omp_clause_mask mask,
 +const char *where, cp_token *pragma_tok, 
 +bool finish_p = true)

No caller seems to set this finish_p argument, so I don't see a need
for it.

 +/* OpenACC 2.0:
 +   # pragma acc data oacc-data-clause[optseq] new-line
 + structured-block
 +
 +   LOC is the location of the #pragma token.
 +*/

 +static tree
 +cp_parser_oacc_data (cp_parser *parser, cp_token *pragma_tok)

There's no parameter LOC, so it seems wrong for the comment to mention
one.  (This applies to other functions with such a comment as well.)

Observations on the tests: I don't see anything testing diagnostics
for the case where it's a return statement that branches out of a
block for which that isn't permitted (you have tests for goto and
switch statements doing such branches) - is that because such tests
are also missing for C?

There are questions of how OpenACC constructs interact with C++
features not present in C.  I think a lot of such questions would
apply more to the implementation of the routine directive than to the
things in this patch (as there may well be C++ features not readily
supported on an accelerator).  For the features in this patch, I
suppose exception handling is another form of invalid jump out of a
structured block, but it must be considered undefined behavior at
runtime because it can't be detected at compile time.  I guess
something to include in the testsuite is testing use of OpenACC
directives within templates.  Thus, you have a diagnostic for
non-pointer variables being used in a deviceptr clause; the testsuite
should verify that if the clause is used within a template, and the
type of the variable depends on the type for which a template is
instantiated, you only get the error for an instantiation giving it a
non-pointer type, not if all instantiations give it a pointer type.
(Generally, this applies to any check of something that can only be
determined for a particular instantiation.)
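
As a rough sketch of the shape such a template test might take (an assumption, not part of the patch): the function name `use_deviceptr` is hypothetical, the DejaGnu directives are omitted, and a compiler without OpenACC support simply ignores the pragma, possibly with an unknown-pragma warning.

```cpp
#include <cassert>

// The type of p depends on the template argument, so whether p is a
// pointer can only be decided for each instantiation.
template <typename T>
int use_deviceptr(T p)
{
  int visited = 0;
#pragma acc data deviceptr(p)
  {
    visited = 1;  // no branch out of the structured block
  }
  (void) p;
  return visited;
}

// Pointer instantiation: should be accepted without a diagnostic.
template int use_deviceptr<float *>(float *);

// Non-pointer instantiation: with OpenACC enabled, only this one is
// expected to trigger the deviceptr error, so it stays commented out here.
// template int use_deviceptr<int>(int);

inline void check_pointer_instantiation()
{
  float *p = nullptr;
  assert(use_deviceptr(p) == 1);
}
```

A real testsuite file would add the appropriate dg-error annotation to the non-pointer instantiation and enable it.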

-- 
Joseph S. Myers
jos...@codesourcery.com
