date:20111026

Re: Reload related segfaults

2011-10-26 Thread David Miller

From: David Miller 
Date: Wed, 26 Oct 2011 23:07:11 -0400 (EDT)

> From: Alan Modra 
> Date: Thu, 27 Oct 2011 13:29:56 +1030
> 
>> Some recent patch has exposed a reload bug.  I'm seeing
> 
> I think this might be a side effect or Eric's recent changes,
> CC:'d.

Eric, I'm seeing a similar segmentation fault in reload on sparc64.
But it's in a slightly different place than Alan's crash.

Simply compile gcc.target/ultrasp12.c with "-m64 -O2 -mcpu=ultrasparc -mvis"
to see this.

The crash is in find_valid_class() called from push_reload(), via this
code block around line 1184 of reload.c:

  enum reg_class in_out_class
= find_valid_class (outmode, GET_MODE (SUBREG_REG (out)),
subreg_regno_offset (REGNO (SUBREG_REG (out)),
 GET_MODE (SUBREG_REG (out)),
 SUBREG_BYTE (out),
 GET_MODE (out)),
REGNO (SUBREG_REG (out)));

'out' is:

(subreg:DI (reg/v:V4QI 50 %f18 [orig:314 s2hi4_ ] [314]) 0)

so subreg_regno_offset() returns -1, and find_valid_class() isn't too happy
about getting "-1" for it's 'n' argument.

I suspect the test that you removed in order to fix rtl-optimization/46603
would guard against this happening.

And indeed, I confirmed that backing out the rtl-optimization/46603
fix makes the segmentation fault go away.

[PATCH] PR49720, infinite recursion in RTX simplification

2011-10-26 Thread Chung-Lin Tang

Hi,
this patch is for a specific case in RTX simplification where infinite
recursion ensues, causing the out-of-stack segfault in PR49720.

Tracking back the origin of this bug, the exact revision causing this
was rev.92429, back in 2004.

The clause in simplify-rtx.c:simplify_relational_operation_1(),
transforming X + cst1 == cst2 into X == cst2 - cst1 can cause an
oscillating recursive cycle like this:

  A + B == C<===>   C - B == A

When all A, B, and C are CONSTANT_P, but not simplifiable (in the PR
testcase, they are all SYMBOL_REFs).  The switching of XEXP(op0,1) to
the other side of the equation, plus commutative canonicalizing then
completes the cycle.

This can be solved by using simplify_binary_relation() and returning
NULL_RTX when simplification of the constant PLUS/MINUS fails, but I'm
not sure that further working on X == CST2-CST1 will never yield more
optimization, so the patch here just tries to plug the recursion hole
for the exact condition.

Bootstrapped and tested on i686 and x86_64 without regressions. Is this
okay for trunk?

Thanks,
Chung-Lin

2011-10-27  Chung-Lin Tang  

PR rtl-optimization/49720
* simplify-rtx.c (simplify_relational_operation_1): Detect
infinite recursion condition in "(eq/ne (plus x cst1) cst2)
simplifies to (eq/ne x (cst2 - cst1))" case.

testsuite/
* g++.dg/torture/pr49720.C: New test.
Index: trunk/gcc/simplify-rtx.c
===
--- trunk/gcc/simplify-rtx.c(revision 180421)
+++ trunk/gcc/simplify-rtx.c(working copy)
@@ -4352,10 +4352,20 @@
 {
   rtx x = XEXP (op0, 0);
   rtx c = XEXP (op0, 1);
+  enum rtx_code invcode = op0code == PLUS ? MINUS : PLUS;
+  rtx tem = simplify_gen_binary (invcode, cmp_mode, op1, c);
 
-  c = simplify_gen_binary (op0code == PLUS ? MINUS : PLUS,
-  cmp_mode, op1, c);
-  return simplify_gen_relational (code, mode, cmp_mode, x, c);
+  /* Detect an infinite recursive condition, where we oscillate at this
+simplification case between:
+   A + B == C  <--->  C - B == A,
+where A, B, and C are all constants with non-simplifiable expressions,
+usually SYMBOL_REFs.  */
+  if (GET_CODE (tem) == invcode
+ && CONSTANT_P (x)
+ && rtx_equal_p (c, XEXP (tem, 1)))
+   return NULL_RTX;
+
+  return simplify_gen_relational (code, mode, cmp_mode, x, tem);
 }
 
   /* (ne:SI (zero_extract:SI FOO (const_int 1) BAR) (const_int 0))) is
Index: trunk/gcc/testsuite/g++.dg/torture/pr49720.C
===
--- trunk/gcc/testsuite/g++.dg/torture/pr49720.C(revision 0)
+++ trunk/gcc/testsuite/g++.dg/torture/pr49720.C(revision 0)
@@ -0,0 +1,8 @@
+/* { dg-do compile } */
+
+extern char t_start[], t_end[], t_size[];
+bool foo (void)
+{
+  long size = reinterpret_cast(t_size);
+  return (size == t_end - t_start);
+}

[PATCH, i386]: Avoid combining registers from different units in a single alternative.

2011-10-26 Thread Uros Bizjak

Hello!

See comment above inline_secondary_memory_needed function in i386.c.

2011-08-27  Uros Bizjak  

PR target/37191
* config/i386/sse.md (*vec_extract_v4sf_mem): Avoid combining registers
from different units in a single alternative.

Tested on x86_64-pc-linux-gnu {,-m32}, committed to mainline SVN.

Uros.
Index: sse.md
===
--- sse.md  (revision 180559)
+++ sse.md  (working copy)
@@ -3866,6 +3866,62 @@
   DONE;
 })
 
+(define_insn_and_split "*sse4_1_extractps"
+  [(set (match_operand:SF 0 "nonimmediate_operand" "=rm,x,x")
+   (vec_select:SF
+ (match_operand:V4SF 1 "register_operand" "x,0,x")
+ (parallel [(match_operand:SI 2 "const_0_to_3_operand" "n,n,n")])))]
+  "TARGET_SSE4_1"
+  "@
+   %vextractps\t{%2, %1, %0|%0, %1, %2}
+   #
+   #"
+  "&& reload_completed && SSE_REG_P (operands[0])"
+  [(const_int 0)]
+{
+  rtx dest = gen_rtx_REG (V4SFmode, REGNO (operands[0]));
+  switch (INTVAL (operands[2]))
+{
+case 1:
+case 3:
+  emit_insn (gen_sse_shufps_v4sf (dest, operands[1], operands[1],
+ operands[2], operands[2],
+ GEN_INT (INTVAL (operands[2]) + 4),
+ GEN_INT (INTVAL (operands[2]) + 4)));
+  break;
+case 2:
+  emit_insn (gen_vec_interleave_highv4sf (dest, operands[1], operands[1]));
+  break;
+default:
+  /* 0 should be handled by the *vec_extractv4sf_0 pattern above.  */
+  gcc_unreachable ();
+}
+  DONE;
+}
+  [(set_attr "isa" "*,noavx,avx")
+   (set_attr "type" "sselog,*,*")
+   (set_attr "prefix_data16" "1,*,*")
+   (set_attr "prefix_extra" "1,*,*")
+   (set_attr "length_immediate" "1,*,*")
+   (set_attr "prefix" "maybe_vex,*,*")
+   (set_attr "mode" "V4SF,*,*")])
+
+(define_insn_and_split "*vec_extract_v4sf_mem"
+  [(set (match_operand:SF 0 "register_operand" "=x,*r,f")
+   (vec_select:SF
+(match_operand:V4SF 1 "memory_operand" "o,o,o")
+(parallel [(match_operand 2 "const_0_to_3_operand" "n,n,n")])))]
+  "TARGET_SSE"
+  "#"
+  "&& reload_completed"
+  [(const_int 0)]
+{
+  int i = INTVAL (operands[2]);
+
+  emit_move_insn (operands[0], adjust_address (operands[1], SFmode, i*4));
+  DONE;
+})
+
 (define_expand "avx_vextractf128"
   [(match_operand: 0 "nonimmediate_operand" "")
(match_operand:V_256 1 "register_operand" "")
@@ -4044,62 +4100,6 @@
(set_attr "prefix" "vex")
(set_attr "mode" "OI")])
 
-(define_insn_and_split "*sse4_1_extractps"
-  [(set (match_operand:SF 0 "nonimmediate_operand" "=rm,x,x")
-   (vec_select:SF
- (match_operand:V4SF 1 "register_operand" "x,0,x")
- (parallel [(match_operand:SI 2 "const_0_to_3_operand" "n,n,n")])))]
-  "TARGET_SSE4_1"
-  "@
-   %vextractps\t{%2, %1, %0|%0, %1, %2}
-   #
-   #"
-  "&& reload_completed && SSE_REG_P (operands[0])"
-  [(const_int 0)]
-{
-  rtx dest = gen_rtx_REG (V4SFmode, REGNO (operands[0]));
-  switch (INTVAL (operands[2]))
-{
-case 1:
-case 3:
-  emit_insn (gen_sse_shufps_v4sf (dest, operands[1], operands[1],
- operands[2], operands[2],
- GEN_INT (INTVAL (operands[2]) + 4),
- GEN_INT (INTVAL (operands[2]) + 4)));
-  break;
-case 2:
-  emit_insn (gen_vec_interleave_highv4sf (dest, operands[1], operands[1]));
-  break;
-default:
-  /* 0 should be handled by the *vec_extractv4sf_0 pattern above.  */
-  gcc_unreachable ();
-}
-  DONE;
-}
-  [(set_attr "isa" "*,noavx,avx")
-   (set_attr "type" "sselog,*,*")
-   (set_attr "prefix_data16" "1,*,*")
-   (set_attr "prefix_extra" "1,*,*")
-   (set_attr "length_immediate" "1,*,*")
-   (set_attr "prefix" "maybe_vex,*,*")
-   (set_attr "mode" "V4SF,*,*")])
-
-(define_insn_and_split "*vec_extract_v4sf_mem"
-  [(set (match_operand:SF 0 "register_operand" "=x*rf")
-   (vec_select:SF
-(match_operand:V4SF 1 "memory_operand" "o")
-(parallel [(match_operand 2 "const_0_to_3_operand" "n")])))]
-  "TARGET_SSE"
-  "#"
-  "&& reload_completed"
-  [(const_int 0)]
-{
-  int i = INTVAL (operands[2]);
-
-  emit_move_insn (operands[0], adjust_address (operands[1], SFmode, i*4));
-  DONE;
-})
-
 ;; Modes handled by vec_extract patterns.
 (define_mode_iterator VEC_EXTRACT_MODE
   [(V32QI "TARGET_AVX") V16QI

Re: [C++ Patch / RFC] PR 50870

2011-10-26 Thread Jason Merrill


On 10/26/2011 07:00 PM, Paolo Carlini wrote:

.. maybe my message wasn't clear, sorry, I'm a bit tired (here it's
late): I meant to say that the non_reference tweak fixes the
non-template impl class case, but something more is needed for a
template impl (thus the new testcase). And, additionally, this issue is
a [4.6/4.7 Regression], thus, post 4.6.2, we may be interested in back
porting something.


Ah.


Let me know if you need additional details, or what else.


How is the template case failing?


Or we want me to look into a different way to attack the template case.


Please.

Jason

Re: Go patch committed: Implement new syscall package

2011-10-26 Thread Ian Lance Taylor

Rainer Orth  writes:

> the problem is another one: using /usr/xpg4/bin/awk, I find:
>
> /usr/xpg4/bin/awk: line 47 (NR=32): wrong number of arguments to function "m"
>
> nawk(1) only documents match(s,ere) (i.e. two args), and the gawk docs
> state:
>
> `match(STRING, REGEXP [, ARRAY])'
> [...]
>  The ARRAY argument to `match' is a `gawk' extension.  In
>  compatibility mode (*note Options::), using a third argument is a
>  fatal error.

I committed this patch which should fix this problem.  Bootstrapped and
ran Go testsuite on x86_64-unknown-linux-gnu.

Ian

diff -r 5f4f4eae5cd9 libgo/go/syscall/mksyscall.awk
--- a/libgo/go/syscall/mksyscall.awk	Wed Oct 26 16:19:57 2011 -0700
+++ b/libgo/go/syscall/mksyscall.awk	Wed Oct 26 21:53:05 2011 -0700
@@ -44,33 +44,63 @@
 	blocking = 1
 }
 
-if (match($0, "//sys(nb)?[ 	]*([a-zA-Z0-9_]+)\\(([^()]*)\\) *(\\(([^()]+)\\))?", gosig) == 0) {
+line = $0
+
+if (match(line, "//sys(nb)?[ 	]*[a-zA-Z0-9_]+\\([^()]*\\) *(\\(([^()]+)\\))?") == 0) {
 	print "unmatched line:", $0 | "cat 1>&2"
 	status = 1
 	next
 }
 
-gofnname = gosig[2]
-gofnparams = gosig[3]
-gofnresults = gosig[5]
+# Sets a[1] = //sysnb, a[2] == function name.
+split(line, a, "[ 	(]*")
+gofnname = a[2]
+
+off = match(line, "\\([^()]*\\)")
+end = index(substr(line, off, length(line) - off + 1), ")")
+gofnparams = substr(line, off + 1, end - 2)
+
+line = substr(line, off + end, length(line) - (off + end) + 1)
+off = match(line, "\\([^()]*\\)")
+if (off == 0) {
+	gofnresults = ""
+} else {
+	end = index(substr(line, off, length(line) - off + 1), ")")
+	gofnresults = substr(line, off + 1, end - 2)
+}
 
 getline
+line = $0
 
-if (match($0, "//([a-zA-Z0-9_]+)\\(([^()]*)\\) *(.*)$", csig) == 0) {
+if (match(line, "//[a-zA-Z0-9_]+\\([^()]*\\)") == 0) {
 	print "unmatched C line", $0, "after", gofnname | "cat 1>&2"
 	status = 1
 	next
 }
 
-cfnname = csig[1]
-cfnparams = csig[2]
-cfnresult = csig[3]
+split(line, a, "[ 	(]*")
+cfnname = substr(a[1], 3, length(a[1]) - 2)
+
+off = match(line, "\\([^()]*\\)")
+end = index(substr(line, off, length(line) - off + 1), ")")
+cfnparams = substr(line, off + 1, end - 2)
+
+line = substr(line, off + end + 1, length(line) - (off + end) + 1)
+while (substr(line, 1, 1) == " ") {
+	line = substr(line, 2, length(line) - 1)
+}
+end = index(line, " ")
+if (end != 0) {
+	line = substr(line, 1, end)
+}
+cfnresult = line
 
 printf("// Automatically generated wrapper for %s/%s\n", gofnname, cfnname)
 printf("func c_%s(%s) %s%s__asm__(\"%s\")\n",
 	   cfnname, cfnparams, cfnresult, cfnresult == "" ? "" : " ", cfnname)
-printf("func %s(%s) %s%s{\n",
-	   gofnname, gofnparams, gosig[4], gosig[4] == "" ? "" : " ")
+printf("func %s(%s) %s%s%s%s{\n",
+	   gofnname, gofnparams, gofnresults == "" ? "" : "(", gofnresults,
+	   gofnresults == "" ? "" : ")", gofnresults == "" ? "" : " ")
 
 if (blocking) {
 	print "\tentersyscall()"
@@ -91,22 +121,22 @@
 	args = args ", "
 	}
 
-	if (match(goargs[goarg], "^([^ ]*) ([^ ]*)$", goparam) == 0) {
+	if (split(goargs[goarg], a) != 2) {
 	print loc, "bad parameter:", goargs[goarg] | "cat 1>&2"
 	status = 1
 	next
 	}
 
-	goname = goparam[1]
-	gotype = goparam[2]
+	goname = a[1]
+	gotype = a[2]
 
-	if (match(cargs[carg], "^([^ ]*) ([^ ]*)$", cparam) == 0) {
+	if (split(cargs[carg], a) != 2) {
 	print loc, "bad C parameter:", cargs[carg] | "cat 1>&2"
 	status = 1
 	next
 	}
 
-	ctype = cparam[2]
+	ctype = a[2]
 
 	if (gotype ~ /^\*/) {
 	if (gotype != ctype) {

[PATCH] Fix thinko in previous sparc setcc changes.

2011-10-26 Thread David Miller


More extensive testing showed that we have to force the usage of
v9 scc patterns when comparing DImode values with comparison codes
other than EQ and NE.

Soon we'll be able to add exceptions this this, because VIS3 has addxc
and addxccc instructions which test the 64-bit carry condition.

Committed to trunk.

gcc/

* config/sparc/sparc.c (emit_scc_insn): Force attempt of v9 sequences
if we're comparing DImode and comparison is other than EQ or NE.

git-svn-id: svn+ssh://gcc.gnu.org/svn/gcc/trunk@180558 
138bc75d-0d04-0410-961f-82ee72b054a4
---
 gcc/ChangeLog|3 +++
 gcc/config/sparc/sparc.c |5 +
 2 files changed, 8 insertions(+), 0 deletions(-)

diff --git a/gcc/ChangeLog b/gcc/ChangeLog
index 3a51510..124b17c 100644
--- a/gcc/ChangeLog
+++ b/gcc/ChangeLog
@@ -1,5 +1,8 @@
 2011-10-26  David S. Miller  
 
+   * config/sparc/sparc.c (emit_scc_insn): Force attempt of v9 sequences
+   if we're comparing DImode and comparison is other than EQ or NE.
+
* config/sparc/sparc.c (emit_scc_insn): Do not try v9 sequences until
LEU/LTU/GEU/GTU is attempted.
* config/sparc/sparc.md (*neg_snesi_sign_extend): New 64-bit insn
diff --git a/gcc/config/sparc/sparc.c b/gcc/config/sparc/sparc.c
index 80e05a6..ea9fdef 100644
--- a/gcc/config/sparc/sparc.c
+++ b/gcc/config/sparc/sparc.c
@@ -2541,6 +2541,11 @@ emit_scc_insn (rtx operands[])
 }
 }
 
+  if (TARGET_V9
+  && GET_MODE (x) == DImode
+  && gen_v9_scc (operands[0], code, x, y))
+return true;
+
   /* We can do LTU and GEU using the addx/subx instructions too.  And
  for GTU/LEU, if both operands are registers swap them and fall
  back to the easy case.  */
-- 
1.7.6.401.g6a319

Re: Reload related segfaults

2011-10-26 Thread David Miller

From: Alan Modra 
Date: Thu, 27 Oct 2011 13:29:56 +1030

> Some recent patch has exposed a reload bug.  I'm seeing

I think this might be a side effect or Eric's recent changes,
CC:'d.

Reload related segfaults

2011-10-26 Thread Alan Modra

Some recent patch has exposed a reload bug.  I'm seeing

libtool: compile:  /home/amodra/build/gcc-curr/./gcc/xgcc 
-B/home/amodra/build/gcc-curr/./gcc/ -B/home/amodra/gnu/powerpc-linux/bin/ 
-B/home/amodra/gnu/powerpc-linux/lib/ -isystem 
/home/amodra/gnu/powerpc-linux/include -isystem 
/home/amodra/gnu/powerpc-linux/sys-include -m64 -fPIC -mstrict-align 
-DHAVE_CONFIG_H -I. 
-I/home/amodra/src/gcc-current/libjava/classpath/native/fdlibm -I../../include 
-fexceptions -fasynchronous-unwind-tables -g -O2 -m64 -fPIC -mstrict-align -MT 
sf_fabs.lo -MD -MP -MF .deps/sf_fabs.Tpo -c 
/home/amodra/src/gcc-current/libjava/classpath/native/fdlibm/sf_fabs.c  -fPIC 
-DPIC -o .libs/sf_fabs.o
/home/amodra/src/gcc-current/libjava/classpath/native/fdlibm/sf_fabs.c: In 
function 'fabsf':
/home/amodra/src/gcc-current/libjava/classpath/native/fdlibm/sf_fabs.c:33:1: 
internal compiler error: Segmentation fault

(insn 11 9 15 2 (parallel [
(set (subreg:SI (reg:SF 123 [  ]) 0)
(and:SI (subreg:SI (reg:SF 33 1 [ x ]) 0)
(const_int 2147483647 [0x7fff])))
(clobber (scratch:CC))
]) 
/home/amodra/src/gcc-current/libjava/classpath/native/fdlibm/sf_fabs.c:32 146 
{andsi3_mc}
 (expr_list:REG_DEAD (reg:SF 33 1 [ x ])
(nil)))

Reloads for insn # 11
Reload 0: GENERAL_REGS, RELOAD_FOR_INPUT_ADDRESS (opnum = 0), can't combine, 
secondary_reload_p
reload_reg_rtx: (reg:SF 8 8)
Reload 1: GENERAL_REGS, RELOAD_FOR_OUTPUT_ADDRESS (opnum = 0), can't combine, 
secondary_reload_p
reload_reg_rtx: (reg:SF 8 8)
Reload 2: reload_in (SF) = (reg:SF 33 1 [orig:123  ] [123])
reload_out (SF) = (reg:SF 33 1 [orig:123  ] [123])
NON_FLOAT_REGS, RELOAD_OTHER (opnum = 0)
reload_in_reg: (reg:SF 33 1 [orig:123  ] [123])
reload_out_reg: (reg:SF 33 1 [orig:123  ] [123])
reload_reg_rtx: (reg:SF 10 10)
secondary_in_reload = 0, secondary_out_reload = 1

Reload 3: reload_out (SI) = (subreg:SI (reg:SF 33 1 [orig:123  ] [123]) 
0)
GENERAL_REGS, RELOAD_FOR_OUTPUT (opnum = 0)
reload_out_reg: (subreg:SI (reg:SF 33 1 [orig:123  ] [123]) 0)
reload_reg_rtx: (reg:SI 9 9)
Reload 4: reload_in (SF) = (reg:SF 33 1 [ x ])
GENERAL_REGS, RELOAD_FOR_INPUT (opnum = 1)
reload_in_reg: (reg:SF 33 1 [ x ])
reload_reg_rtx: (reg:SF 8 8)

Notice reload 3 has a subreg for reload_out.  We wind up in the
gen_reload code shown below, and try to use REGNO on a subreg, leading
to

(gdb) p debug_rtx(out)
(reg:DI -166922608)

Or at least, that's what you get after fixing print_rtx to not
segfault..  Bootstrap and regression test powerpc-linux in progress.
OK to apply, assuming no regressions?

* reload1.c (gen_reload): Don't use REGNO on SUBREGs.
* print-rtl.c (print_rtx): Don't segfault on negative regno.

Note that s/REGNO (in_rtx)/value/ in print_rtx is reasonable given
that REG in rtl.def has a format of "i00", so XINT (in_rtx, i) is
always XINT (in_rtx, 0) for a reg, which is equivalent to REGNO apart
from signedness.

Index: gcc/reload1.c
===
--- gcc/reload1.c   (revision 180542)
+++ gcc/reload1.c   (working copy)
@@ -8588,28 +8588,28 @@ gen_reload (rtx out, rtx in, int opnum, 
   && reg_or_subregno (in) < FIRST_PSEUDO_REGISTER
   && (REG_P (out)
   || (GET_CODE (out) == SUBREG && REG_P (SUBREG_REG (out
   && reg_or_subregno (out) < FIRST_PSEUDO_REGISTER
   && SECONDARY_MEMORY_NEEDED (REGNO_REG_CLASS (reg_or_subregno (in)),
   REGNO_REG_CLASS (reg_or_subregno (out)),
   GET_MODE (out)))
 {
   /* Get the memory to use and rewrite both registers to its mode.  */
   rtx loc = get_secondary_mem (in, GET_MODE (out), opnum, type);
 
   if (GET_MODE (loc) != GET_MODE (out))
-   out = gen_rtx_REG (GET_MODE (loc), REGNO (out));
+   out = gen_rtx_REG (GET_MODE (loc), reg_or_subregno (out));
 
   if (GET_MODE (loc) != GET_MODE (in))
-   in = gen_rtx_REG (GET_MODE (loc), REGNO (in));
+   in = gen_rtx_REG (GET_MODE (loc), reg_or_subregno (in));
 
   gen_reload (loc, in, opnum, type);
   gen_reload (out, loc, opnum, type);
 }
 #endif
   else if (REG_P (out) && UNARY_P (in))
 {
   rtx insn;
   rtx op1;
   rtx out_moded;
   rtx set;
 
Index: gcc/print-rtl.c
===
--- gcc/print-rtl.c (revision 180542)
+++ gcc/print-rtl.c (working copy)
@@ -465,13 +465,12 @@ print_rtx (const_rtx in_rtx)
int value = XINT (in_rtx, i);
const char *name;
 
 #ifndef GENERATOR_FILE
-   if (REG_P (in_rtx) && value < FIRST_PSEUDO_REGISTER)
- fprintf (outfile, " %d %s", REGNO (in_rtx),
-  reg_names[REGNO (in_rtx)]);
+   if (REG_P (in_rtx)

[pph] Fix some cgraph node merge problems (issue5302068)

2011-10-26 Thread Diego Novillo


This patch fixes some of the problems we had with cgraph nodes for
merged functions.  When merging a function that had a cgraph node emitted 
for it, we were ICEing during cgraph allocation because the reader
was not checking whether the node already existed for that function.

Additionally, the patch removes the computation of hash values for
include branches.  That was only papering over hashing problems.
Instead, I made the hashing key more descriptive by using
decl_as_string and the mangled name.

We still have several merging problems.  The x4tmplfun* and z4tmplfun*
failures are different ICEs in template instantiation.  Still have not
looked what those are.

Tested on x86_64.  Committed.


Diego.


* cp-tree.h (get_mangled_id): Declare.
* mangle.c (get_mangled_id): Factor out of ...
(mangle_decl): ... here.
* pph-streamer-in.c
(pph_get_include_path_hash): Remove.  Update all users.
* pph-streamer-out.c (pph_merge_name): Move from pph-streamer.c.
Change return type to char *.
Call get_mangled_id and decl_as_string.
Update users.

testsuite/ChangeLog.pph

* g++.dg/pph/c4inline.cc: Mark fixed.
* g++.dg/pph/x4tmplclass1.cc: Likewise.
* g++.dg/pph/x4tmplclass2.cc: Likewise.
* g++.dg/pph/z4tmplclass1.cc: Likewise.
* g++.dg/pph/z4tmplclass2.cc: Likewise.
* g++.dg/pph/x4tmplfuncinln.cc: Change expected failure.
* g++.dg/pph/x4tmplfuncninl.cc: Likewise.
* g++.dg/pph/z4tmplfuncinln.cc: Likewise.
* g++.dg/pph/z4tmplfuncninl.cc: Likewise.

diff --git a/gcc/cp/cp-tree.h b/gcc/cp/cp-tree.h
index 705c0a6..297a779 100644
--- a/gcc/cp/cp-tree.h
+++ b/gcc/cp/cp-tree.h
@@ -5782,6 +5782,7 @@ extern tree merge_exception_specifiers(tree, 
tree, tree);
 /* in mangle.c */
 extern void init_mangle(void);
 extern void mangle_decl(tree);
+extern tree get_mangled_id (tree);
 extern const char *mangle_type_string  (tree);
 extern tree mangle_typeinfo_for_type   (tree);
 extern tree mangle_typeinfo_string_for_type(tree);
diff --git a/gcc/cp/mangle.c b/gcc/cp/mangle.c
index 55851e6..38a1fcb 100644
--- a/gcc/cp/mangle.c
+++ b/gcc/cp/mangle.c
@@ -3157,13 +3157,21 @@ mangle_decl_string (const tree decl)
   return result;
 }
 
+/* Return an identifier for the external mangled name of DECL.  */
+
+tree
+get_mangled_id (tree decl)
+{
+  tree id = mangle_decl_string (decl);
+  return targetm.mangle_decl_assembler_name (decl, id);
+}
+
 /* Create an identifier for the external mangled name of DECL.  */
 
 void
 mangle_decl (const tree decl)
 {
-  tree id = mangle_decl_string (decl);
-  id = targetm.mangle_decl_assembler_name (decl, id);
+  tree id = get_mangled_id (decl);
   SET_DECL_ASSEMBLER_NAME (decl, id);
 
   if (G.need_abi_warning)
diff --git a/gcc/cp/pph-streamer-in.c b/gcc/cp/pph-streamer-in.c
index 6c8bec4..6aa301f 100644
--- a/gcc/cp/pph-streamer-in.c
+++ b/gcc/cp/pph-streamer-in.c
@@ -67,7 +67,7 @@ static VEC(char_p,heap) *string_tables = NULL;
 static int pph_loc_offset;
 
 /* Forward declarations to avoid circularity.  */
-static tree pph_in_merge_key_tree (pph_stream *, tree *, hashval_t);
+static tree pph_in_merge_key_tree (pph_stream *, tree *);
 
 /* stream initialization */
 
@@ -690,18 +690,17 @@ pph_in_chain (pph_stream *stream)
 
 
 /* Read a chain of AST merge keys from STREAM.  Merge each tree
-   into *CHAIN.  IPATH_HASH is the hash value of the include path
-   from STREAM to the root of the include tree.  */
+   into *CHAIN.  */
 
 static void
-pph_in_merge_key_chain (pph_stream *stream, tree *chain, hashval_t ipath_hash)
+pph_in_merge_key_chain (pph_stream *stream, tree *chain)
 {
   unsigned i;
   HOST_WIDE_INT count;
 
   count = pph_in_hwi (stream);
   for (i = 0; i < count; i++)
-pph_in_merge_key_tree (stream, chain, ipath_hash);
+pph_in_merge_key_tree (stream, chain);
 }
 
 
@@ -740,13 +739,6 @@ typedef struct {
 
   /* Name of the tree (from pph_merge_name).  */
   const char *name;
-
-  /* Hash value representing the include path starting at the image
- where EXPR resides up to the root of the include tree.  Objects
- found in any of these PPH images do not need to be merged.  They
- were already emitted as external references and merged when
- the PPH images were being generated.  */
-  hashval_t ipath_hash;
 } merge_toc_entry;
 
 
@@ -767,11 +759,6 @@ htab_merge_key_eq (const void *p1, const void *p2)
   const merge_toc_entry *key1 = (const merge_toc_entry *) p1;
   const merge_toc_entry *key2 = (const merge_toc_entry *) p2;
 
-  /* Matches inside the same include path are not interesting.  These
- symbols have already been merged.  */
-  if (key1->ipath_hash == key2->ipath_hash)
-return false;
-
   if (key1->context != key2->context)
 return false;
 
@@ -831,

[PATCH] Add sparc fmaf test.

2011-10-26 Thread David Miller


Committed to trunk.

gcc/testsuite/

* gcc.target/sparc/fmaf-1.c: New test.

git-svn-id: svn+ssh://gcc.gnu.org/svn/gcc/trunk@180551 
138bc75d-0d04-0410-961f-82ee72b054a4
---
 gcc/testsuite/ChangeLog |2 +
 gcc/testsuite/gcc.target/sparc/fmaf-1.c |   51 +++
 2 files changed, 53 insertions(+), 0 deletions(-)
 create mode 100644 gcc/testsuite/gcc.target/sparc/fmaf-1.c

diff --git a/gcc/testsuite/ChangeLog b/gcc/testsuite/ChangeLog
index 2879dfe..6078fdd 100644
--- a/gcc/testsuite/ChangeLog
+++ b/gcc/testsuite/ChangeLog
@@ -1,5 +1,7 @@
 2011-10-26  David S. Miller  
 
+   * gcc.target/sparc/fmaf-1.c: New test.
+
* gcc.target/sparc/setcc-1.c: New test.
* gcc.target/sparc/setcc-2.c: New test.
 
diff --git a/gcc/testsuite/gcc.target/sparc/fmaf-1.c 
b/gcc/testsuite/gcc.target/sparc/fmaf-1.c
new file mode 100644
index 000..948b926
--- /dev/null
+++ b/gcc/testsuite/gcc.target/sparc/fmaf-1.c
@@ -0,0 +1,51 @@
+/* { dg-do compile } */
+/* { dg-options "-O2 -mfmaf" } */
+
+float fmadds (float a, float b, float c)
+{
+  return a * b + c;
+}
+
+float fmsubs (float a, float b, float c)
+{
+  return a * b - c;
+}
+
+float fnmadds (float a, float b, float c)
+{
+  return -(a * b + c);
+}
+
+float fnmsubs (float a, float b, float c)
+{
+  return -(a * b - c);
+}
+
+double fmaddd (double a, double b, double c)
+{
+  return a * b + c;
+}
+
+double fmsubd (double a, double b, double c)
+{
+  return a * b - c;
+}
+
+double fnmaddd (double a, double b, double c)
+{
+  return -(a * b + c);
+}
+
+double fnmsubd (double a, double b, double c)
+{
+  return -(a * b - c);
+}
+
+/* { dg-final { scan-assembler "fmadds\t%" } } */
+/* { dg-final { scan-assembler "fmsubs\t%" } } */
+/* { dg-final { scan-assembler "fnmadds\t%" } } */
+/* { dg-final { scan-assembler "fnmsubs\t%" } } */
+/* { dg-final { scan-assembler "fmaddd\t%" } } */
+/* { dg-final { scan-assembler "fmsubd\t%" } } */
+/* { dg-final { scan-assembler "fnmaddd\t%" } } */
+/* { dg-final { scan-assembler "fnmsubd\t%" } } */
-- 
1.7.6.401.g6a319

[PATCH] Improve sparc setcc generation and add testcases.

2011-10-26 Thread David Miller


This makes sure the addx/subx sequence is used even when v9 conditional
moves are available.

It also shows the compiler that the setcc instructions emitting
-1, 0, or 1 all extend to DImode.

I left all the "*x_{plus,minus}_foo" and related patterns alone for
now, I'll get to those and add appropriate test cases as I find time.

Committed to trunk.

gcc/

* config/sparc/sparc.c (emit_scc_insn): Do not try v9 sequences until
LEU/LTU/GEU/GTU is attempted.
* config/sparc/sparc.md (*neg_snesi_sign_extend): New 64-bit insn
and split.
(*neg_seqsi_sign_extend): Likewise.
(*sltu_extend_sp64, *neg_sltu_extend_sp64, *sgeu_extend_sp64,
*neg_sgeu_extend_sp64): New insns.

gcc/testsuite/

* gcc.target/sparc/setcc-1.c: New test.
* gcc.target/sparc/setcc-2.c: New test.

git-svn-id: svn+ssh://gcc.gnu.org/svn/gcc/trunk@180550 
138bc75d-0d04-0410-961f-82ee72b054a4
---
 gcc/ChangeLog|8 
 gcc/config/sparc/sparc.c |   14 +++
 gcc/config/sparc/sparc.md|   59 ++
 gcc/testsuite/ChangeLog  |5 +++
 gcc/testsuite/gcc.target/sparc/setcc-1.c |   39 
 gcc/testsuite/gcc.target/sparc/setcc-2.c |   39 
 6 files changed, 156 insertions(+), 8 deletions(-)
 create mode 100644 gcc/testsuite/gcc.target/sparc/setcc-1.c
 create mode 100644 gcc/testsuite/gcc.target/sparc/setcc-2.c

diff --git a/gcc/ChangeLog b/gcc/ChangeLog
index d6d1382..3a51510 100644
--- a/gcc/ChangeLog
+++ b/gcc/ChangeLog
@@ -1,5 +1,13 @@
 2011-10-26  David S. Miller  
 
+   * config/sparc/sparc.c (emit_scc_insn): Do not try v9 sequences until
+   LEU/LTU/GEU/GTU is attempted.
+   * config/sparc/sparc.md (*neg_snesi_sign_extend): New 64-bit insn
+   and split.
+   (*neg_seqsi_sign_extend): Likewise.
+   (*sltu_extend_sp64, *neg_sltu_extend_sp64, *sgeu_extend_sp64,
+   *neg_sgeu_extend_sp64): New insns.
+
* config/sparc/sparc-protos.h (sparc_expand_conditional_move): Declare.
* config/sparc/sparc.md (movcc, movcc): Call it.
(*mov_cc_v9): Normalize to expect operand 0 always in operand 4.
diff --git a/gcc/config/sparc/sparc.c b/gcc/config/sparc/sparc.c
index 964bcaf..80e05a6 100644
--- a/gcc/config/sparc/sparc.c
+++ b/gcc/config/sparc/sparc.c
@@ -2541,14 +2541,6 @@ emit_scc_insn (rtx operands[])
 }
 }
 
-  /* For the rest, on v9 we can use conditional moves.  */
-
-  if (TARGET_V9)
-{
-  if (gen_v9_scc (operands[0], code, x, y))
-return true;
-}
-
   /* We can do LTU and GEU using the addx/subx instructions too.  And
  for GTU/LEU, if both operands are registers swap them and fall
  back to the easy case.  */
@@ -2573,6 +2565,12 @@ emit_scc_insn (rtx operands[])
   return true;
 }
 
+  /* All the posibilities to use addx/subx based sequences has been
+ exhausted, try for a 3 instruction sequence using v9 conditional
+ moves.  */
+  if (TARGET_V9 && gen_v9_scc (operands[0], code, x, y))
+return true;
+
   /* Nope, do branches.  */
   return false;
 }
diff --git a/gcc/config/sparc/sparc.md b/gcc/config/sparc/sparc.md
index 2bae28e..2b4b2bb 100644
--- a/gcc/config/sparc/sparc.md
+++ b/gcc/config/sparc/sparc.md
@@ -713,6 +713,22 @@
   ""
   [(set_attr "length" "2")])
 
+(define_insn_and_split "*neg_snesi_sign_extend"
+  [(set (match_operand:DI 0 "register_operand" "=r")
+(neg:DI (ne:DI (match_operand:SI 1 "register_operand" "r")
+  (const_int 0
+   (clobber (reg:CC CC_REG))]
+  "TARGET_ARCH64"
+  "#"
+  "&& 1"
+  [(set (reg:CC_NOOV CC_REG) (compare:CC_NOOV (minus:SI (const_int 0)
+ (match_dup 1))
+   (const_int 0)))
+   (set (match_dup 0) (sign_extend:DI (neg:SI (ltu:SI (reg:CC CC_REG)
+  (const_int 0)]
+  ""
+  [(set_attr "length" "2")])
+
 (define_insn_and_split "*snedi_zero"
   [(set (match_operand:DI 0 "register_operand" "=&r")
 (ne:DI (match_operand:DI 1 "register_operand" "r")
@@ -804,6 +820,21 @@
   ""
   [(set_attr "length" "2")])
 
+(define_insn_and_split "*neg_seqsi_sign_extend"
+  [(set (match_operand:DI 0 "register_operand" "=r")
+   (neg:DI (eq:DI (match_operand:SI 1 "register_operand" "r")
+  (const_int 0
+   (clobber (reg:CC CC_REG))]
+  "TARGET_ARCH64"
+  "#"
+  "&& 1"
+  [(set (reg:CC_NOOV CC_REG) (compare:CC_NOOV (neg:SI (match_dup 1))
+  (const_int 0)))
+   (set (match_dup 0) (sign_extend:DI (neg:SI (geu:SI (reg:CC CC_REG)
+  (const_int 0)]
+  ""
+  [(set_attr "length" "2")])
+
 (define_insn_and_split "*seqdi_zero"
   [(set (match_operand:DI 0 "register_operand" "=&r")
 (eq:DI (match_operand:DI 1 "register_ope

Re: libstdc++/50862 fix deadlock in condition_variable_any

2011-10-26 Thread Jonathan Wakely

PR libstdc++/50862
* include/std/condition_variable (condition_variable_any::wait): Avoid
terminating if relocking user mutex throws during stack-unwinding.
* testsuite/30_threads/condition_variable_any/50862.cc: Add dg-require.

Fixes two more issues pointed out in the PR comments.

Tested x86_64-linux, committed to trunk.
Index: include/std/condition_variable
===
--- include/std/condition_variable  (revision 180456)
+++ include/std/condition_variable  (working copy)
@@ -205,7 +205,13 @@
// scoped unlock - unlocks in ctor, re-locks in dtor
struct _Unlock {
  explicit _Unlock(_Lock& __lk) : _M_lock(__lk) { __lk.unlock(); }
- ~_Unlock() { _M_lock.lock(); }
+ ~_Unlock() noexcept(false)
+ {
+   if (uncaught_exception())
+ __try { _M_lock.lock(); } __catch(...) { }
+   else
+ _M_lock.lock();
+ }
  _Lock& _M_lock;
};
 
Index: testsuite/30_threads/condition_variable_any/50862.cc
===
--- testsuite/30_threads/condition_variable_any/50862.cc(revision 
180456)
+++ testsuite/30_threads/condition_variable_any/50862.cc(working copy)
@@ -4,6 +4,7 @@
 // { dg-options " -std=gnu++0x " { target *-*-cygwin *-*-darwin* } }
 // { dg-require-cstdint "" }
 // { dg-require-gthreads "" }
+// { dg-require-sched-yield "" }
  
 // Copyright (C) 2011 Free Software Foundation, Inc.
 //

[PATCH, devirtualization] Detect the new type in type change detection

2011-10-26 Thread Martin Jambor

Hi,

I've been asked by Maxim Kuvyrkov to revive the following patch which
has not made it to 4.6.  Currently, when type based devirtualization
detects a potential type change, it simply gives up on gathering any
information on the object in question.  This patch adds an attempt to
actually detect the new type after the change.

Maxim claimed this (and another patch I'll post tomorrow) noticeably
improved performance of some real code.  I can only offer a rather
artificial example in the attachment.  When the constructors are
inlined but the function multiply_matrices is not, this patch makes
the produced executable run for only 7 seconds instead of about 20 on
my 4 year old i686 desktop (with -Ofast).

Anyway, the patch passes bootstrap and testsuite on x86_64-linux.
What do you think, is it a good idea for trunk now?

Thanks,

Martin


2011-10-21  Martin Jambor  

* ipa-prop.c (type_change_info): New fields object, known_current_type
and multiple_types_encountered.
(extr_type_from_vtbl_ptr_store): New function.
(check_stmt_for_type_change): Use it, set multiple_types_encountered if
the result is different from the previous one.
(detect_type_change): Renamed to detect_type_change_1. New parameter
comp_type.  Set up new fields in tci, build known type jump
functions if the new type can be identified.
(detect_type_change): New function.
* tree.h (DECL_CONTEXT): Comment new use.

* testsuite/g++.dg/ipa/devirt-c-1.C: Add dump scans.
* testsuite/g++.dg/ipa/devirt-c-2.C: Likewise.
* testsuite/g++.dg/ipa/devirt-c-7.C: New test.


Index: src/gcc/ipa-prop.c
===
--- src.orig/gcc/ipa-prop.c
+++ src/gcc/ipa-prop.c
@@ -271,8 +271,17 @@ ipa_print_all_jump_functions (FILE *f)
 
 struct type_change_info
 {
+  /* The declaration or SSA_NAME pointer of the base that we are checking for
+ type change.  */
+  tree object;
+  /* If we actually can tell the type that the object has changed to, it is
+ stored in this field.  Otherwise it remains NULL_TREE.  */
+  tree known_current_type;
   /* Set to true if dynamic type change has been detected.  */
   bool type_maybe_changed;
+  /* Set to true if multiple types have been encountered.  known_current_type
+ must be disregarded in that case.  */
+  bool multiple_types_encountered;
 };
 
 /* Return true if STMT can modify a virtual method table pointer.
@@ -338,6 +347,49 @@ stmt_may_be_vtbl_ptr_store (gimple stmt)
   return true;
 }
 
+/* If STMT can be proved to be an assignment to the virtual method table
+   pointer of ANALYZED_OBJ and the type associated with the new table
+   identified, return the type.  Otherwise return NULL_TREE.  */
+
+static tree
+extr_type_from_vtbl_ptr_store (gimple stmt, tree analyzed_obj)
+{
+  tree lhs, t, obj;
+
+  if (!is_gimple_assign (stmt))
+return NULL_TREE;
+
+  lhs = gimple_assign_lhs (stmt);
+
+  if (TREE_CODE (lhs) != COMPONENT_REF)
+return NULL_TREE;
+  obj = lhs;
+
+  if (!DECL_VIRTUAL_P (TREE_OPERAND (lhs, 1)))
+return NULL_TREE;
+
+  do
+{
+  obj = TREE_OPERAND (obj, 0);
+}
+  while (TREE_CODE (obj) == COMPONENT_REF);
+  if (TREE_CODE (obj) == MEM_REF)
+obj = TREE_OPERAND (obj, 0);
+  if (TREE_CODE (obj) == ADDR_EXPR)
+obj = TREE_OPERAND (obj, 0);
+  if (obj != analyzed_obj)
+return NULL_TREE;
+
+  t = gimple_assign_rhs1 (stmt);
+  if (TREE_CODE (t) != ADDR_EXPR)
+return NULL_TREE;
+  t = get_base_address (TREE_OPERAND (t, 0));
+  if (!t || TREE_CODE (t) != VAR_DECL || !DECL_VIRTUAL_P (t))
+return NULL_TREE;
+
+  return DECL_CONTEXT (t);
+}
+
 /* Callback of walk_aliased_vdefs and a helper function for
detect_type_change to check whether a particular statement may modify
the virtual table pointer, and if possible also determine the new type of
@@ -352,6 +404,12 @@ check_stmt_for_type_change (ao_ref *ao A
 
   if (stmt_may_be_vtbl_ptr_store (stmt))
 {
+  tree type;
+  type = extr_type_from_vtbl_ptr_store (stmt, tci->object);
+  if (tci->type_maybe_changed
+ && type != tci->known_current_type)
+   tci->multiple_types_encountered = true;
+  tci->known_current_type = type;
   tci->type_maybe_changed = true;
   return true;
 }
@@ -359,19 +417,19 @@ check_stmt_for_type_change (ao_ref *ao A
 return false;
 }
 
-/* Detect whether the dynamic type of ARG has changed (before callsite CALL) by
-   looking for assignments to its virtual table pointer.  If it is, return true
-   and fill in the jump function JFUNC with relevant type information or set it
-   to unknown.  ARG is the object itself (not a pointer to it, unless
-   dereferenced).  BASE is the base of the memory access as returned by
-   get_ref_base_and_extent, as is the offset.  */
+
+
+/* Like detect_type_change but with extra argument COMP_TYPE which will become
+   the component type part of new JFUNC of d

Re: [C++ Patch / RFC] PR 50870

2011-10-26 Thread Paolo Carlini

.. maybe my message wasn't clear, sorry, I'm a bit tired (here it's 
late): I meant to say that the non_reference tweak fixes the 
non-template impl class case, but something more is needed for a 
template impl (thus the new testcase). And, additionally, this issue is 
a [4.6/4.7 Regression], thus, post 4.6.2, we may be interested in back 
porting something.


Let me know if you need additional details, or whatelse. Or we want me 
to look into a different way to attack the template case.


Paolo.

Re: [C++ Patch / RFC] PR 50870

2011-10-26 Thread Paolo Carlini


On 10/27/2011 12:48 AM, Jason Merrill wrote:

On 10/26/2011 06:20 PM, Paolo Carlini wrote:

I re-opened this one because: 1- We may want to fix it in 4_6-branch
too, it's a regression there too; 2- We are still handling incorrectly
the template impl case. For the latter a variant of my old idea still
works, fwiw.



-object_type = TREE_TYPE (object);
+object_type = (TREE_CODE (object) == ARROW_EXPR
+   ? TREE_OPERAND (object, 0) : TREE_TYPE (object));

This is still wrong.  Why not use the same patch for 4.6 as for trunk?
I don't understand, sorry: both mainline and 4_6-branch do not handle 
correctly both testcases, template impl and not. If you want to look 
into it personally, I can unassign myself, no problem, really.


Paolo.

Re: [C++ Patch / RFC] PR 50870

2011-10-26 Thread Jason Merrill


On 10/26/2011 06:20 PM, Paolo Carlini wrote:

I re-opened this one because: 1- We may want to fix it in 4_6-branch
too, it's a regression there too; 2- We are still handling incorrectly
the template impl case. For the latter a variant of my old idea still
works, fwiw.



-   object_type = TREE_TYPE (object);
+   object_type = (TREE_CODE (object) == ARROW_EXPR
+  ? TREE_OPERAND (object, 0) : TREE_TYPE (object));


This is still wrong.  Why not use the same patch for 4.6 as for trunk?

Jason

Re: PR c++/30195

2011-10-26 Thread Jason Merrill


On 10/24/2011 05:53 PM, Fabien Chêne wrote:

After looking into it a bit more, I now see things differently. The
target_decl is in fact put into the CLASSTYPE_METHOD_VEC at
instantiation time, via handle_using_decl, with the correct access.


Great.


The problem is that while parsing the template (before instantiating),
it is not yet there and an error is immediately raised.


Ah, I see.  We need lookup_member to find the USING_DECL, but if we're 
skipping function USING_DECLs in lookup_field_1 and they aren't in 
CLASSTYPE_METHOD_VEC in templates, we don't find it either way.



I only see three solutions:
1) perform an additional lookup for using declarations that designate
functions in lookup_fnfields_slot to find something even if the
CLASSTYPE_METHOD_VEC is empty (what I implemented in the previous
patch)
2) inhibate the lookup error at some point.
3) put the decl into the CLASSTYPE_METHOD_VEC before instantiation



Honestly, I don't know if 2) is realistic, and how to implement it as well ...
3) seems complicated: in finish_member_declaration, we must put away
the decl into TYPE_FIELDS or TYPE_METHODS, but we would like to put
the target_decl into TYPE_METHODS (and call add_method), and at the
same time put its  using decl into TYPE_FIELDS... I don't know if it
can theoretically work, but having try it, I think it would need many
adjustments.


I think #3 is the way to go.  Isn't that what happens in non-templates? 
 We don't try to instantiate the CLASSTYPE_METHOD_VEC, we build up a 
whole new one from the decl list, so having it in both places shouldn't 
be a problem for instantiation.


Jason

Re: [C++ Patch / RFC] PR 50870

2011-10-26 Thread Paolo Carlini


... forgot the testcase, sorry.

Paolo.

//
// PR c++/50870
// { dg-options "-std=gnu++0x" }

template 
  struct impl
  {
template  static T create();
  };

template ::template create()
		 -> impl::template create())>
struct tester { };

tester*, int, float> ti;

template ::template create()
		 -> impl::template create())>
int test() { return 0; }

int i = test*, int, float>();

Re: Fix gcc.dg/tls/thr-cse-1.c for MinGW target

2011-10-26 Thread Joseph S. Myers

On Wed, 26 Oct 2011, Kai Tietz wrote:

> Hi,
> 
> patch looks fine for mingw 32-bit.  With a small nit new test passes
> for 64-bit mingw version, too.  The cygwin-part looks like the mingw
> one.

I've committed my original patch as what I can test - you may wish to 
commit followups after testing them on other Windows targets (and indeed 
more generally look at whatever Windows-specific test failures there are 
that I haven't fixed).

-- 
Joseph S. Myers
jos...@codesourcery.com

Re: [C++ Patch / RFC] PR 50870

2011-10-26 Thread Paolo Carlini


On 10/26/2011 06:27 PM, Jason Merrill wrote:

OK.
I re-opened this one because: 1- We may want to fix it in 4_6-branch 
too, it's a regression there too; 2- We are still handling incorrectly 
the template impl case. For the latter a variant of my old idea still 
works, fwiw.


Thanks,
Paolo.

/
Index: pt.c
===
--- pt.c(revision 180532)
+++ pt.c(working copy)
@@ -13706,12 +13706,13 @@ tsubst_copy_and_build (tree t,
/* Remember that there was a reference to this entity.  */
if (DECL_P (object))
  mark_used (object);
-   object_type = TREE_TYPE (object);
+   object_type = (TREE_CODE (object) == ARROW_EXPR
+  ? TREE_OPERAND (object, 0) : TREE_TYPE (object));
 
member = TREE_OPERAND (t, 1);
if (BASELINK_P (member))
  member = tsubst_baselink (member,
-   non_reference (TREE_TYPE (object)),
+   non_reference (object_type),
args, complain, in_decl);
else
  member = tsubst_copy (member, args, complain, in_decl);

Re: Fix gcc.dg/graphite/run-id-1.c for Windows targets

2011-10-26 Thread Mike Stump

On Oct 26, 2011, at 2:17 PM, Joseph S. Myers wrote:
> The test gcc.dg/graphite/run-id-1.c requires more stack space than
> Windows targets provide by default.  This patch arranges for the
> appropriate -Wl,--stack option (as in config/mh-mingw and
> config/mh-cygwin) to be used by this test on those targets.  Tested
> with cross to i686-mingw32.  OK to commit?

Yeah, I noticed these testcases (there were quite a few that hit me) on my port 
with a small stack.  I fixed it by increasing my stack space for testing, so 
that I didn't have to go hit up testcase after testcase.  I'm wondering if you 
would get better milage by just having this on the CFLAGS bits in dejagnu 
someplace, then at least, you'd never have to do this again.

I'm fine with either fix you want to put in.

Re: [PATCH][RFC] Re-write LTO option merging

2011-10-26 Thread Mike Stump

On Oct 26, 2011, at 6:10 AM, Richard Guenther wrote:
> This completely rewrites LTO option merging.

> Any comments?

Wondering if It breaks darwin?  
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=50876

We have:

   %{Zmultiply_defined*:-multiply_defined %*} 

in LINK_SPEC.

and:

; Various linker options have a -Z added so that they can get to specs
; processing without interference.  Note that an option name with a
; prefix that matches another option name, that also takes an
; argument, being mapped to a -Z linker option, needs to be modified
; so the prefix is different, otherwise a '*' after the shorter option
; will match with the longer one.

multiply_defined
Driver RejectNegative Separate Alias(Zmultiply_defined)

in the .opt file.  We may well be the only port that does these sorts of 
tricks.  Essentially, we want to allow linker arguments on the command line, 
and have them never hit any machine independent option processing code, because 
they match (prefix match).  We accomplish this this by hiding them early, doing 
all the normal processing, and then way late, just before link, turning them 
back into the normal names.

Re: Fix gcc.dg/graphite/run-id-1.c for Windows targets

2011-10-26 Thread Kai Tietz

2011/10/26 Joseph S. Myers :
> The test gcc.dg/graphite/run-id-1.c requires more stack space than
> Windows targets provide by default.  This patch arranges for the
> appropriate -Wl,--stack option (as in config/mh-mingw and
> config/mh-cygwin) to be used by this test on those targets.  Tested
> with cross to i686-mingw32.  OK to commit?
>
> 2011-10-26  Joseph Myers  
>
>        * gcc.dg/graphite/run-id-1.c: Pass -Wl,--stack,12582912 option for
>        Windows targets.
>
> Index: gcc.dg/graphite/run-id-1.c
> ===
> --- gcc.dg/graphite/run-id-1.c  (revision 180200)
> +++ gcc.dg/graphite/run-id-1.c  (working copy)
> @@ -1,3 +1,4 @@
> +/* { dg-options "-Wl,--stack,12582912" { target *-*-mingw* *-*-cygwin* } } */
>  /* { dg-require-effective-target size32plus } */
>
>  void abort (void);
>
> --
> Joseph S. Myers
> jos...@codesourcery.com

From my side patch is ok.

Thanks,
Kai

Re: Fix gcc.dg/tls/thr-cse-1.c for MinGW target

2011-10-26 Thread Kai Tietz

Hi,

patch looks fine for mingw 32-bit.  With a small nit new test passes
for 64-bit mingw version, too.  The cygwin-part looks like the mingw
one.

2011/10/26 Joseph S. Myers :
> gcc.dg/tls/thr-cse-1.c tests that there are not two calls to
> __emutls_get_address in the output.  Normally this just checks for
> emutls_get_address.*emutls_get_address, but on some targets where the
> compiler output declares the functions called in some way the testcase
> has more specific tests.  i?86-*-mingw* produces a .def directive for
> ___emutls_get_address so also needs such a special case, which this
> patch adds (only for i?86-*-mingw* as I don't know what the output
> looks like for other Windows targets).  Tested with cross to
> i686-mingw32.  OK to commit?
>
> 2011-10-26  Joseph Myers  
>
>        * gcc.dg/tls/thr-cse-1.c: For i?86-*-mingw*, check for multiple
>        calls.
>
> Index: gcc.dg/tls/thr-cse-1.c
> ===
> --- gcc.dg/tls/thr-cse-1.c      (revision 180200)
> +++ gcc.dg/tls/thr-cse-1.c      (working copy)
> @@ -15,9 +15,10 @@
>   return a;
>  }
>
> -/* { dg-final { scan-assembler-not 
> "emutls_get_address.*emutls_get_address.*" { target { ! { "*-wrs-vxworks"  
> "*-*-darwin8"  "hppa*-*-hpux*" "spu-*-*" } } } } } */
> +/* { dg-final { scan-assembler-not 
> "emutls_get_address.*emutls_get_address.*" { target { ! { "*-wrs-vxworks"  
> "*-*-darwin8"  "hppa*-*-hpux*" "spu-*-*" "i?86-*-mingw*" } } } } } */
>  /* { dg-final { scan-assembler-not 
> "call\tL___emutls_get_address.stub.*call\tL___emutls_get_address.stub.*" { 
> target "*-*-darwin8" } } } */
>  /* { dg-final { scan-assembler-not "(b,l|bl) __emutls_get_address.*(b,l|bl) 
> __emutls_get_address.*" { target "hppa*-*-hpux*" } } } */
>  /* { dg-final { scan-assembler-not 
> "(brsl|brasl)\t__emutls_get_address.*(brsl|brasl)\t__emutls_get_address.*" { 
> target spu-*-* } } } */
>  /* { dg-final { scan-assembler-not "tls_lookup.*tls_lookup.*" { target 
> *-wrs-vxworks } } } */

+/* { dg-final { scan-assembler-not
"call\t___emutls_get_address.*call\t___emutls_get_address" { target {
i?86-*-mingw* i?86-*-cygwin* } } } } */
+/* { dg-final { scan-assembler-not
"call\t__emutls_get_address.*call\t__emutls_get_address" { target
"x86_64-*-mingw*" } } } */

With this change test should pass for all IA mingw-targets and for cygwin, too.

Thanks,
Kai

[RFC PATCH] Gather vectorization (PR tree-optimization/50789)

2011-10-26 Thread Jakub Jelinek

Hi!

This patch implements gather vectorization with -mavx2, if
dr_may_alias (which apparently doesn't use tbaa :(( ) can figure out
there is no overlap with stores in the loop (if any).
The testcases show what is possible to get vectorized.

I chose to add 4 extra (internal only) gather builtins in addition to the
16 ones needed for the intrinsics, because the builtins using different
sizes of the index vs. src/mask/ret vectors would complicate the generic
code way too much (we don't have a VEC_SELECT_EXPR nor VEC_CONCAT_EXPR
and interleaving/extract even/odd is undesirable here).
With these 4 extra builtins the generic code always sees same sized
src/mask/ret vs. index vectors, either they have same number of units,
then just one vgather* insn is needed, or the index has more elements
(int index and double/long long load) - then for one loaded index vector
there is one vgather* insn using the first half of the index vector and
one using the second half of that vector, or long index with float/int
load, then two index vectors are processed by two vgather* insns and
the result gets concatenated first halves of both results.

All this is so far unconditional only, we'd need some tree representation
of conditional loads resp. conditional stores (and could already with AVX
use vmaskmov* insns for that).

Bootstrapped/regtested on x86_64-linux and i686-linux, testcases tested
also under sde.  Ok for trunk?

2011-10-26  Jakub Jelinek  

PR tree-optimization/50789
* tree-vect-stmts.c (process_use): Add force argument, avoid
exist_non_indexing_operands_for_use_p check if true.
(vect_mark_stmts_to_be_vectorized): Adjust callers.  Handle
STMT_VINFO_GATHER_P.
(gen_perm_mask): New function.
(perm_mask_for_reverse): Use it.
(reverse_vec_element): Rename to...
(permute_vec_elements): ... this.  Add Y and MASK_VEC arguments,
generalize for any permutations.
(vectorizable_load): Adjust caller.  Handle STMT_VINFO_GATHER_P.
* target.def (TARGET_VECTORIZE_BUILTIN_GATHER): New hook.
* doc/tm.texi.in (TARGET_VECTORIZE_BUILTIN_GATHER): Document it.
* doc/tm.texi: Regenerate.
* tree-data-ref.c (initialize_data_dependence_relation,
compute_self_dependence): No longer static.
* tree-data-ref.h (initialize_data_dependence_relation,
compute_self_dependence): New prototypes.
* tree-vect-data-refs.c (vect_check_gather): New function.
(vect_analyze_data_refs): Detect possible gather load data
refs.
* tree-vectorizer.h (struct _stmt_vec_info): Add gather_p field.
(STMT_VINFO_GATHER_P): Define.
(vect_check_gather): New prototype.
* config/i386/i386-builtin-types.def: Add types for alternate
gather builtins.
* config/i386/sse.md (AVXMODE48P_DI): Remove.
(VEC_GATHER_MODE): Rename mode_attr to...
(VEC_GATHER_IDXSI): ... this.
(VEC_GATHER_IDXDI, VEC_GATHER_SRCDI): New mode_attrs.
(avx2_gathersi, *avx2_gathersi): Use 
instead of .
(avx2_gatherdi): Use  instead of
< and  instead of VEC_GATHER_MODE
on src and mask operands.
(*avx2_gatherdi): Likewise.  Use VEC_GATHER_MODE iterator
instead of AVXMODE48P_DI.
(avx2_gatherdi256, *avx2_gatherdi256): Removed.
* config/i386/i386.c (enum ix86_builtins): Add
IX86_BUILTIN_GATHERALTSIV4DF, IX86_BUILTIN_GATHERALTDIV8SF,
IX86_BUILTIN_GATHERALTSIV4DI and IX86_BUILTIN_GATHERALTDIV8SI.
(ix86_init_mmx_sse_builtins): Create those builtins.
(ix86_expand_builtin): Handle those builtins and adjust expansions
of other gather builtins.
(ix86_vectorize_builtin_gather): New function.
(TARGET_VECTORIZE_BUILTIN_GATHER): Define.

* gcc.target/i386/avx2-gather-1.c: New test.
* gcc.target/i386/avx2-gather-2.c: New test.
* gcc.target/i386/avx2-gather-3.c: New test.

--- gcc/tree-vect-stmts.c.jj2011-10-26 14:19:11.0 +0200
+++ gcc/tree-vect-stmts.c   2011-10-26 16:54:23.0 +0200
@@ -332,6 +332,8 @@ exist_non_indexing_operands_for_use_p (t
- LIVE_P, RELEVANT - enum values to be set in the STMT_VINFO of the stmt
  that defined USE.  This is done by calling mark_relevant and passing it
  the WORKLIST (to add DEF_STMT to the WORKLIST in case it is relevant).
+   - FORCE is true if exist_non_indexing_operands_for_use_p check shouldn't
+ be performed.
 
Outputs:
Generally, LIVE_P and RELEVANT are used to define the liveness and
@@ -351,7 +353,8 @@ exist_non_indexing_operands_for_use_p (t
 
 static bool
 process_use (gimple stmt, tree use, loop_vec_info loop_vinfo, bool live_p,
-enum vect_relevant relevant, VEC(gimple,heap) **worklist)
+enum vect_relevant relevant, VEC(gimple,heap) **worklist,
+bool force)
 {
   struct loop *loop = LOOP_VINFO_LOOP (loop_vinfo);
   st

Re: [C++ Patch / RFC] PR 50864

2011-10-26 Thread Paolo Carlini


On 10/26/2011 10:35 PM, Paolo Carlini wrote:

On 10/26/2011 10:30 PM, Paolo Carlini wrote:

Hi,
At a glance, it looks too early; it's valid to have 
namespace-qualified names after ->.


namespace A
{
  struct B
  {
int i;
  };
};

A::B* b;
int i = b->A::B::i;
I was also trying to construct such kind of example myself... but my 
patch does not regress on the testcase you wrote down. I can tell you 
exactly why, if you like..
We have that parser->scope is a RECORD_TYPE and postfix_expression is 
an INDIRECT_REF.

In this case, for example (like PR50870):

namespace impl
{
  struct inner
  {
template  T create();
  };
}

template () -> impl::inner::create())>
struct foo;

we are also Ok, code is accepted, because name is a BASELINK and the new 
check isn't even reached (postfix_expression would be an ARROW_EXPR, but 
parser->scope again a RECORD_TYPE. More generally, in all the legal 
tests I tried by hand (outside the testsuite), when we get there 
parser->scope is always a RECORD_TYPE)


But if you feel more comfortable about performing the check elsewhere, I 
can try that of course.


Paolo.

Re: Skip gcc.target/i386/pad-4.c for Windows targets

2011-10-26 Thread Mike Stump

On Oct 26, 2011, at 2:18 PM, Joseph S. Myers wrote:
> gcc.target/i386/pad-4.c expects particular results from PIC code
> generation - PIC in the sense of a GOT and a PLT, which is different
> from the sense in which Windows code is always PIC.  This patch
> arranges for it to be skipped on Windows targets, similarly to other
> tests such as gcc.target/i386/pic-1.c.  Tested with cross to
> i686-mingw32.  OK to commit?

Ok.  This one makes my head hurt, just a little.

Re: Fix gcc.target/i386/pr37843-4.c for targets using leading underscores

2011-10-26 Thread Mike Stump

On Oct 26, 2011, at 2:18 PM, Joseph S. Myers wrote:
> gcc.target/i386/pr37843-4.c tests for the form calls to a function
> "foo" take in the output.  On Windows targets this function has a
> leading underscore and so the test fails.

> OK to commit?

Ok.

[PATCH] Canonicalize sparc movcc patterns such that operand 0 always appears in operand 4.

2011-10-26 Thread David Miller


The background is that I was putting together some test cases for
gcc.target/sparc that make sure the most optimal setcc sequences
are being generated.

When v9, and thus conditional moves, are enabled we sometimes fall
back to conditional moves even when the addx/subx sequence is more
efficient.  For example, we can do "a = x < y" in 2 instructions when
x and y are both unsigned:

cmp y, x
subx%g0, -1, a

but with v9 enabled we get:

cmp x, y
mov 0, a
movleu  %icc, 1, a

The reason is because of the permissive way in which we define the RTL
for conditional moves.

For the integer moves we mark both operands 3 and 4 using predicate
"arith*_operand" which accepts constants as well as registers.  The
constraints do in fact express how one of the two operands must be
equal the operand 0. However combine doesn't look at constraints to
determine if a combined instruction is recognized.

So combine thinks that insns of the form:

(set op0 (if_then_else (CMP X Y)
   (const_int 0)
   (const_int 1)))

are ok.  Reload later fixes things up by reloading one of the two
constants into operand 0, and that's how we end up with the above 3
instruction sequence.

Initially I tried to fix up the conditional move patterns by using a
"match_dup 0" for operand 4 to show the combiner, in the RTL, that
operand 4 must be equal to operand 0.

But I learned that this is not legal RTL.  You can only use match_dup
in instructions after the first instruction of an expansion that
initially makes use of the operand.  Surprisingly this worked most of
the time. The failures I got were in a regrename def/use chain
assertion check when loop unrolling was enabled.

Anyways, instead what I do here is normalize all expansions of
conditional moves to be of the form:

(set op0 (if_then_else (cmp X Y)
   op3
   op0))

and in the instruction patterns I use "register_operand" and
constraint "0" for operand 4.

This is enough to keep the combiner from recognizing sequences
like the above.

Committed to trunk.

gcc/

* config/sparc/sparc-protos.h (sparc_expand_conditional_move): Declare.
* config/sparc/sparc.md (movcc, movcc): Call it.
(*mov_cc_v9): Normalize to expect operand 0 always in operand 4.
(*mov_cc_reg_sp64): Likewise.
(*movsf_cc_v9): Likewise.
(*movsf_cc_reg_sp64): Likewise.
(*movdf_cc_v9): Likewise.
(*movdf_cc_reg_sp64): Likewise.
(*movtf_cc_hq_v9): Likewise.
(*movtf_cc_reg_hq_sp64): Likewise.
(*movtf_cc_v9): Likewise.
(*movtf_cc_reg_sp64): Likewise.
* config/sparc/sparc.c (sparc_expand_conditional_move): New function.
(sparc_print_operand): Delete 'c' and 'd' handling, no longer used.

git-svn-id: svn+ssh://gcc.gnu.org/svn/gcc/trunk@180542 
138bc75d-0d04-0410-961f-82ee72b054a4
---
 gcc/ChangeLog   |   17 
 gcc/config/sparc/sparc-protos.h |1 +
 gcc/config/sparc/sparc.c|   80 
 gcc/config/sparc/sparc.md   |  197 +--
 4 files changed, 144 insertions(+), 151 deletions(-)

diff --git a/gcc/ChangeLog b/gcc/ChangeLog
index cd1a236..d6d1382 100644
--- a/gcc/ChangeLog
+++ b/gcc/ChangeLog
@@ -1,3 +1,20 @@
+2011-10-26  David S. Miller  
+
+   * config/sparc/sparc-protos.h (sparc_expand_conditional_move): Declare.
+   * config/sparc/sparc.md (movcc, movcc): Call it.
+   (*mov_cc_v9): Normalize to expect operand 0 always in operand 4.
+   (*mov_cc_reg_sp64): Likewise.
+   (*movsf_cc_v9): Likewise.
+   (*movsf_cc_reg_sp64): Likewise.
+   (*movdf_cc_v9): Likewise.
+   (*movdf_cc_reg_sp64): Likewise.
+   (*movtf_cc_hq_v9): Likewise.
+   (*movtf_cc_reg_hq_sp64): Likewise.
+   (*movtf_cc_v9): Likewise.
+   (*movtf_cc_reg_sp64): Likewise.
+   * config/sparc/sparc.c (sparc_expand_conditional_move): New function.
+   (sparc_print_operand): Delete 'c' and 'd' handling, no longer used.
+
 2011-10-26  Eric Botcazou  
 
* reload.c (reload_inner_reg_of_subreg): Change type of return value
diff --git a/gcc/config/sparc/sparc-protos.h b/gcc/config/sparc/sparc-protos.h
index bb6fb07..108e105 100644
--- a/gcc/config/sparc/sparc-protos.h
+++ b/gcc/config/sparc/sparc-protos.h
@@ -107,6 +107,7 @@ extern void sparc_expand_compare_and_swap_12 (rtx, rtx, 
rtx, rtx);
 extern const char *output_v8plus_mult (rtx, rtx *, const char *);
 extern void sparc_expand_vector_init (rtx, rtx);
 extern void sparc_expand_vec_perm_bmask(enum machine_mode, rtx);
+extern bool sparc_expand_conditional_move (enum machine_mode, rtx *);
 #endif /* RTX_CODE */
 
 #endif /* __SPARC_PROTOS_H__ */
diff --git a/gcc/config/sparc/sparc.c b/gcc/config/sparc/sparc.c
index f806c6c..964bcaf 100644
--- a/gcc/config/sparc/sparc.c
+++ b/gcc/config/sparc/sparc.c
@@ -81

Fix gcc.target/i386/pr37843-4.c for targets using leading underscores

2011-10-26 Thread Joseph S. Myers

gcc.target/i386/pr37843-4.c tests for the form calls to a function
"foo" take in the output.  On Windows targets this function has a
leading underscore and so the test fails.  This patch allows for that
underscore in the test.  Tested with cross to i686-mingw32.  OK to
commit?

2011-10-26  Joseph Myers  

* gcc.target/i386/pr37843-4.c: Allow for leading underscores on
symbol names.

Index: gcc.target/i386/pr37843-4.c
===
--- gcc.target/i386/pr37843-4.c (revision 180200)
+++ gcc.target/i386/pr37843-4.c (working copy)
@@ -3,8 +3,8 @@
 /* { dg-options "-O2 -msse2 -mpreferred-stack-boundary=4 -mstackrealign" } */
 /* { dg-require-effective-target sse2 } */
 /* { dg-final { scan-assembler-not "andl\[\\t \]*\\$-16,\[\\t \]*%\[re\]?sp" } 
} */
-/* { dg-final { scan-assembler-not "call\[\\t \]*foo" } } */
-/* { dg-final { scan-assembler "jmp\[\\t \]*foo" } } */
+/* { dg-final { scan-assembler-not "call\[\\t \]*_?foo" } } */
+/* { dg-final { scan-assembler "jmp\[\\t \]*_?foo" } } */
 
 extern int foo (void);
 

-- 
Joseph S. Myers
jos...@codesourcery.com

Skip gcc.target/i386/pad-4.c for Windows targets

2011-10-26 Thread Joseph S. Myers

gcc.target/i386/pad-4.c expects particular results from PIC code
generation - PIC in the sense of a GOT and a PLT, which is different
from the sense in which Windows code is always PIC.  This patch
arranges for it to be skipped on Windows targets, similarly to other
tests such as gcc.target/i386/pic-1.c.  Tested with cross to
i686-mingw32.  OK to commit?

2011-10-26  Joseph Myers  

* gcc.target/i386/pad-4.c: Skip for Windows targets.

Index: gcc.target/i386/pad-4.c
===
--- gcc.target/i386/pad-4.c (revision 180200)
+++ gcc.target/i386/pad-4.c (working copy)
@@ -2,6 +2,7 @@
 /* { dg-require-effective-target ia32 } */
 /* { dg-require-effective-target fpic } */
 /* { dg-skip-if "" { i?86-*-* x86_64-*-* } { "-march=*" } { "-march=atom" } } 
*/
+/* { dg-skip-if "No Windows PIC" { *-*-mingw* *-*-cygwin } { "*" } { "" } } */
 /* { dg-options "-O2 -fomit-frame-pointer -march=atom -fPIC" } */
 /* { dg-final { scan-assembler-times "nop" 8 } } */
 /* { dg-final { scan-assembler-not "rep" } } */

-- 
Joseph S. Myers
jos...@codesourcery.com

Fix gcc.dg/tls/thr-cse-1.c for MinGW target

2011-10-26 Thread Joseph S. Myers

gcc.dg/tls/thr-cse-1.c tests that there are not two calls to
__emutls_get_address in the output.  Normally this just checks for
emutls_get_address.*emutls_get_address, but on some targets where the
compiler output declares the functions called in some way the testcase
has more specific tests.  i?86-*-mingw* produces a .def directive for
___emutls_get_address so also needs such a special case, which this
patch adds (only for i?86-*-mingw* as I don't know what the output
looks like for other Windows targets).  Tested with cross to
i686-mingw32.  OK to commit?

2011-10-26  Joseph Myers  

* gcc.dg/tls/thr-cse-1.c: For i?86-*-mingw*, check for multiple
calls.

Index: gcc.dg/tls/thr-cse-1.c
===
--- gcc.dg/tls/thr-cse-1.c  (revision 180200)
+++ gcc.dg/tls/thr-cse-1.c  (working copy)
@@ -15,9 +15,10 @@
   return a;
 }
 
-/* { dg-final { scan-assembler-not "emutls_get_address.*emutls_get_address.*" 
{ target { ! { "*-wrs-vxworks"  "*-*-darwin8"  "hppa*-*-hpux*" "spu-*-*" } } } 
} } */
+/* { dg-final { scan-assembler-not "emutls_get_address.*emutls_get_address.*" 
{ target { ! { "*-wrs-vxworks"  "*-*-darwin8"  "hppa*-*-hpux*" "spu-*-*" 
"i?86-*-mingw*" } } } } } */
 /* { dg-final { scan-assembler-not 
"call\tL___emutls_get_address.stub.*call\tL___emutls_get_address.stub.*" { 
target "*-*-darwin8" } } } */
 /* { dg-final { scan-assembler-not "(b,l|bl) __emutls_get_address.*(b,l|bl) 
__emutls_get_address.*" { target "hppa*-*-hpux*" } } } */
 /* { dg-final { scan-assembler-not 
"(brsl|brasl)\t__emutls_get_address.*(brsl|brasl)\t__emutls_get_address.*" { 
target spu-*-* } } } */
 /* { dg-final { scan-assembler-not "tls_lookup.*tls_lookup.*" { target 
*-wrs-vxworks } } } */
+/* { dg-final { scan-assembler-not 
"call\t___emutls_get_address.*call\t___emutls_get_address" { target 
"i?86-*-mingw*" } } } */
 

-- 
Joseph S. Myers
jos...@codesourcery.com

Fix gcc.dg/graphite/run-id-1.c for Windows targets

2011-10-26 Thread Joseph S. Myers

The test gcc.dg/graphite/run-id-1.c requires more stack space than
Windows targets provide by default.  This patch arranges for the
appropriate -Wl,--stack option (as in config/mh-mingw and
config/mh-cygwin) to be used by this test on those targets.  Tested
with cross to i686-mingw32.  OK to commit?

2011-10-26  Joseph Myers  

* gcc.dg/graphite/run-id-1.c: Pass -Wl,--stack,12582912 option for
Windows targets.

Index: gcc.dg/graphite/run-id-1.c
===
--- gcc.dg/graphite/run-id-1.c  (revision 180200)
+++ gcc.dg/graphite/run-id-1.c  (working copy)
@@ -1,3 +1,4 @@
+/* { dg-options "-Wl,--stack,12582912" { target *-*-mingw* *-*-cygwin* } } */
 /* { dg-require-effective-target size32plus } */
 
 void abort (void);

-- 
Joseph S. Myers
jos...@codesourcery.com

[trans-mem] fix problems with same body aliases

2011-10-26 Thread Aldy Hernandez

The merge broke base/complete dtor transactional clones that we 
originally implemented here:


http://gcc.gnu.org/ml/gcc-patches/2010-06/msg00590.html

First, ipa_tm_execute() was putting __comp_dtor's into tm_callees, which 
wasn't happening pre-merge.  Handling this __comp_dtor caused an ICE 
tree_function_versioning() because ENTRY_BLOCK_PTR_FOR_FUNCTION is 
invalid, since the CFG field for the original decl's cfun has not been set.


The patch below excludes such aliases from tm_callees since we can't 
handle them in ipa_tm_create_version.


Second, ipa_tm_create_version() marks aliases of the original decl as 
needed in callback_mark_needed().  Unfortunately, this "needed" bit does 
not persist until the end of compilation because whole program IPA will 
remove the bit here:


  /* Frontends and alias code marks nodes as needed before parsing 
is finished.
	 We may end up marking as node external nodes where this flag is 
meaningless

 strip it.  */
  if (node->needed
  && (DECL_EXTERNAL (node->decl) || !node->analyzed))
node->needed = 0;

Consequently, the alias will not show up in the .tm_clone_table.

The patch below sets the "analyzed" bit to keep this from happening. 
With it, g++.dg/tm/alias.C works again.


No regressions elsewhere.

OK?
* trans-mem.c (ipa_tm_execute): Do not include aliases in
tm_callees.
(callback_mark_needed): Set analyzed bit.

Index: trans-mem.c
===
--- trans-mem.c (revision 180439)
+++ trans-mem.c (working copy)
@@ -4137,7 +4137,13 @@ callback_mark_needed (struct cgraph_node
   record_tm_clone_pair (node->decl, tm_alias);
 
   if (info->old_node->needed)
-   cgraph_mark_needed_node (cgraph_get_node (tm_alias));
+   {
+ struct cgraph_node *alias = cgraph_get_node (tm_alias);
+ cgraph_mark_needed_node (alias);
+ /* Needed so function_and_variable_visibility() won't reset
+the needed bit.  */
+ alias->analyzed = 1;
+   }
 }
   return false;
 }
@@ -4592,6 +4598,7 @@ ipa_tm_execute (void)
   /* For all local functions marked tm_callable, queue them.  */
   for (node = cgraph_nodes; node; node = node->next)
 if (is_tm_callable (node->decl)
+   && !node->alias
&& cgraph_function_body_availability (node) >= AVAIL_OVERWRITABLE)
   {
d = get_cg_data (node);

Re: [PATCH,fortran] Reap dead code

2011-10-26 Thread Paul Richard Thomas

Steve,

> Surely, you have Halloween across the Pond, ie., the Grim Reaper. :-)

And what, pray, does the Grim Reaper hold???

The patch is OK.

Thanks

Paul

>
>>
>> On Wed, Oct 26, 2011 at 7:53 PM, Steve Kargl
>>  wrote:
>> > On Sat, Oct 22, 2011 at 01:16:14PM -0700, Steve Kargl wrote:
>> >> The attach patch reaps some code that is now dead
>>
>> I am sure that you are right but could you confirm that is is because
>> of previous patches?
>
> The tests in simplify.c can never be triggered because these
> are already caught in check.c.  Consider,
>
>  program a
>  integer i
>  i = 12234
>  print *, ibclr(i, 123)
>  end program a
>
> laptop:kargl[207] gfc4x -o z a.f90
> a.f90:4.20:
>
>  print *, ibclr(i, 123)
>                    1
> Error: 'pos' at (1) must be less than BIT_SIZE('i')
>
> This error message comes from check.c(gfc_check_bitfcn), and
> it is the error message I get with 4.5.x without my dead.diff
> patch.  The relevant lines of code in gfc_check_bitfcn are
>
>  if (nonnegative_check ("pos", pos) == FAILURE)
>    return FAILURE;
>
>  if (less_than_bitsize1 ("i", i, "pos", pos, false) == FAILURE)
>    return FAILURE;
>
> This is the chunk of code I removed from simplify.c(gfc_simplify_ibclr)
>
> -  if (gfc_extract_int (y, &pos) != NULL || pos < 0)
> -    {
> -      gfc_error ("Invalid second argument of IBCLR at %L", &y->where);
> -      return &gfc_bad_expr;
> -    }
> +  gfc_extract_int (y, &pos);
>
> The if-statement can never be true.  In fact, for the above Fortran
> code, gfc_simplify_ibclr is never executed beyonds its first line:
>
>  if (x->expr_type != EXPR_CONSTANT || y->expr_type != EXPR_CONSTANT)
>    return NULL;
>
> Here, x is 'i' from the program and y is 'pos'.  So, both arguments
> to ibclr must be constant for the code that I removed to even have
> a chance to execute.
>
> The other chunks of code I removed are similar in nature.  The
> checking done by code in check.c makes the checking done in
> simplify.c useless bloat.  I'm just cleaning up the mess left
> behind the person, who introduced gfc_check_bitfcn.
>
> laptop:kargl[217] svn log -r 160492 check.c |more
> 
> r160492 | kargl | 2010-06-09 09:24:59 -0700 (Wed, 09 Jun 2010) | 23 lines
>
> 2010-06-09  Steven G. Kargl  
>
>        * fortran/intrinsic.c (add_functions): Change gfc_check_btest,
>        gfc_check_ibclr, and gfc_check_ibset to gfc_check_bitfcn.
>        * fortran/intrinsic.h: Remove prototypes for gfc_check_btest,
>        gfc_check_ibclr, and gfc_check_ibset.  Add prototype for
>        gfc_check_bitfcn.
>        * fortran/check.c (nonnegative_check, less_than_bitsize1,
>        less_than_bitsize2): New functions.
>        (gfc_check_btest): Renamed to gfc_check_bitfcn.  Use
>        nonnegative_check and less_than_bitsize1.
>        (gfc_check_ibclr, gfc_check_ibset): Removed.
>        (gfc_check_ibits,gfc_check_mvbits): Use nonnegative_check and
>        less_than_bitsize1.
>
>> >> due to my recent changes for ishftc in check.c.
>> >> Regression tested on i686-*-freebsd.
>>
>> It looks to be OK, subject to the above, and is, I would guess,
>> "obvious" in any case.
>>
>> Thanks for the patch
>
> My regression test on i686-*-freebsd.
>
>                === gfortran Summary ===
>
> # of expected passes            39552
> # of unexpected failures        8
> # of unexpected successes       16
> # of expected failures          41
> # of unsupported tests          212
>
> The 8 failures are due to entry_4.f90 and select_type_12.f03.
>
> --
> Steve
>



-- 
The knack of flying is learning how to throw yourself at the ground and miss.
       --Hitchhikers Guide to the Galaxy

[Ada] Internal error on elaboration variable with -gnatct

2011-10-26 Thread Eric Botcazou

This is an internal error on the elaboration variable generated for an array of 
discriminated record type whose discriminant is constraint by a dynamic upper 
bound in -gnatct mode.  In this mode, the compiler doesn't generate code and 
gigi is invoked only to lay out types and back-annotate type information.

Tested on i586-suse-linux, applied on the mainline.


2011-10-26  Eric Botcazou  

* gcc-interface/decl.c (elaborate_expression_1): Add EXPR_PUBLIC_P local
variable.  Always create the elaboration variable, if any, as constant.


2011-10-26  Eric Botcazou  

* gnat.dg/specs/discr1.ads: New test.
* gnat.dg/specs/discr1_pkg.ads: New helper.


-- 
Eric Botcazou
Index: gcc-interface/decl.c
===
--- gcc-interface/decl.c	(revision 180540)
+++ gcc-interface/decl.c	(working copy)
@@ -6085,7 +6085,8 @@ static tree
 elaborate_expression_1 (tree gnu_expr, Entity_Id gnat_entity, tree gnu_name,
 			bool definition, bool need_debug)
 {
-  const bool expr_global_p = Is_Public (gnat_entity) || global_bindings_p ();
+  const bool expr_public_p = Is_Public (gnat_entity);
+  const bool expr_global_p = expr_public_p || global_bindings_p ();
   bool expr_variable_p, use_variable;
 
   /* In most cases, we won't see a naked FIELD_DECL because a discriminant
@@ -6153,11 +6154,10 @@ elaborate_expression_1 (tree gnu_expr, E
   if (use_variable || need_debug)
 {
   tree gnu_decl
-	= create_var_decl (create_concat_name (gnat_entity,
-	   IDENTIFIER_POINTER (gnu_name)),
-			   NULL_TREE, TREE_TYPE (gnu_expr), gnu_expr,
-			   !need_debug, Is_Public (gnat_entity),
-			   !definition, expr_global_p, NULL, gnat_entity);
+	= create_var_decl_1
+	  (create_concat_name (gnat_entity, IDENTIFIER_POINTER (gnu_name)),
+	   NULL_TREE, TREE_TYPE (gnu_expr), gnu_expr, true, expr_public_p,
+	   !definition, expr_global_p, !need_debug, NULL, gnat_entity);
 
   if (use_variable)
 	return gnu_decl;
-- { dg-do compile }
-- { dg-options "-gnatct" }

with Discr1_Pkg; use Discr1_Pkg;

package Discr1 is

  procedure Proc (V : Variable_String_Array);

end Discr1;
package Discr1_Pkg is

  Maximum_Length : Natural := 80 ;

  subtype String_Length is Natural range 0 .. Maximum_Length;

  type Variable_String (Length : String_Length := 0) is
record
  S : String (1 .. Length);
end record;

  type Variable_String_Array is array (Natural range <>) of Variable_String;

end Discr1_Pkg;

Re: [PATCH,fortran] Reap dead code

2011-10-26 Thread Steve Kargl

On Wed, Oct 26, 2011 at 09:55:09PM +0200, Paul Richard Thomas wrote:
> Dear Steve,
> 
> Reaping implies that there is something about it that you want to keep
> :-)  Surely, weeding or herbicide spraying is better than reaping?

Surely, you have Halloween across the Pond, ie., the Grim Reaper. :-)

> 
> On Wed, Oct 26, 2011 at 7:53 PM, Steve Kargl
>  wrote:
> > On Sat, Oct 22, 2011 at 01:16:14PM -0700, Steve Kargl wrote:
> >> The attach patch reaps some code that is now dead
> 
> I am sure that you are right but could you confirm that is is because
> of previous patches?

The tests in simplify.c can never be triggered because these
are already caught in check.c.  Consider,

  program a
  integer i
  i = 12234
  print *, ibclr(i, 123)
  end program a

laptop:kargl[207] gfc4x -o z a.f90
a.f90:4.20:

  print *, ibclr(i, 123)
1
Error: 'pos' at (1) must be less than BIT_SIZE('i')

This error message comes from check.c(gfc_check_bitfcn), and
it is the error message I get with 4.5.x without my dead.diff
patch.  The relevant lines of code in gfc_check_bitfcn are

  if (nonnegative_check ("pos", pos) == FAILURE)
return FAILURE;

  if (less_than_bitsize1 ("i", i, "pos", pos, false) == FAILURE)
return FAILURE;

This is the chunk of code I removed from simplify.c(gfc_simplify_ibclr) 

-  if (gfc_extract_int (y, &pos) != NULL || pos < 0)
-{
-  gfc_error ("Invalid second argument of IBCLR at %L", &y->where);
-  return &gfc_bad_expr;
-}
+  gfc_extract_int (y, &pos);

The if-statement can never be true.  In fact, for the above Fortran
code, gfc_simplify_ibclr is never executed beyonds its first line:

  if (x->expr_type != EXPR_CONSTANT || y->expr_type != EXPR_CONSTANT)
return NULL;

Here, x is 'i' from the program and y is 'pos'.  So, both arguments
to ibclr must be constant for the code that I removed to even have
a chance to execute.

The other chunks of code I removed are similar in nature.  The 
checking done by code in check.c makes the checking done in
simplify.c useless bloat.  I'm just cleaning up the mess left
behind the person, who introduced gfc_check_bitfcn.

laptop:kargl[217] svn log -r 160492 check.c |more

r160492 | kargl | 2010-06-09 09:24:59 -0700 (Wed, 09 Jun 2010) | 23 lines

2010-06-09  Steven G. Kargl  

* fortran/intrinsic.c (add_functions): Change gfc_check_btest,
gfc_check_ibclr, and gfc_check_ibset to gfc_check_bitfcn.
* fortran/intrinsic.h: Remove prototypes for gfc_check_btest,
gfc_check_ibclr, and gfc_check_ibset.  Add prototype for
gfc_check_bitfcn.
* fortran/check.c (nonnegative_check, less_than_bitsize1, 
less_than_bitsize2): New functions.
(gfc_check_btest): Renamed to gfc_check_bitfcn.  Use
nonnegative_check and less_than_bitsize1.
(gfc_check_ibclr, gfc_check_ibset): Removed.
(gfc_check_ibits,gfc_check_mvbits): Use nonnegative_check and
less_than_bitsize1.

> >> due to my recent changes for ishftc in check.c.
> >> Regression tested on i686-*-freebsd.
> 
> It looks to be OK, subject to the above, and is, I would guess,
> "obvious" in any case.
> 
> Thanks for the patch

My regression test on i686-*-freebsd.

=== gfortran Summary ===

# of expected passes39552
# of unexpected failures8
# of unexpected successes   16
# of expected failures  41
# of unsupported tests  212

The 8 failures are due to entry_4.f90 and select_type_12.f03.

-- 
Steve

[Ada] Internal error on unchecked union and representation clause

2011-10-26 Thread Eric Botcazou

The compiler aborts on a record type which contains a component of an unchecked 
union type whose size is smaller than 64 bits and which is subject to a 
representation clause that causes it not to start on a byte boundary.

We already have the machinery to handle this kind of composite component, but 
it was enabled only for regular record types.

Tested on i586-suse-linux, applied on the mainline.


2011-10-26  Eric Botcazou  

* gcc-interface/decl.c (gnat_to_gnu_entity) : Try to
make a packable type for fields of union types as well.
: Use RECORD_OR_UNION_TYPE_P predicate.
(gnat_to_gnu_component_type): Try to make a packable type for fields
of union types as well.
(make_packable_type): Use RECORD_OR_UNION_TYPE_P predicate.
(maybe_pad_type): Try to make a packable type for fields of union types
as well.
(gnat_to_gnu_field): Likewise.
(is_variable_size): Use RECORD_OR_UNION_TYPE_P predicate.
(set_rm_size): Likewise.
(rm_size): Likewise.
* gcc-interface/misc.c (gnat_type_max_size): Likewise.
* gcc-interface/trans.c (add_decl_expr): Likewise.
* gcc-interface/utils.c (finish_record_type): Likewise.
* gcc-interface/utils2.c (build_simple_component_ref): Likewise.


2011-10-26  Eric Botcazou  

* gnat.dg/specs/unchecked_union.ads: Rename to...
* gnat.dg/specs/unchecked_union1.ads: ...this.
* gnat.dg/specs/unchecked_union2.ads: New test.


-- 
Eric Botcazou
-- { dg-do compile }

package Unchecked_Union2 is

   type Small_Int is range 0 .. 2**19 - 1;

   type R1 (B : Boolean := True) is record
  case B is
 when True  => Data1 : Small_Int;
 when False => Data2 : Small_Int;
  end case;
   end record;

   for R1 use record
  Data1 at 0 range 0 .. 18;
  Data2 at 0 range 0 .. 18;
   end record;
   for R1'Size use 24;

   pragma Unchecked_Union (R1);

   type R2 is record
 Data : R1;
   end record;

   for R2 use record
 Data at 0 range 3 .. 26;
   end record;

end Unchecked_Union2;
Index: gcc-interface/decl.c
===
--- gcc-interface/decl.c	(revision 180539)
+++ gcc-interface/decl.c	(working copy)
@@ -3302,7 +3302,7 @@ gnat_to_gnu_entity (Entity_Id gnat_entit
 			== INTEGER_CST)
 		  {
 			gnu_size = DECL_SIZE (gnu_old_field);
-			if (TREE_CODE (gnu_field_type) == RECORD_TYPE
+			if (RECORD_OR_UNION_TYPE_P (gnu_field_type)
 			&& !TYPE_FAT_POINTER_P (gnu_field_type)
 			&& host_integerp (TYPE_SIZE (gnu_field_type), 1))
 			  gnu_field_type
@@ -4645,13 +4645,11 @@ gnat_to_gnu_entity (Entity_Id gnat_entit
 	  tree size;
 
 	  /* If a size was specified, take it into account.  Otherwise
-		 use the RM size for records as the type size has already
-		 been adjusted to the alignment.  */
+		 use the RM size for records or unions as the type size has
+		 already been adjusted to the alignment.  */
 	  if (gnu_size)
 		size = gnu_size;
-	  else if ((TREE_CODE (gnu_type) == RECORD_TYPE
-			|| TREE_CODE (gnu_type) == UNION_TYPE
-			|| TREE_CODE (gnu_type) == QUAL_UNION_TYPE)
+	  else if (RECORD_OR_UNION_TYPE_P (gnu_type)
 		   && !TYPE_FAT_POINTER_P (gnu_type))
 		size = rm_size (gnu_type);
 	  else
@@ -5300,7 +5298,7 @@ gnat_to_gnu_component_type (Entity_Id gn
   && !Is_Bit_Packed_Array (gnat_array)
   && !Has_Aliased_Components (gnat_array)
   && !Strict_Alignment (gnat_type)
-  && TREE_CODE (gnu_type) == RECORD_TYPE
+  && RECORD_OR_UNION_TYPE_P (gnu_type)
   && !TYPE_FAT_POINTER_P (gnu_type)
   && host_integerp (TYPE_SIZE (gnu_type), 1))
 gnu_type = make_packable_type (gnu_type, false);
@@ -6357,9 +6355,7 @@ make_packable_type (tree type, bool in_r
   tree new_field_type = TREE_TYPE (old_field);
   tree new_field, new_size;
 
-  if ((TREE_CODE (new_field_type) == RECORD_TYPE
-	   || TREE_CODE (new_field_type) == UNION_TYPE
-	   || TREE_CODE (new_field_type) == QUAL_UNION_TYPE)
+  if (RECORD_OR_UNION_TYPE_P (new_field_type)
 	  && !TYPE_FAT_POINTER_P (new_field_type)
 	  && host_integerp (TYPE_SIZE (new_field_type), 1))
 	new_field_type = make_packable_type (new_field_type, true);
@@ -6369,9 +6365,7 @@ make_packable_type (tree type, bool in_r
 	 packable version of the record type, see finish_record_type.  */
   if (!DECL_CHAIN (old_field)
 	  && !TYPE_PACKED (type)
-	  && (TREE_CODE (new_field_type) == RECORD_TYPE
-	  || TREE_CODE (new_field_type) == UNION_TYPE
-	  || TREE_CODE (new_field_type) == QUAL_UNION_TYPE)
+	  && RECORD_OR_UNION_TYPE_P (new_field_type)
 	  && !TYPE_FAT_POINTER_P (new_field_type)
 	  && !TYPE_CONTAINS_TEMPLATE_P (new_field_type)
 	  && TYPE_ADA_SIZE (new_field_type))
@@ -6533,7 +6527,7 @@ maybe_pad_type (tree type, tree size, un
  between them and it might be hard to overcome afterwards, including
  at the RTL level when the stand-alone

Re: [C++ Patch / RFC] PR 50864

2011-10-26 Thread Paolo Carlini


On 10/26/2011 10:30 PM, Paolo Carlini wrote:

Hi,
At a glance, it looks too early; it's valid to have 
namespace-qualified names after ->.


namespace A
{
  struct B
  {
int i;
  };
};

A::B* b;
int i = b->A::B::i;
I was also trying to construct such kind of example myself... but my 
patch does not regress on the testcase you wrote down. I can tell you 
exactly why, if you like..
We have that parser->scope is a RECORD_TYPE and postfix_expression is an 
INDIRECT_REF.


Paolo.

[Ada] Missing error for invalid atomic component

2011-10-26 Thread Eric Botcazou

The compiler should issue the "atomic access cannot be guaranteed" error twice 
on the attached testcase, but it only issues it for the stand-alone variable.

Fixed thusly, tested on i586-suse-linux, applied on the mainline.


2011-10-26  Eric Botcazou  

* gcc-interface/decl.c (gnat_to_gnu_field): Always check components
declared as atomic.  Move around conditionally executed code.


2011-10-26  Eric Botcazou  

* gnat.dg/specs/atomic1.ads: New test.


-- 
Eric Botcazou
Index: gcc-interface/decl.c
===
--- gcc-interface/decl.c	(revision 180423)
+++ gcc-interface/decl.c	(working copy)
@@ -6853,10 +6853,8 @@ gnat_to_gnu_field (Entity_Id gnat_field,
 	}
 }
 
-  /* If we are packing the record and the field is BLKmode, round the
- size up to a byte boundary.  */
-  if (packed && TYPE_MODE (gnu_field_type) == BLKmode && gnu_size)
-gnu_size = round_up (gnu_size, BITS_PER_UNIT);
+  if (Is_Atomic (gnat_field))
+check_ok_for_atomic (gnu_field_type, gnat_field, false);
 
   if (Present (Component_Clause (gnat_field)))
 {
@@ -6946,9 +6944,6 @@ gnat_to_gnu_field (Entity_Id gnat_field,
 	  gnu_pos = NULL_TREE;
 	}
 	}
-
-  if (Is_Atomic (gnat_field))
-	check_ok_for_atomic (gnu_field_type, gnat_field, false);
 }
 
   /* If the record has rep clauses and this is the tag field, make a rep
@@ -6961,7 +6956,14 @@ gnat_to_gnu_field (Entity_Id gnat_field,
 }
 
   else
-gnu_pos = NULL_TREE;
+{
+  gnu_pos = NULL_TREE;
+
+  /* If we are packing the record and the field is BLKmode, round the
+	 size up to a byte boundary.  */
+  if (packed && TYPE_MODE (gnu_field_type) == BLKmode && gnu_size)
+	gnu_size = round_up (gnu_size, BITS_PER_UNIT);
+}
 
   /* We need to make the size the maximum for the type if it is
  self-referential and an unconstrained type.  In that case, we can't
-- { dg-do compile }

package Atomic1 is

  type Arr is array (Integer range <>) of Boolean;
  type UA is access all Arr;

  U : UA;
  pragma Atomic (U);  -- { dg-error "atomic access" }

  type R is record
U : UA;
pragma Atomic (U);  -- { dg-error "atomic access" }
  end record;

end Atomic1;

Re: [C++ Patch / RFC] PR 50864

2011-10-26 Thread Paolo Carlini


Hi,
At a glance, it looks too early; it's valid to have 
namespace-qualified names after ->.


namespace A
{
  struct B
  {
int i;
  };
};

A::B* b;
int i = b->A::B::i;
I was also trying to construct such kind of example myself... but my 
patch does not regress on the testcase you wrote down. I can tell you 
exactly why, if you like..


Paolo.

Re: [C++ Patch / RFC] PR 50864

2011-10-26 Thread Jason Merrill


On 10/26/2011 04:04 PM, Paolo Carlini wrote:

The below tries to catch the problem very early, in
cp_parser_postfix_dot_deref_expression and apparently works fine, passes
the testsuite, etc. Is it too early? Is the check tight enough?


At a glance, it looks too early; it's valid to have namespace-qualified 
names after ->.


namespace A
{
  struct B
  {
int i;
  };
};

A::B* b;
int i = b->A::B::i;

Jason

Re: [C++-11] User defined literals

2011-10-26 Thread Jason Merrill


On 10/26/2011 04:11 PM, 3dw...@verizon.net wrote:

Thank you Jason and Tom for your help in getting this together an putting up 
with my slowness.


Thanks for all your work on this!

Jason

Re: Use of vector instructions in memmov/memset expanding

2011-10-26 Thread Michael Zolotukhin

Any questions on these patches? Are they ok for the trunk?

On 20 October 2011 12:37, Michael Zolotukhin
 wrote:
> And, finally, part with the tests.
>
> On 20 October 2011 12:36, Michael Zolotukhin
>  wrote:
>> Back-end part of the patch is attached here.
>>
>> On 20 October 2011 12:35, Michael Zolotukhin
>>  wrote:
>>> Middle-end part of the patch is attached.
>>>
>>> On 20 October 2011 12:34, Michael Zolotukhin
>>>  wrote:
 I fixed the tests as well as updated my branch and fixed introduced
 during this process bugs.
 Here is fixed complete patch (other parts will be sent in consequent 
 letters).

 The changes passed bootstrap and make check.

 On 29 September 2011 15:21, Jakub Jelinek  wrote:
> Hi!
>
> On Thu, Sep 29, 2011 at 03:14:40PM +0400, Michael Zolotukhin wrote:
> +/* { dg-options "-O2 -march=atom -mtune=atom -m64 -dp" } */
>
> The testcases are wrong, -m64 or -m32 should never appear in dg-options,
> instead if the testcase is specific to -m64, it should be guarded with
> /* { dg-do compile { target lp64 } } */
> resp. ia32 (or ilp32, depending on what exactly should be done for -mx32),
> if you have the same testcase for -m32 and -m64, but just want different
> scan-assembler for the two cases, then just guard the scan-assembler
> with lp64 resp. ia32/ilp32 target and add second one for the other target.
>
>        Jakub

 --
 ---
 Best regards,
 Michael V. Zolotukhin,
 Software Engineer
 Intel Corporation.

>>>
>>>
>>>
>>> --
>>> ---
>>> Best regards,
>>> Michael V. Zolotukhin,
>>> Software Engineer
>>> Intel Corporation.
>>>
>>
>>
>>
>> --
>> ---
>> Best regards,
>> Michael V. Zolotukhin,
>> Software Engineer
>> Intel Corporation.
>>
>
>
>
> --
> ---
> Best regards,
> Michael V. Zolotukhin,
> Software Engineer
> Intel Corporation.
>



-- 
---
Best regards,
Michael V. Zolotukhin,
Software Engineer
Intel Corporation.

Re: Re: [C++-11] User defined literals

2011-10-26 Thread 3dw4rd




Oct 26, 2011 03:39:09 PM, ja...@redhat.com wrote:

On 10/26/2011 02:00 AM, Ed Smith-Rowland wrote:
> The patch was bootstrapped and regtested on x86_64-linux-gnu.

Really?  I ran into a warning about the unused "suffix" parameter to 
interpret_integer.  So I've fixed that error.  I also added a couple of 
comments, and implemented the change to check_literal_operator_args that 
I wondered about a while back.  And checked it all in.

But we aren't quite done, I think: I notice that the lookup of operators 
doesn't match what's in 2.14.8.  For instance, I don't think this should 
be accepted:

double operator"" _foo (long long unsigned);
double d = 1.2_foo;

The lookup described in 2.14.8 involves looking through the overload set 
for a particular signature before doing normal overload resolution.

Also, we don't need to worry about argument-dependent lookup for these 
operators, since none of the arguments can have associated namespaces. 
So I think we can use lookup_name rather than lookup_function_nonclass, 
only look it up once in cp_userdef_numeric_literal, and then only build 
one call depending on the contents of the overload set.

Jason

Jason,

Thank you Jason and Tom for your help in getting this together an putting up 
with my slowness.
That warning about unused suffix didn't blow up the build.  I'm surprised 
Werror didn't kill it.

I'll check out these other issues as soon as I clean out all my work and 
rebuild on a clean tree.

I guess I need to look at what kind of number I get when processing a numeric 
literal.

Thanks again.

[C++ Patch / RFC] PR 50864

2011-10-26 Thread Paolo Carlini


Hi,

one more / RFC, for the ICE on invalid part of these issues with '->'.

The below tries to catch the problem very early, in 
cp_parser_postfix_dot_deref_expression and apparently works fine, passes 
the testsuite, etc. Is it too early? Is the check tight enough?


Thanks,
Paolo.

//
Index: testsuite/g++.dg/template/crash109.C
===
--- testsuite/g++.dg/template/crash109.C(revision 0)
+++ testsuite/g++.dg/template/crash109.C(revision 0)
@@ -0,0 +1,10 @@
+// PR c++/50864
+
+namespace impl
+{
+  template  T create();
+}
+
+template () -> impl::create())>  // { dg-error "not 
a class member" } 
+struct foo;
Index: cp/parser.c
===
--- cp/parser.c (revision 180532)
+++ cp/parser.c (working copy)
@@ -5673,6 +5673,15 @@ cp_parser_postfix_dot_deref_expression (cp_parser
{
  if (name != error_mark_node && !BASELINK_P (name) && parser->scope)
{
+ if (TREE_CODE (parser->scope) == NAMESPACE_DECL
+ && TREE_CODE (postfix_expression) == ARROW_EXPR)
+   {
+ error_at (token->location, "%<%D::%D%> is not a class member",
+   parser->scope, name);
+ parser->context->object_type = NULL_TREE;
+ return error_mark_node;
+   }
+
  name = build_qualified_name (/*type=*/NULL_TREE,
   parser->scope,
   name,

Re: [PATCH,fortran] Reap dead code

2011-10-26 Thread Paul Richard Thomas

Dear Steve,

Reaping implies that there is something about it that you want to keep
:-)  Surely, weeding or herbicide spraying is better than reaping?

On Wed, Oct 26, 2011 at 7:53 PM, Steve Kargl
 wrote:
> On Sat, Oct 22, 2011 at 01:16:14PM -0700, Steve Kargl wrote:
>> The attach patch reaps some code that is now dead

I am sure that you are right but could you confirm that is is because
of previous patches?

>> due to my recent changes for ishftc in check.c.
>> Regression tested on i686-*-freebsd.

It looks to be OK, subject to the above, and is, I would guess,
"obvious" in any case.

Thanks for the patch

Paul

Re: [PATCH 0/6] More vector permutation work

2011-10-26 Thread Iain Sandoe



On 26 Oct 2011, at 17:01, Richard Henderson wrote:


On 10/26/2011 07:30 AM, Ulrich Weigand wrote:
This fails since for u == 4 and mode == V4SFmode it attempts to  
expand

a V4SFmode shift, which is unsupported.

Shouldn't this be using the mode of the selector rather than the mode
of the result in any case?


Yes, it should use the mode of the selector.  And doing just that is
enough to fix the bug.  But I noticed that the actual results for a
constant permutation were much nastier than they ought to be.

Try this.

Iain, this might solve your case too;


yes - bootstrap incl fortran just finished (not the quickest machine ;))


no I can't test myself off cross,
because libgfortran needs the whole cross-env -- system headers and  
everything.


Ah yes .. as does Ada .. and ObjC.

I recently (when trying to figure out what was broken w.ppc Ada) did  
some Darwin crosses from Linux, not too too painful to set up (but  
then I have the stuff to hand),  but in essence prob. no worse than d/ 
l a bunch of pkg or whatever to set up a foreign linux cross-root.


cheers
Iain

Re: [C++-11] User defined literals

2011-10-26 Thread Jason Merrill


On 10/26/2011 02:00 AM, Ed Smith-Rowland wrote:

The patch was bootstrapped and regtested on x86_64-linux-gnu.


Really?  I ran into a warning about the unused "suffix" parameter to 
interpret_integer.  So I've fixed that error.  I also added a couple of 
comments, and implemented the change to check_literal_operator_args that 
I wondered about a while back.  And checked it all in.


But we aren't quite done, I think: I notice that the lookup of operators 
doesn't match what's in 2.14.8.  For instance, I don't think this should 
be accepted:


double operator"" _foo (long long unsigned);
double d = 1.2_foo;

The lookup described in 2.14.8 involves looking through the overload set 
for a particular signature before doing normal overload resolution.


Also, we don't need to worry about argument-dependent lookup for these 
operators, since none of the arguments can have associated namespaces. 
So I think we can use lookup_name rather than lookup_function_nonclass, 
only look it up once in cp_userdef_numeric_literal, and then only build 
one call depending on the contents of the overload set.


Jason
commit e2fe821867bdc55a300f400fe8340a8d7168fb46
Author: Jason Merrill 
Date:   Wed Oct 26 13:38:05 2011 -0400

interpret_int

diff --git a/gcc/c-family/c-lex.c b/gcc/c-family/c-lex.c
index 46c0340..baee8eb 100644
--- a/gcc/c-family/c-lex.c
+++ b/gcc/c-family/c-lex.c
@@ -44,7 +44,7 @@ static splay_tree file_info_tree;
 int pending_lang_change; /* If we need to switch languages - C++ only */
 int c_header_level;	 /* depth in C headers - C++ only */
 
-static tree interpret_integer (const cpp_token *, unsigned int, const char *);
+static tree interpret_integer (const cpp_token *, unsigned int);
 static tree interpret_float (const cpp_token *, unsigned int, const char *);
 static tree interpret_fixed (const cpp_token *, unsigned int);
 static enum integer_type_kind narrowest_unsigned_type
@@ -329,7 +329,7 @@ c_lex_with_flags (tree *value, location_t *loc, unsigned char *cpp_flags,
 	   Set PURE_ZERO to pass this information to the C++ parser.  */
 	if (tok->val.str.len == 1 && *tok->val.str.text == '0')
 	  add_flags = PURE_ZERO;
-	*value = interpret_integer (tok, flags, suffix);
+	*value = interpret_integer (tok, flags);
 	break;
 
 	  case CPP_N_FLOATING:
@@ -584,11 +584,9 @@ narrowest_signed_type (unsigned HOST_WIDE_INT low,
   return itk_none;
 }
 
-/* Interpret TOKEN, an integer with FLAGS as classified by cpplib.
-   For C++0X SUFFIX may contain a user-defined literal suffix.  */
+/* Interpret TOKEN, an integer with FLAGS as classified by cpplib.  */
 static tree
-interpret_integer (const cpp_token *token, unsigned int flags,
-		   const char *suffix)
+interpret_integer (const cpp_token *token, unsigned int flags)
 {
   tree value, type;
   enum integer_type_kind itk;
commit 9c14fb264df67fdc3a28b47c67991345af634d85
Author: Jason Merrill 
Date:   Wed Oct 26 12:36:58 2011 -0400

build_string comments

diff --git a/gcc/c-family/c-lex.c b/gcc/c-family/c-lex.c
index baee8eb..7b220ab 100644
--- a/gcc/c-family/c-lex.c
+++ b/gcc/c-family/c-lex.c
@@ -344,6 +344,8 @@ c_lex_with_flags (tree *value, location_t *loc, unsigned char *cpp_flags,
 	  {
 	tree suffix_id = get_identifier (suffix);
 	int len = tok->val.str.len - strlen (suffix);
+	/* If this is going to be used as a C string to pass to a
+	   raw literal operator, we need to add a trailing NUL.  */
 	tree num_string = build_string (len + 1,
 	(const char *) tok->val.str.text);
 	TREE_TYPE (num_string) = char_array_type_node;
diff --git a/gcc/cp/decl.c b/gcc/cp/decl.c
index 5ba5008..860556c 100644
--- a/gcc/cp/decl.c
+++ b/gcc/cp/decl.c
@@ -8592,7 +8592,7 @@ grokdeclarator (const cp_declarator *declarator,
   error ("declaration of %qD as non-function", dname);
   return error_mark_node;
 }
- 
+
   if (dname
   && TREE_CODE (dname) == IDENTIFIER_NODE
   && UDLIT_OPER_P (dname)
diff --git a/gcc/cp/parser.c b/gcc/cp/parser.c
index 840a30d..090482c 100644
--- a/gcc/cp/parser.c
+++ b/gcc/cp/parser.c
@@ -3667,6 +3667,7 @@ cp_parser_userdef_numeric_literal (cp_parser *parser)
 /* Parse a user-defined string constant.  Returns a call to a user-defined
literal operator taking a character pointer and the length of the string
as arguments.  */
+
 static tree
 cp_parser_userdef_string_literal (cp_token *token)
 {
diff --git a/gcc/tree.c b/gcc/tree.c
index 64c4968..2cbd68b 100644
--- a/gcc/tree.c
+++ b/gcc/tree.c
@@ -1525,6 +1525,7 @@ build_real_from_int_cst (tree type, const_tree i)
 
 /* Return a newly constructed STRING_CST node whose value is
the LEN characters at STR.
+   Note that for a C string literal, LEN should include the trailing NUL.
The TREE_TYPE is not initialized.  */
 
 tree
commit d083a0d7f94fb0fe3605d499366b1b637e169c17
Author: Jason Merrill 
Date:   Wed Oct 26 12:37:32 2011 -0400

	* typeck.c (check_literal_operator_args): Avoid building types.

diff --gi

[PATCH] Pass through jump functions for addressable (scalar) parameters

2011-10-26 Thread Martin Jambor

Hi,

Honza requested that we do attempt to produce pass-through jump
functions even when the actual formal parameter that is being passed
on is addressable - provided that we can prove it has not changed
value, of course.  The following patch does this, hopefully the
limitations of our parameter-not-modified mechanism are not too strict
for this purpose.

The patch is supposed to be applied on top of the one I have just sent
that renames all parm_infos to parm_ainfos.  I have successfully
bootstrapped and tested it on x86_64-linux.

OK for trunk?

Thanks,

Martin



2011-10-26  Martin Jambor  

* ipa-prop.c (mark_modified): Moved up in the file.
(is_parm_modified_before_call): Renamed to
is_parm_modified_before_stmt, moved up in the file.
(load_from_unmodified_param): New function.
(compute_complex_assign_jump_func): Also attempt to create pass
through jump functions for values loaded from (addressable)
parameters.

* testsuite/gcc.dg/ipa/ipcp-4.c: New test.

Index: src/gcc/testsuite/gcc.dg/ipa/ipcp-4.c
===
--- /dev/null
+++ src/gcc/testsuite/gcc.dg/ipa/ipcp-4.c
@@ -0,0 +1,68 @@
+/* Test that IPA-CP is able to produce a pass-through jump function for the
+   call of g1 and g2 even though a is addressable.  Also test that h is not
+   cloned.  */
+
+/* { dg-do compile } */
+/* { dg-options "-O3 -fipa-cp -fipa-cp-clone -fdump-ipa-cp 
-fno-early-inlining"  } */
+/* { dg-add-options bind_pic_locally } */
+
+extern void use_stuff (int);
+extern void use_pointer (int *);
+
+static int
+h (int a, int b)
+{
+  int i;
+
+  for (i = 8; i <= b; i++)
+use_stuff (a+8);
+}
+
+static int
+g1 (int a, int b)
+{
+  int i;
+
+  for (i = 0; i <= b; i++)
+use_pointer (&a);
+  h (a, b);
+}
+
+static int
+g2 (int a, int b)
+{
+  int i;
+
+  for (i = 4; i <= b; i += 2)
+use_stuff (a);
+}
+
+
+static void
+f (int a, int z)
+{
+  if (z > 1)
+g1 (a, z);
+  else
+g2 (a + 4, z);
+  use_pointer (&a);
+}
+
+int
+main (int argc, char *argv[])
+{
+  int i;
+  for (i = 0; i < 100; i++)
+f (7, argc);
+  return 0;
+}
+
+
+/* { dg-final { scan-ipa-dump "Creating a specialized node of g1.*for all 
known contexts" "cp" } } */
+/* { dg-final { scan-ipa-dump "Creating a specialized node of g2.*for all 
known contexts" "cp" } } */
+/* { dg-final { scan-ipa-dump-not "Creating a specialized node of h.*for all 
known contexts" "cp" } } */
+/* { dg-final { scan-ipa-dump-times "replacing param a with const 7" 2 "cp"  } 
} */
+/* { dg-final { scan-ipa-dump "replacing param a with const 11" "cp"  } } */
+/* { dg-final { cleanup-ipa-dump "cp" } } */
+
+
Index: src/gcc/ipa-prop.c
===
--- src.orig/gcc/ipa-prop.c
+++ src/gcc/ipa-prop.c
@@ -419,31 +419,105 @@ detect_type_change_ssa (tree arg, gimple
   return detect_type_change (arg, arg, call, jfunc, 0);
 }
 
+/* Callback of walk_aliased_vdefs.  Flags that it has been invoked to the
+   boolean variable pointed to by DATA.  */
+
+static bool
+mark_modified (ao_ref *ao ATTRIBUTE_UNUSED, tree vdef ATTRIBUTE_UNUSED,
+void *data)
+{
+  bool *b = (bool *) data;
+  *b = true;
+  return true;
+}
+
+/* Return true if the formal parameter PARM might have been modified in this
+   function before reaching the statement STMT.  PARM_AINFO is a pointer to a
+   structure containing temporary information about PARM.  */
+
+static bool
+is_parm_modified_before_stmt (struct param_analysis_info *parm_ainfo,
+ gimple stmt, tree parm)
+{
+  bool modified = false;
+  ao_ref refd;
+
+  if (parm_ainfo->modified)
+return true;
+
+  ao_ref_init (&refd, parm);
+  walk_aliased_vdefs (&refd, gimple_vuse (stmt), mark_modified,
+ &modified, &parm_ainfo->visited_statements);
+  if (modified)
+{
+  parm_ainfo->modified = true;
+  return true;
+}
+  return false;
+}
+
+/* If STMT is an assignment that loads a value from an parameter declaration,
+   return the index of the parameter in ipa_node_params which has not been
+   modified.  Otherwise return -1.  */
+
+static int
+load_from_unmodified_param (struct ipa_node_params *info,
+   struct param_analysis_info *parms_ainfo,
+   gimple stmt)
+{
+  int index;
+  tree op1;
+
+  if (!gimple_assign_single_p (stmt)
+  || gimple_assign_cast_p (stmt))
+return -1;
+
+  op1 = gimple_assign_rhs1 (stmt);
+  index = ipa_get_param_decl_index (info, op1);
+  if (index < 0
+  || is_parm_modified_before_stmt (&parms_ainfo[index], stmt, op1))
+return -1;
+
+  return index;
+}
 
 /* Given that an actual argument is an SSA_NAME (given in NAME) and is a result
of an assignment statement STMT, try to find out whether NAME can be
described by a (possibly polynomial) pass-through jump-function or an
-   ancestor jump function and if so, write

[PATCH] Mechanical renaming in ipa-prop.c

2011-10-26 Thread Martin Jambor

Hi,

I've worked with the parameter-has-not-been-modified mechanism in
ipa-prop.c after not seeing for quite some time and found the original
"infos" and temporary "parm_infos" confusing.  Thus I at least renamed
all the parm_infos (which are deallocated after the analysis ends and
which have type "struct param_analysis_info *") to parm_ainfos.

Bootstrapped and tested on x86_64-linux without any issues.
OK for trunk?

Thanks,

Martin



2011-10-25  Martin Jambor  

* ipa-prop.c (compute_pass_through_member_ptrs): Rename parm_info
to parm_ainfo.
(ipa_compute_jump_functions_for_edge): Likewise.
(ipa_compute_jump_functions): Likewise.
(ipa_analyze_indirect_call_uses): Likewise.
(ipa_analyze_call_uses): Likewise.
(ipa_analyze_params_uses): Likewise.
(ipa_analyze_node): Likewise.

Index: src/gcc/ipa-prop.c
===
--- src.orig/gcc/ipa-prop.c
+++ src/gcc/ipa-prop.c
@@ -794,7 +794,7 @@ is_parm_modified_before_call (struct par
 
 static bool
 compute_pass_through_member_ptrs (struct ipa_node_params *info,
- struct param_analysis_info *parms_info,
+ struct param_analysis_info *parms_ainfo,
  struct ipa_edge_args *args,
  gimple call)
 {
@@ -813,7 +813,8 @@ compute_pass_through_member_ptrs (struct
  int index = ipa_get_param_decl_index (info, arg);
 
  gcc_assert (index >=0);
- if (!is_parm_modified_before_call (&parms_info[index], call, arg))
+ if (!is_parm_modified_before_call (&parms_ainfo[index], call,
+arg))
{
  struct ipa_jump_func *jfunc = ipa_get_ith_jump_func (args,
   num);
@@ -968,7 +969,7 @@ compute_cst_member_ptr_arguments (struct
to this callsite.  */
 
 static void
-ipa_compute_jump_functions_for_edge (struct param_analysis_info *parms_info,
+ipa_compute_jump_functions_for_edge (struct param_analysis_info *parms_ainfo,
 struct cgraph_edge *cs)
 {
   struct ipa_node_params *info = IPA_NODE_REF (cs->caller);
@@ -985,7 +986,7 @@ ipa_compute_jump_functions_for_edge (str
 
   /* Let's check whether there are any potential member pointers and if so,
  whether we can determine their functions as pass_through.  */
-  if (!compute_pass_through_member_ptrs (info, parms_info, args, call))
+  if (!compute_pass_through_member_ptrs (info, parms_ainfo, args, call))
 return;
 
   /* Finally, let's check whether we actually pass a new constant member
@@ -998,7 +999,7 @@ ipa_compute_jump_functions_for_edge (str
 
 static void
 ipa_compute_jump_functions (struct cgraph_node *node,
-   struct param_analysis_info *parms_info)
+   struct param_analysis_info *parms_ainfo)
 {
   struct cgraph_edge *cs;
 
@@ -1010,11 +1011,11 @@ ipa_compute_jump_functions (struct cgrap
 functions unless they may become known during lto/whopr.  */
   if (!callee->analyzed && !flag_lto)
continue;
-  ipa_compute_jump_functions_for_edge (parms_info, cs);
+  ipa_compute_jump_functions_for_edge (parms_ainfo, cs);
 }
 
   for (cs = node->indirect_calls; cs; cs = cs->next_callee)
-ipa_compute_jump_functions_for_edge (parms_info, cs);
+ipa_compute_jump_functions_for_edge (parms_ainfo, cs);
 }
 
 /* If RHS looks like a rhs of a statement loading pfn from a member
@@ -,7 +1112,7 @@ ipa_note_param_call (struct cgraph_node
 }
 
 /* Analyze the CALL and examine uses of formal parameters of the caller NODE
-   (described by INFO).  PARMS_INFO is a pointer to a vector containing
+   (described by INFO).  PARMS_AINFO is a pointer to a vector containing
intermediate information about each formal parameter.  Currently it checks
whether the call calls a pointer that is a formal parameter and if so, the
parameter is marked with the called flag and an indirect call graph edge
@@ -1170,7 +1171,7 @@ ipa_note_param_call (struct cgraph_node
 static void
 ipa_analyze_indirect_call_uses (struct cgraph_node *node,
struct ipa_node_params *info,
-   struct param_analysis_info *parms_info,
+   struct param_analysis_info *parms_ainfo,
gimple call, tree target)
 {
   gimple def;
@@ -1283,7 +1284,7 @@ ipa_analyze_indirect_call_uses (struct c
 return;
 
   index = ipa_get_param_decl_index (info, rec);
-  if (index >= 0 && !is_parm_modified_before_call (&parms_info[index],
+  if (index >= 0 && !is_parm_modified_before_call (&parms_ainfo[index],
   call, rec))
 ipa_note_param_call (node, index, call);
 
@@ -1347,20 +1

[PATCH, testsuite] Unxfail g++.dg/tree-ssa/pr45605.C

2011-10-26 Thread Martin Jambor

Hi,

I've noticed that a test in g++.dg/tree-ssa/pr45605.C is Xfailed, even
though early FRE now does the devirtualization that it tests for.
True, the current scan happens over the SSA dump (which implies the
front-end was somehow supposed to do it?) but I think that given we do
perform the optimization, checking that we continue to do so is
useful, hence the following change.

Tested by adding to a successful bootstrap and test run on x86_64.
Unless anyone objects, I will commit the patch tomorrow.

Thanks,

Martin

2011-10-25  Martin Jambor  

* g++.dg/tree-ssa/pr45605.C: Scan fre1 dump and unxfail.

Index: src/gcc/testsuite/g++.dg/tree-ssa/pr45605.C
===
--- src.orig/gcc/testsuite/g++.dg/tree-ssa/pr45605.C
+++ src/gcc/testsuite/g++.dg/tree-ssa/pr45605.C
@@ -1,5 +1,5 @@
 /* { dg-do compile } */
-/* { dg-options "-O1 -fdump-tree-ssa" } */
+/* { dg-options "-O1 -fdump-tree-fre1" } */
 extern "C" void abort(); 
 bool destructor_called = false; 
 
@@ -33,5 +33,5 @@ int main() {
 
 
 /* We should devirtualize call to D::Run */
-/* { dg-final { scan-tree-dump-times "D::Run \\(" 1 "ssa" { xfail *-*-* } } } 
*/
-/* { dg-final { cleanup-tree-dump "ssa" } } */
+/* { dg-final { scan-tree-dump-times "D::Run \\(" 1 "fre1" } } */
+/* { dg-final { cleanup-tree-dump "fre1" } } */

Re: [PATCH 3/3, RFC] Fixup COND_EXECs before reload

2011-10-26 Thread Alexander Monakov

This RFC patch implements conversion of COND_EXEC instructions to control flow
for pre-RA selective scheduler.  Something like this is needed to employ
predication support before reload.

Each COND_EXEC is converted separately to a new basic block with the
unconditional variant of the instruction, and a conditional jump around it.

I'm not sure what would be an acceptable approach here.  I'm also not sure
about the recommended way to emit JUMPs.


2011-10-26  Sergey Grechanik  

* sel-sched.c (convert_cond_execs): New.  Use it...
(sel_global_finish): ... here.

diff --git a/gcc/sel-sched.c b/gcc/sel-sched.c
index f5c6f8b..b8f2663 100644
--- a/gcc/sel-sched.c
+++ b/gcc/sel-sched.c
@@ -45,6 +45,7 @@ along with GCC; see the file COPYING3.  If not see
 #include "rtlhooks-def.h"
 #include "output.h"
 #include "emit-rtl.h"
+#include "cfghooks.h"
 
 #ifdef INSN_SCHEDULING
 #include "sel-sched-ir.h"
@@ -7978,6 +7979,60 @@ sel_global_init (void)
   init_hard_regs_data ();
 }
 
+/* Convert cond_execs to jumps before reload.  */
+static void
+convert_cond_execs (void)
+{
+  basic_block bb;
+  rtx insn;
+
+  if (reload_completed)
+return;
+
+  FOR_EACH_BB (bb)
+   /* We don't need the safe variant because we break immediately after
+  removing the current instruction.  */
+FOR_BB_INSNS (bb, insn)
+  if (INSN_P (insn) && GET_CODE (PATTERN (insn)) == COND_EXEC)
+   {
+ rtx jump;
+ rtx cond = COND_EXEC_TEST (PATTERN (insn));
+ rtx rcond = reversed_comparison (cond, GET_MODE (cond));
+ rtx unpredicated = copy_rtx (COND_EXEC_CODE (PATTERN (insn)));
+
+ /* Split bb into BB, NEW_BB, NEXT_BB (in that order).  */
+ edge e1 = split_block (bb, insn);
+ basic_block next_bb = e1->dest;
+ edge e2 = split_block (bb, insn);
+ basic_block new_bb = e2->dest;
+
+ /* Emit conditional jump at the end of bb.  */
+ rtx label = block_label (next_bb);
+
+  /* FIXME  There should be a better way.  */
+ rtx jump_pat
+  = gen_rtx_SET (GET_MODE (pc_rtx), pc_rtx,
+ gen_rtx_IF_THEN_ELSE (GET_MODE (pc_rtx),
+   rcond,
+   gen_rtx_LABEL_REF (VOIDmode,
+  label),
+   pc_rtx));
+
+ make_edge (bb, next_bb, 0);
+ jump = emit_jump_insn_after (jump_pat, BB_END (bb));
+ JUMP_LABEL (jump) = label;
+ LABEL_NUSES (label)++;
+
+ emit_insn_after_noloc (unpredicated, BB_HEAD (new_bb), new_bb);
+
+ delete_insn (insn);
+ break;
+   }
+#ifdef ENABLE_CHECKING
+  verify_flow_info ();
+#endif
+}
+
 /* Free the global data of the scheduler.  */
 static void
 sel_global_finish (void)
@@ -7998,6 +8053,8 @@ sel_global_finish (void)
 
   free_sched_pools ();
   free_dominance_info (CDI_DOMINATORS);
+
+  convert_cond_execs ();
 }
 
 /* Return true when we need to skip selective scheduling.  Used for debugging. 
 */

Re: [PATCH,fortran] Reap dead code

2011-10-26 Thread Steve Kargl

On Sat, Oct 22, 2011 at 01:16:14PM -0700, Steve Kargl wrote:
> The attach patch reaps some code that is now dead
> due to my recent changes for ishftc in check.c.
> Regression tested on i686-*-freebsd.
> 
> 2011-10-22  Steevn G. Kargl  
> 
>   * simplify.c (gfc_simplify_ishftc): Reap dead code.

Here's a revised patch that does 2 things.  First, it adds
a check in gfc_check_nearest that the 2nd argument (if it
is a constant) is not zero.  Second, it reaps a dead code
in several of the simplification functions.  


2011-10-26  Steven G. Kargl  

* check.c (gfc_check_atan_2): Typo in comment.
(gfc_check_nearest): If 's' is constant, check that it is not 0.
* simplify.c (simplify_dshift, gfc_simplify_ibclr, gfc_simplify_ibits,
gfc_simplify_ibset, simplify_shift, gfc_simplify_ishftc,
gfc_simplify_nearest): Remove dead code.

2011-10-26  Steven G. Kargl  

* gfortran.dg/nearest_5.f90: New test.

-- 
Steve
Index: fortran/check.c
===
--- fortran/check.c	(revision 180529)
+++ fortran/check.c	(working copy)
@@ -934,7 +934,7 @@ null_arg:
 gfc_try
 gfc_check_atan_2 (gfc_expr *y, gfc_expr *x)
 {
-  /* gfc_notify_std would be a wast of time as the return value
+  /* gfc_notify_std would be a waste of time as the return value
  is seemingly used only for the generic resolution.  The error
  will be: Too many arguments.  */
   if ((gfc_option.allow_std & GFC_STD_F2008) == 0)
@@ -2710,6 +2710,16 @@ gfc_check_nearest (gfc_expr *x, gfc_expr
   if (type_check (s, 1, BT_REAL) == FAILURE)
 return FAILURE;
 
+  if (s->expr_type == EXPR_CONSTANT)
+{
+  if (mpfr_sgn (s->value.real) == 0)
+	{
+	  gfc_error ("Argument 'S' of NEAREST at %L shall not be zero",
+		 &s->where);
+	  return FAILURE;
+	}
+}
+
   return SUCCESS;
 }
 
Index: fortran/simplify.c
===
--- fortran/simplify.c	(revision 180529)
+++ fortran/simplify.c	(working copy)
@@ -1899,13 +1899,7 @@ simplify_dshift (gfc_expr *arg1, gfc_exp
   k = gfc_validate_kind (BT_INTEGER, arg1->ts.kind, false);
   size = gfc_integer_kinds[k].bit_size;
 
-  if (gfc_extract_int (shiftarg, &shift) != NULL)
-{
-  gfc_error ("Invalid SHIFT argument of DSHIFTL at %L", &shiftarg->where);
-  return &gfc_bad_expr;
-}
-
-  gcc_assert (shift >= 0 && shift <= size);
+  gfc_extract_int (shiftarg, &shift);
 
   /* DSHIFTR(I,J,SHIFT) = DSHIFTL(I,J,SIZE-SHIFT).  */
   if (right)
@@ -2509,21 +2503,10 @@ gfc_simplify_ibclr (gfc_expr *x, gfc_exp
   if (x->expr_type != EXPR_CONSTANT || y->expr_type != EXPR_CONSTANT)
 return NULL;
 
-  if (gfc_extract_int (y, &pos) != NULL || pos < 0)
-{
-  gfc_error ("Invalid second argument of IBCLR at %L", &y->where);
-  return &gfc_bad_expr;
-}
+  gfc_extract_int (y, &pos);
 
   k = gfc_validate_kind (x->ts.type, x->ts.kind, false);
 
-  if (pos >= gfc_integer_kinds[k].bit_size)
-{
-  gfc_error ("Second argument of IBCLR exceeds bit size at %L",
-		 &y->where);
-  return &gfc_bad_expr;
-}
-
   result = gfc_copy_expr (x);
 
   convert_mpz_to_unsigned (result->value.integer,
@@ -2551,17 +2534,8 @@ gfc_simplify_ibits (gfc_expr *x, gfc_exp
   || z->expr_type != EXPR_CONSTANT)
 return NULL;
 
-  if (gfc_extract_int (y, &pos) != NULL || pos < 0)
-{
-  gfc_error ("Invalid second argument of IBITS at %L", &y->where);
-  return &gfc_bad_expr;
-}
-
-  if (gfc_extract_int (z, &len) != NULL || len < 0)
-{
-  gfc_error ("Invalid third argument of IBITS at %L", &z->where);
-  return &gfc_bad_expr;
-}
+  gfc_extract_int (y, &pos);
+  gfc_extract_int (z, &len);
 
   k = gfc_validate_kind (BT_INTEGER, x->ts.kind, false);
 
@@ -2614,21 +2588,10 @@ gfc_simplify_ibset (gfc_expr *x, gfc_exp
   if (x->expr_type != EXPR_CONSTANT || y->expr_type != EXPR_CONSTANT)
 return NULL;
 
-  if (gfc_extract_int (y, &pos) != NULL || pos < 0)
-{
-  gfc_error ("Invalid second argument of IBSET at %L", &y->where);
-  return &gfc_bad_expr;
-}
+  gfc_extract_int (y, &pos);
 
   k = gfc_validate_kind (x->ts.type, x->ts.kind, false);
 
-  if (pos >= gfc_integer_kinds[k].bit_size)
-{
-  gfc_error ("Second argument of IBSET exceeds bit size at %L",
-		 &y->where);
-  return &gfc_bad_expr;
-}
-
   result = gfc_copy_expr (x);
 
   convert_mpz_to_unsigned (result->value.integer,
@@ -3004,11 +2967,8 @@ simplify_shift (gfc_expr *e, gfc_expr *s
 
   if (e->expr_type != EXPR_CONSTANT || s->expr_type != EXPR_CONSTANT)
 return NULL;
-  if (gfc_extract_int (s, &shift) != NULL)
-{
-  gfc_error ("Invalid second argument of %s at %L", name, &s->where);
-  return &gfc_bad_expr;
-}
+
+  gfc_extract_int (s, &shift);
 
   k = gfc_validate_kind (BT_INTEGER, e->ts.kind, false);
   bitsize = gfc_integer_kinds[k].bit_size;
@@ -3146,11 +3106,7 @@ gfc_simplify_ishftc (gfc_expr *e, gfc_ex

Re: [PATCH 2/3] Predication support

2011-10-26 Thread Alexander Monakov

This patch contains the implementation of predication support.

The selective scheduler is adjusted to recognize COND_EXEC instructions.  They
can be renamed like normal assignments.  Selective scheduler itself will add
predicated variants of instructions in availability sets when computing them
on CFG splits, using the predicate of the conditional jump.

Converting an instruction to a predicated form is treated similar to other
transformations (speculation, substitition or renaming) in
{undo,redo}_transformation.  Predication cache is maintained to quickly
retrieve predicated variants of instructions.

In particular, predication support  allows to pipeline non-doloop loops on
ia64 more cleanly, as no recovery code for speculative loads is needed.  Also,
it allows to pipeline loads on arm.

2011-10-26  Alexander Monakov  

* common.opt: Add -fsel-sched-predication option.
* config/ia64/ia64.c (get_mode_no_for_insn): Support conditional loads.
* rtl.h (COND_SET_SRC_PTR, COND_SET_SRC_PTR): New macros.
* sched-deps.c (conditions_mutex_with_rev_p): Rename from
conditions_mutex_p.  Adjust the caller.
(conditions_mutex_p, conditions_same_p,
conditions_same_or_mutex_p): New functions.
(sched_analyze_insn): Extract LHS/RHS for conditional assignments.
* sched-int.h (conditions_mutex_p, conditions_same_p,
conditions_same_or_mutex_p): Declare.
* sel-sched-ir.c (vinsn_equal_p): Support conditional insns.
(vinsn_equal_skip_cond_p): New function.
(init_expr): Add new argument (was_predicated).  Update all callers. 
(merge_expr_data): Assert equality of merged predicates.
(av_set_add_nocopy, join_distinct_sets): Export.
(av_set_tail, av_set_concat): New functions.
(av_set_union_and_live): Add parameter to_fallthru_p, generate
predicated insns.
(setup_id_lhs_rhs): Support conditional insns.
(setup_id_cond): New function.
(init_id_from_df, deps_init_id): Use it.
(get_dest_and_mode): Support conditional insns.
* sel-sched-ir.h (enum local_trans_type): Add TRANS_PREDICATION.
(struct _expr): Add new member (was_predicated).
(EXPR_COND, EXPR_WAS_PREDICATED): New accessor macros.
(struct idata_def): Add new member (cond).
(IDATA_COND, VINSN_COND, INSN_COND): New accessor macros.
(vinsn_equal_skip_cond_p, maybe_predicate_expr_into,
av_set_add_nocopy, av_set_intersect, join_distinct_sets,
av_set_concat): Declare.
(av_set_union_and_live): Change prototype.
(in_fallthru_bb_p): Add new parameter (skip_nops_p).  Update all
callers.
(can_substitute_through_p): Do not substitute through conditional
insns.
(substitute_reg_in_expr, replace_src_with_reg_ok_p): Support
substitution in conditional moves.
(create_insn_rtx_with_rhs, replace_dest_with_reg_ok_p,
replace_dest_with_reg_in_expr): Support renaming of conditional moves.
(create_insn_rtx_with_lhs): Delete.
(mark_unavailable_hard_regs, choose_best_reg_1,
choose_best_pseudo_reg, verify_target_availability): Support
conditional insns.
(undo_transformations): Add support for TRANS_PREDICATION.
(redo_transformations): Ditto.
(moveup_expr): Add support for predicated exprs.
(struct predication_cache, predication_cache_hash,
predication_cache_free, predication_cache_eq,
find_in_predication_cache, predicate_expr, maybe_predicate_expr_into,
populate_av_set_with_predicated, filter_av_set_by_predicate): New.
(compute_av_set_at_bb_end): Generate predicated instructions.
(sel_rank_for_schedule): Adjust heuristics.
(vinsn_vec_has_expr_p): Allow inexact match for predicated insns.
(move_op_orig_expr_not_found): Ignore insns with matching LHS if the
condition is mutually exclusive.
(code_motion_process_successors): Add new argument (cond).  Use it to
skip predicated-off paths.  Update the caller.
(av_set_common_cond): New.
(sel_region_init): Initialize predication cache.
(sel_region_finish): Free predication cache.


diff --git a/gcc/common.opt b/gcc/common.opt
index d1f286f..3a7114d 100644
--- a/gcc/common.opt
+++ b/gcc/common.opt
@@ -1695,6 +1695,10 @@ fsel-sched-reschedule-pipelined
 Common Report Var(flag_sel_sched_reschedule_pipelined) Init(0) Optimization
 Reschedule pipelined regions without pipelining
 
+fsel-sched-predication
+Common Report Var(flag_sel_sched_predication) Init(0) Optimization
+Perform predication in selective scheduling
+
 ; sched_stalled_insns means that insns can be moved prematurely from the queue
 ; of stalled insns into the ready list.
 fsched-stalled-insns
diff --git a/gcc/config/ia64/ia64.c b/gcc/config/ia64/ia64.c
index 78d2441..94dbff2 100644
--- a/gcc/config/ia64/ia64.c
+++ b/gcc/config/ia64/

PATCH RFC: Fix cross-compiler include path if not using --with-sysroot

2011-10-26 Thread Ian Lance Taylor

In my earlier patch to add support for --with-native-system-header-dir,
I messed up the case of a cross-compiler when not using --with-sysroot.
In that case the compiler was incorrectly searching /usr/include.  This
patch restores the logic to what it used to be before my patch.  As the
patch is straightforward and simply restores the code to what it used to
be before my earlier patch, I plan to go ahead and commit this after
bootstrap and testing passes.  Please let me know if you object.

Ian


2011-10-26  Ian Lance Taylor  

* cppdefault.c: Undef NATIVE_SYSTEM_HEADER_DIR if
CROSS_DIRECTORY_STRUCTURE is defined and TARGET_SYSTEM_ROOT is
not.
(cpp_include_defaults): Only use NATIVE_SYSTEM_HEADER_DIR if it is
defined.


Index: gcc/cppdefault.c
===
--- gcc/cppdefault.c	(revision 180530)
+++ gcc/cppdefault.c	(working copy)
@@ -32,6 +32,7 @@
 
 #if defined (CROSS_DIRECTORY_STRUCTURE) && !defined (TARGET_SYSTEM_ROOT)
 # undef LOCAL_INCLUDE_DIR
+# undef NATIVE_SYSTEM_HEADER_DIR
 #else
 # undef CROSS_INCLUDE_DIR
 #endif
@@ -84,8 +85,10 @@ const struct default_include cpp_include
 /* Another place the target system's headers might be.  */
 { TOOL_INCLUDE_DIR, "BINUTILS", 0, 1, 0, 0 },
 #endif
+#ifdef NATIVE_SYSTEM_HEADER_DIR
 /* /usr/include comes dead last.  */
 { NATIVE_SYSTEM_HEADER_DIR, NATIVE_SYSTEM_HEADER_COMPONENT, 0, 0, 1, 0 },
+#endif
 { 0, 0, 0, 0, 0, 0 }
   };
 #endif /* no INCLUDE_DEFAULTS */

Re: [C++ Patch / RFC] PR 50870

2011-10-26 Thread Jason Merrill


OK.

Jason

Re: [PATCH] rs6000: Delete the remaining vec_extract expanders.

2011-10-26 Thread Richard Henderson

On 10/25/2011 04:38 PM, Michael Meissner wrote:
> On Tue, Oct 25, 2011 at 03:01:37PM -0700, Richard Henderson wrote:
>> Now that expand_binop handles lowering vec_extract_even to vec_perm,
>> we can remove the last two unnecessary vec_extract patterns from the
>> Altivec backend.
>>
>> Ok?
> 
> Just to be sure, the lowering doesn't depend on -ftree-vectorize (i.e. in case
> people are using the buitlins).  I assume it doesn't.

No, this lowering happens during expansion to rtl, similar to NEG x -> SUB 0, x
and other expansion variants that optabs.c applies.


r~

Re: [cxx-mem-model] Change library call for __atomic_op_fetch routines

2011-10-26 Thread Richard Henderson

On 10/25/2011 06:45 PM, Andrew MacLeod wrote:
>   * builtins.c (expand_builtin_atomic_fetch_op): External calls for
>   'op_fetch' builtins need to instead call 'fetch_op' externals and issue 
>   correction code.
>   (expand_builtin): Provide proper builtin name for external call and
>   ignored flag to expand_builtin_atomic_fetch_op.

Looks ok.

> !resolved to an instriction sequence.  */

"instruction"

> !   gcc_assert (TREE_OPERAND (addr, 0) == fndecl);
> !   TREE_OPERAND (addr, 0) = builtin_decl_explicit(ext_call);

I do wonder if we ought not put the original back after the expand.


r~

Re: [C++ Patch] BASELINK_P clean up

2011-10-26 Thread Jason Merrill


On 10/26/2011 12:41 PM, Paolo Carlini wrote:

we have an inconsistent mix: shall we do the below?


Sure, why not.

Jason

Re: [cxx-mem-model] Change library call for __atomic_op_fetch routines

2011-10-26 Thread Richard Henderson

On 10/26/2011 10:25 AM, Andrew MacLeod wrote:
> hum. I suppose that wouldn't hurt. I don't think it should make any
> difference since Im not modifying whats in the table, but I can give
> it a shot. If it bootstraps, which I presume it will, I'll check it
> in with the original decl replaced.

The other thing that just occurred to me is that we're lying to cgraph
about the functions that wind up being referenced.

But since different function references can come in via libcalls, I
don't know how different this case is.

It'll be interesting to see what happens with LTO enabled here...

r~

Re: [PATCH 1/3] Transformation replay mechanism

2011-10-26 Thread Alexander Monakov

(Note, this is a prerequisite for predication support but also is an
 improvement on its own.  This patch can be installed separately.)

This patch implements transformation replay mechanism (redo_transformations).
Currently, after choosing an expression to schedule and finding its original
expressions below the scheduling fence in move_op, we need to move the
original expressions up (e.g. to correctly generate bookkeeping copies).  That
moving up would call moveup_expr_cached.

The patch slightly optimizes this by performing lookups in the history vector
of the chosen expression, which must contain all transformations of all
original expressions.  Not much changes are needed: the most notable is
perhaps that for speculation transformation, both old and new speculative
statuses are saved.

This patch is needed by the following predication patch, which uses
redo_transformations to correctly apply inverted/non-inverted predicate on
jumps without needing to remember the branch from which the expression
originated.

2011-10-26  Alexander Monakov  

* sel-sched-ir.c (free_history_vect): Export.
(insert_in_history_vect): Rename argument spec_ds to old_spec_ds, add
new argument (new_spec_ds).  Record it in the history vector.  Update
all callers.
(init_expr): Assert that initial history is empty.  Simplify code.
(copy_history_of_changes): Split out from copy_expr.
* sel-sched-ir.h (struct expr_history_def_1): Rename spec_ds to
old_spec_ds.  Add new member (new_spec_ds).
(copy_history_of_changes, free_history_vect): Declare.
(insert_in_history_vect): Adjust prototype.
* sel-sched.c (undo_transformations): Adjust.
(move_exprs_to_boundary): Discard history of expression to be
scheduled.
(move_op_orig_expr_found): Populate history vector of c_expr for use
in redo_transformations.
(redo_transformations): New.
(move_op_ascend): Use redo_transformations.

diff --git a/gcc/sel-sched-ir.c b/gcc/sel-sched-ir.c
index 1bbc3ed..1a73308 100644
--- a/gcc/sel-sched-ir.c
+++ b/gcc/sel-sched-ir.c
@@ -149,7 +149,6 @@ static expr_t set_insn_init (expr_t, vinsn_t, int);
 
 static void cfg_preds (basic_block, insn_t **, int *);
 static void prepare_insn_expr (insn_t, int);
-static void free_history_vect (VEC (expr_history_def, heap) **);
 
 static void move_bb_info (basic_block, basic_block);
 static void remove_empty_bb (basic_block, bool);
@@ -1503,13 +1502,13 @@ find_in_history_vect (VEC(expr_history_def, heap) 
*vect, rtx insn,
 
 /* Insert new element in a sorted history vector pointed to by PVECT,
if it is not there already.  The element is searched using
-   UID/NEW_EXPR_VINSN pair.  TYPE, OLD_EXPR_VINSN and SPEC_DS save
-   the history of a transformation.  */
+   UID/NEW_EXPR_VINSN pair.  TYPE, OLD_EXPR_VINSN, OLD_SPEC_DS and
+   NEW_SPEC_DS save the history of a transformation.  */
 void
 insert_in_history_vect (VEC (expr_history_def, heap) **pvect,
 unsigned uid, enum local_trans_type type,
 vinsn_t old_expr_vinsn, vinsn_t new_expr_vinsn,
-ds_t spec_ds)
+ds_t old_spec_ds, ds_t new_spec_ds)
 {
   VEC(expr_history_def, heap) *vect = *pvect;
   expr_history_def temp;
@@ -1525,15 +1524,16 @@ insert_in_history_vect (VEC (expr_history_def, heap) 
**pvect,
   /* It is possible that speculation types of expressions that were
  propagated through different paths will be different here.  In this
  case, merge the status to get the correct check later.  */
-  if (phist->spec_ds != spec_ds)
-phist->spec_ds = ds_max_merge (phist->spec_ds, spec_ds);
+  if (phist->old_spec_ds != old_spec_ds)
+phist->old_spec_ds = ds_max_merge (phist->old_spec_ds, old_spec_ds);
   return;
 }
 
   temp.uid = uid;
   temp.old_expr_vinsn = old_expr_vinsn;
   temp.new_expr_vinsn = new_expr_vinsn;
-  temp.spec_ds = spec_ds;
+  temp.old_spec_ds = old_spec_ds;
+  temp.new_spec_ds = new_spec_ds;
   temp.type = type;
 
   vinsn_attach (old_expr_vinsn);
@@ -1543,7 +1543,7 @@ insert_in_history_vect (VEC (expr_history_def, heap) 
**pvect,
 }
 
 /* Free history vector PVECT.  */
-static void
+void
 free_history_vect (VEC (expr_history_def, heap) **pvect)
 {
   unsigned i;
@@ -1576,7 +1576,7 @@ merge_history_vect (VEC (expr_history_def, heap) **pvect,
   for (i = 0; VEC_iterate (expr_history_def, from, i, phist); i++)
 insert_in_history_vect (pvect, phist->uid, phist->type,
 phist->old_expr_vinsn, phist->new_expr_vinsn,
-phist->spec_ds);
+phist->old_spec_ds, phist->new_spec_ds);
 }
 
 /* Compare two vinsns as rhses if possible and as vinsns otherwise.  */
@@ -1632,10 +1632,8 @@ init_expr (expr_t expr, vinsn_t vi, int spec, int use, 
int priority,
   EXPR_SPEC_DONE_DS (expr) = spec_done_ds;
   EXPR_SPE

Re: [cxx-mem-model] Change library call for __atomic_op_fetch routines

2011-10-26 Thread Andrew MacLeod


On 10/26/2011 01:19 PM, Richard Henderson wrote:



!resolved to an instriction sequence.  */

"instruction"

no no, its a mashup of "restricted instruction".
ok, maybe not.

!   gcc_assert (TREE_OPERAND (addr, 0) == fndecl);
!   TREE_OPERAND (addr, 0) = builtin_decl_explicit(ext_call);

I do wonder if we ought not put the original back after the expand.
hum. I suppose that wouldn't hurt.  I don't think it should make any 
difference since Im not modifying whats in the table, but I can give it 
a shot. If it bootstraps, which I presume it will, I'll  check it in 
with the original decl replaced.


Andrew

[PATCH 0/3] Predication support for selective scheduler

2011-10-26 Thread Alexander Monakov

Hello,

This patch series implements predication support for the selective scheduler.
It is implemented as a transformation similar to speculative execution, but
simpler in some aspects (e.g. does not require recovery code).  On ia64,
combining speculation and conditional execution is supported.  Combining of 
predicates is not supported.

The patches have been bootstrapped and regtested on x86_64, ia64 and arm with
selective scheduling (+ pipelining and predication) enabled at -O2.  OK for
trunk?

Alexander

[C++ Patch] BASELINK_P clean up

2011-10-26 Thread Paolo Carlini


Hi,

we have an inconsistent mix: shall we do the below? Test in progress on 
x86_64-linux.


Thanks,
Paolo.

//
2011-10-26  Paolo Carlini  

* typeck.c (cp_build_addr_expr_1): Use BASELINK_P.
* class.c (instantiate_type): Likewise.
* pt.c (convert_nontype_argument_function, uses_template_parms,
tsubst_copy, resolve_nondeduced_context, type_dependent_expression_p):
Likewise.
* semantics.c (finish_decltype_type): Likewise.
* decl2.c (mark_used): Likewise.
* name-lookup.c (arg_assoc): Likewise.

Index: typeck.c
===
--- typeck.c(revision 180528)
+++ typeck.c(working copy)
@@ -4946,7 +4946,7 @@ cp_build_addr_expr_1 (tree arg, bool strict_lvalue
   if (TREE_CODE (arg) == OFFSET_REF)
PTRMEM_OK_P (val) = PTRMEM_OK_P (arg);
 }
-  else if (TREE_CODE (TREE_OPERAND (arg, 1)) == BASELINK)
+  else if (BASELINK_P (TREE_OPERAND (arg, 1)))
 {
   tree fn = BASELINK_FUNCTIONS (TREE_OPERAND (arg, 1));
 
Index: class.c
===
--- class.c (revision 180528)
+++ class.c (working copy)
@@ -6873,7 +6873,7 @@ instantiate_type (tree lhstype, tree rhs, tsubst_f
}
 }
 
-  if (TREE_CODE (rhs) == BASELINK)
+  if (BASELINK_P (rhs))
 {
   access_path = BASELINK_ACCESS_BINFO (rhs);
   rhs = BASELINK_FUNCTIONS (rhs);
Index: pt.c
===
--- pt.c(revision 180528)
+++ pt.c(working copy)
@@ -5277,7 +5277,7 @@ convert_nontype_argument_function (tree type, tree
   fn_no_ptr = fn;
   if (TREE_CODE (fn_no_ptr) == ADDR_EXPR)
 fn_no_ptr = TREE_OPERAND (fn_no_ptr, 0);
-  if (TREE_CODE (fn_no_ptr) == BASELINK)
+  if (BASELINK_P (fn_no_ptr))
 fn_no_ptr = BASELINK_FUNCTIONS (fn_no_ptr);
  
   /* [temp.arg.nontype]/1
@@ -7801,7 +7801,7 @@ uses_template_parms (tree t)
   || EXPR_P (t)
   || TREE_CODE (t) == TEMPLATE_PARM_INDEX
   || TREE_CODE (t) == OVERLOAD
-  || TREE_CODE (t) == BASELINK
+  || BASELINK_P (t)
   || TREE_CODE (t) == IDENTIFIER_NODE
   || TREE_CODE (t) == TRAIT_EXPR
   || TREE_CODE (t) == CONSTRUCTOR
@@ -11993,7 +11993,7 @@ tsubst_copy (tree t, tree args, tsubst_flags_t com
 base, name,
 /*template_p=*/false);
  }
-   else if (TREE_CODE (name) == BASELINK)
+   else if (BASELINK_P (name))
  name = tsubst_baselink (name,
  non_reference (TREE_TYPE (object)),
  args, complain,
@@ -15197,7 +15197,7 @@ resolve_nondeduced_context (tree orig_expr)
   offset = expr;
   expr = TREE_OPERAND (expr, 1);
 }
-  if (TREE_CODE (expr) == BASELINK)
+  if (BASELINK_P (expr))
 {
   baselink = expr;
   expr = BASELINK_FUNCTIONS (expr);
@@ -19314,7 +19314,7 @@ type_dependent_expression_p (tree expression)
   if (TREE_CODE (expression) == SCOPE_REF)
return false;
 
-  if (TREE_CODE (expression) == BASELINK)
+  if (BASELINK_P (expression))
expression = BASELINK_FUNCTIONS (expression);
 
   if (TREE_CODE (expression) == TEMPLATE_ID_EXPR)
Index: semantics.c
===
--- semantics.c (revision 180528)
+++ semantics.c (working copy)
@@ -5115,7 +5115,7 @@ finish_decltype_type (tree expr, bool id_expressio
step.  */
 expr = TREE_OPERAND (expr, 1);
 
-  if (TREE_CODE (expr) == BASELINK)
+  if (BASELINK_P (expr))
 /* See through BASELINK nodes to the underlying function.  */
 expr = BASELINK_FUNCTIONS (expr);
 
Index: decl2.c
===
--- decl2.c (revision 180528)
+++ decl2.c (working copy)
@@ -4174,7 +4174,7 @@ mark_used (tree decl)
  like the DECL for the function.  Otherwise, if the BASELINK is
  for an overloaded function, we don't know which function was
  actually used until after overload resolution.  */
-  if (TREE_CODE (decl) == BASELINK)
+  if (BASELINK_P (decl))
 {
   decl = BASELINK_FUNCTIONS (decl);
   if (really_overloaded_fn (decl))
Index: name-lookup.c
===
--- name-lookup.c   (revision 180528)
+++ name-lookup.c   (working copy)
@@ -5329,7 +5329,7 @@ arg_assoc (struct arg_lookup *k, tree n)
 n = TREE_OPERAND (n, 1);
   while (TREE_CODE (n) == TREE_LIST)
 n = TREE_VALUE (n);
-  if (TREE_CODE (n) == BASELINK)
+  if (BASELINK_P (n))
 n = BASELINK_FUNCTIONS (n);
 
   if (TREE_CODE (n) == FUNCTION_DECL)

Re: new patches using -fopt-info (issue5294043)

2011-10-26 Thread Xinliang David Li

I am hoping that too:) Yes, I will try to do it when I find some time.

David

On Wed, Oct 26, 2011 at 1:37 AM, Richard Guenther
 wrote:
> On Tue, Oct 25, 2011 at 9:30 PM, Xinliang David Li  wrote:
>>
>>
>> On Tue, Oct 25, 2011 at 1:02 AM, Richard Guenther
>>  wrote:
>>>
>>> On Mon, Oct 24, 2011 at 6:27 PM, Xinliang David Li 
>>> wrote:
>>> >> Well, you seem to keep not reading what I write.  I am not opposed
>>> >> to adding -fopt-info/report nor to funnel messages to stdout/err.  What
>>> >> I am opposed is the way you want to introduce them.  I want you to
>>> >> fix what we dump into dump files, so that both -fopt-report and
>>> >> -fopt-info
>>> >> can be implemented by outputting selected pieces of the dump file
>>> >> to stdout/stderr.  We already have -fdump-*-stats which supposedly
>>> >> could match -fopt-report, and the default -fdump-* should be what
>>> >> goes to -fopt-info (minus the function bodies, of course).
>>> >
>>> > That sounds good. What you propose seems like
>>> >
>>> > -fdump-pass-[ir_only|transformation|debug]-stderr
>>> >
>>> > and -fopt-info is a short cut for
>>> > -fdump-tree-all-transformations-stderr
>>> > -fdump-ipa-all-tranformations-stderr
>>> > -fdump-rtl-all-transformations-stderr
>>>
>>> Yes.  Note that I don't like it the way the vectorizer does (with
>>> -fvectorizer-verbose=... the dump files are empty).  The dump
>>> file content should be unchanged when redirecting (parts) to
>>> stderr, so we have to arrange to duplicate messages in two places.
>>>
>> Vectorizer dump is a good candidate for clean up when the dumping
>> infrastructure improvement is done.
>> FYI, for now, we will implement the opt-info for some passes in the simple
>> way in google branches, and later migrate to trunk when the dumping
>> infrastructure is improved.
>
> I was hoping you would volunteer to improve the dumping infrastructure.
>
> Richard.
>

Re: [PATCH 0/6] More vector permutation work

2011-10-26 Thread Ulrich Weigand

Richard Henderson wrote:
> On 10/26/2011 07:30 AM, Ulrich Weigand wrote:
> > This fails since for u == 4 and mode == V4SFmode it attempts to expand
> > a V4SFmode shift, which is unsupported.
> > 
> > Shouldn't this be using the mode of the selector rather than the mode
> > of the result in any case?
> 
> Yes, it should use the mode of the selector.  And doing just that is
> enough to fix the bug.  But I noticed that the actual results for a
> constant permutation were much nastier than they ought to be.
> 
> Try this.

Thanks, this has indeed fixed the build problem.  Full test suite is
still running ...

Bye,
Ulrich

-- 
  Dr. Ulrich Weigand
  GNU Toolchain for Linux on System z and Cell BE
  ulrich.weig...@de.ibm.com

Re: [PATCH 0/6] More vector permutation work

2011-10-26 Thread Richard Henderson

On 10/26/2011 07:30 AM, Ulrich Weigand wrote:
> This fails since for u == 4 and mode == V4SFmode it attempts to expand
> a V4SFmode shift, which is unsupported.
> 
> Shouldn't this be using the mode of the selector rather than the mode
> of the result in any case?

Yes, it should use the mode of the selector.  And doing just that is
enough to fix the bug.  But I noticed that the actual results for a
constant permutation were much nastier than they ought to be.

Try this.

Iain, this might solve your case too; no I can't test myself off cross,
because libgfortran needs the whole cross-env -- system headers and everything.


r~
diff --git a/gcc/optabs.c b/gcc/optabs.c
index 9afc911..736d826 100644
--- a/gcc/optabs.c
+++ b/gcc/optabs.c
@@ -6912,7 +6912,7 @@ expand_vec_perm (enum machine_mode mode, rtx v0, rtx v1, 
rtx sel, rtx target)
   enum insn_code icode;
   enum machine_mode qimode;
   unsigned int i, w, e, u;
-  rtx tmp, sel_qi;
+  rtx tmp, sel_qi = NULL;
   rtvec vec;
 
   if (!target || GET_MODE (target) != mode)
@@ -6946,23 +6946,23 @@ expand_vec_perm (enum machine_mode mode, rtx v0, rtx 
v1, rtx sel, rtx target)
   /* Fall back to a constant byte-based permutation.  */
   if (qimode != VOIDmode)
{
- icode = direct_optab_handler (vec_perm_const_optab, qimode);
- if (icode != CODE_FOR_nothing)
+ vec = rtvec_alloc (w);
+ for (i = 0; i < e; ++i)
{
- vec = rtvec_alloc (w);
- for (i = 0; i < e; ++i)
-   {
- unsigned int j, this_e;
+ unsigned int j, this_e;
 
- this_e = INTVAL (XVECEXP (sel, 0, i));
- this_e &= 2 * e - 1;
- this_e *= u;
+ this_e = INTVAL (XVECEXP (sel, 0, i));
+ this_e &= 2 * e - 1;
+ this_e *= u;
 
- for (j = 0; j < u; ++j)
-   RTVEC_ELT (vec, i * u + j) = GEN_INT (this_e + j);
-   }
- sel_qi = gen_rtx_CONST_VECTOR (qimode, vec);
+ for (j = 0; j < u; ++j)
+   RTVEC_ELT (vec, i * u + j) = GEN_INT (this_e + j);
+   }
+ sel_qi = gen_rtx_CONST_VECTOR (qimode, vec);
 
+ icode = direct_optab_handler (vec_perm_const_optab, qimode);
+ if (icode != CODE_FOR_nothing)
+   {
  tmp = expand_vec_perm_1 (icode, gen_lowpart (qimode, target),
   gen_lowpart (qimode, v0),
   gen_lowpart (qimode, v1), sel_qi);
@@ -6989,47 +6989,53 @@ expand_vec_perm (enum machine_mode mode, rtx v0, rtx 
v1, rtx sel, rtx target)
   if (icode == CODE_FOR_nothing)
 return NULL_RTX;
 
-  /* Multiply each element by its byte size.  */
-  if (u == 2)
-sel = expand_simple_binop (mode, PLUS, sel, sel, sel, 0, OPTAB_DIRECT);
-  else
-sel = expand_simple_binop (mode, ASHIFT, sel, GEN_INT (exact_log2 (u)),
-  sel, 0, OPTAB_DIRECT);
-  gcc_assert (sel != NULL);
-
-  /* Broadcast the low byte each element into each of its bytes.  */
-  vec = rtvec_alloc (w);
-  for (i = 0; i < w; ++i)
-{
-  int this_e = i / u * u;
-  if (BYTES_BIG_ENDIAN)
-   this_e += u - 1;
-  RTVEC_ELT (vec, i) = GEN_INT (this_e);
-}
-  tmp = gen_rtx_CONST_VECTOR (qimode, vec);
-  sel = gen_lowpart (qimode, sel);
-  sel = expand_vec_perm (qimode, sel, sel, tmp, NULL);
-  gcc_assert (sel != NULL);
-
-  /* Add the byte offset to each byte element.  */
-  /* Note that the definition of the indicies here is memory ordering,
- so there should be no difference between big and little endian.  */
-  vec = rtvec_alloc (w);
-  for (i = 0; i < w; ++i)
-RTVEC_ELT (vec, i) = GEN_INT (i % u);
-  tmp = gen_rtx_CONST_VECTOR (qimode, vec);
-  sel = expand_simple_binop (qimode, PLUS, sel, tmp, sel, 0, OPTAB_DIRECT);
-  gcc_assert (sel != NULL);
+  if (sel_qi == NULL)
+{
+  /* Multiply each element by its byte size.  */
+  enum machine_mode selmode = GET_MODE (sel);
+  if (u == 2)
+   sel = expand_simple_binop (selmode, PLUS, sel, sel,
+  sel, 0, OPTAB_DIRECT);
+  else
+   sel = expand_simple_binop (selmode, ASHIFT, sel,
+  GEN_INT (exact_log2 (u)),
+  sel, 0, OPTAB_DIRECT);
+  gcc_assert (sel != NULL);
+
+  /* Broadcast the low byte each element into each of its bytes.  */
+  vec = rtvec_alloc (w);
+  for (i = 0; i < w; ++i)
+   {
+ int this_e = i / u * u;
+ if (BYTES_BIG_ENDIAN)
+   this_e += u - 1;
+ RTVEC_ELT (vec, i) = GEN_INT (this_e);
+   }
+  tmp = gen_rtx_CONST_VECTOR (qimode, vec);
+  sel = gen_lowpart (qimode, sel);
+  sel = expand_vec_perm (qimode, sel, sel, tmp, NULL);
+  gcc_assert (sel != NULL);
+
+  /* Add the byte offset to each byte element.  */
+  /* Note that the definition of the indicies h

Re: [C++ Patch / RFC] PR 50870

2011-10-26 Thread Paolo Carlini


On 10/26/2011 05:27 PM, Jason Merrill wrote:

On 10/26/2011 10:39 AM, Paolo Carlini wrote:

I'm trying to fix this PR, ice on valid, which Daniel kindly filed while
we were triaging PR50864. In short, in tsubst_copy_and_build, for
COMPONENT_REF, we call tsubst_baselink with an object which in this case
is an ARROW_EXPR, thus its TREE_TYPE is NULL_TREE.


That's OK, we can pass a null type to tsubst_baselink.  Just fix 
non_reference to pass NULL_TREE through.
Ah, very good (just read the comment preceding tsubst_baselink, should 
have done it earlier today ;) Thus I'm finishing testing the below, Ok 
if it passes?


Thanks,
Paolo.

/

/cp
2011-10-26  Paolo Carlini  

PR c++/50870
* typeck.c (non_reference): Pass NULL_TREE through.

/testsuite
2011-10-26  Paolo Carlini  

PR c++/50870
* g++.dg/cpp0x/decltype34.C: New.
Index: testsuite/g++.dg/cpp0x/decltype34.C
===
--- testsuite/g++.dg/cpp0x/decltype34.C (revision 0)
+++ testsuite/g++.dg/cpp0x/decltype34.C (revision 0)
@@ -0,0 +1,19 @@
+// PR c++/50870
+// { dg-options "-std=gnu++0x" }
+
+struct impl
+{
+  template  static T create();
+};
+
+template()->impl::create())>
+struct tester{};
+
+tester ti;
+
+template()->impl::create())>
+int test() { return 0; }
+
+int i = test();
Index: cp/typeck.c
===
--- cp/typeck.c (revision 180528)
+++ cp/typeck.c (working copy)
@@ -8322,7 +8322,7 @@ casts_away_constness (tree t1, tree t2)
 tree
 non_reference (tree t)
 {
-  if (TREE_CODE (t) == REFERENCE_TYPE)
+  if (t && TREE_CODE (t) == REFERENCE_TYPE)
 t = TREE_TYPE (t);
   return t;
 }

Re: [patch tree-optimization]: Improve handling of conditional-branches on targets with high branch costs

2011-10-26 Thread Michael Matz

Hi,

On Wed, 26 Oct 2011, Kai Tietz wrote:

> So you would mean that memory dereferencing shouldn't be considered as 
> side-effect at all?

No.  I haven't said this at all.  Of course it's a side-effect, but we're 
allowed to remove existing ones (under some circumstances).  We're not 
allowed to introduce new ones, which means that this ...

> So we would happily cause by code 'if (i && *i != 0) an crash, as 
> memory-dereference has for you no side-effect?

... is not allowed.  But in the original example the memread was on the 
left side, hence occured always, therefore we can move it to the right 
side, even though it might occur less often.

> In you special case it might be valid that, if first (and C-fold-const 
> doesn't know if the side-effect condition is really the first, as it 
> might be a sub-sequence of a condition) condition might trap or not, to 
> combine it.  But branching has to cover the general cases.  If you find 
> a way to determine that left-hand operand in fold_const's branching code 
> is really the left-most condition in chain, then we can add such a 
> special case, but I don't see here an easy way to determine it.

Hmm?  I don't see why it's necessary to check if it's the left-most 
condition in a chain.  If the left hand of '&&' is a memread it can always 
be moved to the right side (or the operator transformed into '&' which can 
have the same effect), of course only if the original rhs is free of side 
effects, but then independed if the && was part of a larger expression.  
The memread will possibly be done fewer times than originally, but as 
said, that's okay.

Ciao,
Michael.

Re: Go patch committed: Implement new syscall package

2011-10-26 Thread Ian Lance Taylor

Rainer Orth  writes:

> the problem is another one: using /usr/xpg4/bin/awk, I find:
>
> /usr/xpg4/bin/awk: line 47 (NR=32): wrong number of arguments to function "m"
>
> nawk(1) only documents match(s,ere) (i.e. two args), and the gawk docs
> state:
>
> `match(STRING, REGEXP [, ARRAY])'
> [...]
>  The ARRAY argument to `match' is a `gawk' extension.  In
>  compatibility mode (*note Options::), using a third argument is a
>  fatal error.

Ah.  I took a look at the nawk man page, and I don't see any evidence
that it supports submatches at all.  How annoying.

Ian

Re: [C++ Patch / RFC] PR 50870

2011-10-26 Thread Jason Merrill


On 10/26/2011 10:39 AM, Paolo Carlini wrote:

I'm trying to fix this PR, ice on valid, which Daniel kindly filed while
we were triaging PR50864. In short, in tsubst_copy_and_build, for
COMPONENT_REF, we call tsubst_baselink with an object which in this case
is an ARROW_EXPR, thus its TREE_TYPE is NULL_TREE.


That's OK, we can pass a null type to tsubst_baselink.  Just fix 
non_reference to pass NULL_TREE through.


Jason

Re: [patch tree-optimization]: Improve handling of conditional-branches on targets with high branch costs

2011-10-26 Thread Kai Tietz

2011/10/26 Michael Matz :
> Hi,
>
> On Wed, 26 Oct 2011, Kai Tietz wrote:
>
>> well, if such a function is used as inline and we know for it that j has
>> value != 2, then we have here a big difference.  For your first example,
>> we still have to do the memory access to *i, even if we are not
>> interested in result.
>
> Actually we don't have to preserve memory accesses.  The interesting case
> is if the pointer has an invalid value.  The behaviour of the access then
> is undefined, and it's okay to not do it at all.  In case the pointer does
> point to an object the access (if it's value isn't needed) also isn't
> necessary.  IOW: in "void f(int *p) { int i = *p; }" we can always remove
> the pointer read.  So, I still maintain that the transformation on the
> original example was okay.
>
>
> Ciao,
> Michael.

So you would mean that memory dereferencing shouldn't be considered as
side-effect at all?

So we would happily cause by code 'if (i && *i != 0) an crash, as
memory-dereference has for you no side-effect?

In you special case it might be valid that, if first (and C-fold-const
doesn't know if the side-effect condition is really the first, as it
might be a sub-sequence of a condition) condition might trap or not,
to combine it.  But branching has to cover the general cases.  If you
find a way to determine that left-hand operand in fold_const's
branching code is really the left-most condition in chain, then we can
add such a special case, but I don't see here an easy way to determine
it.

Regards,
Kai

Hmm, not sure

Re: Go patch committed: Implement new syscall package

2011-10-26 Thread Rainer Orth

Ian,

> Hmmm, I don't have a copy of nawk.  Is it possible that it doesn't like
> " *" == *?  You could try "[ ]*".

the problem is another one: using /usr/xpg4/bin/awk, I find:

/usr/xpg4/bin/awk: line 47 (NR=32): wrong number of arguments to function "m"

nawk(1) only documents match(s,ere) (i.e. two args), and the gawk docs
state:

`match(STRING, REGEXP [, ARRAY])'
[...]
 The ARRAY argument to `match' is a `gawk' extension.  In
 compatibility mode (*note Options::), using a third argument is a
 fatal error.

Rainer

-- 
-
Rainer Orth, Center for Biotechnology, Bielefeld University

Re: [PATCH][RFC] Re-write LTO option merging

2011-10-26 Thread Joseph S. Myers

On Wed, 26 Oct 2011, Richard Guenther wrote:

> Index: trunk/gcc/opts-global.c
> ===
> *** trunk.orig/gcc/opts-global.c  2011-10-26 13:46:24.0 +0200
> --- trunk/gcc/opts-global.c   2011-10-26 13:48:42.0 +0200
> *** static void
> *** 167,176 
>   post_handling_callback (const struct cl_decoded_option *decoded 
> ATTRIBUTE_UNUSED,
>   unsigned int mask ATTRIBUTE_UNUSED)
>   {
> - #ifdef ENABLE_LTO
> -   lto_register_user_option (decoded->opt_index, decoded->arg,
> - decoded->value, mask);
> - #endif
>   }

It looks like this leaves both implementations of this callback (driver 
and compiler proper) empty, so all support for the post_handling_callback 
can be removed.

-- 
Joseph S. Myers
jos...@codesourcery.com

Re: PowerPC shrink-wrap support 3 of 3

2011-10-26 Thread Alan Modra

On Wed, Oct 26, 2011 at 03:59:36PM +0200, Bernd Schmidt wrote:
> On 10/26/11 15:54, Alan Modra wrote:
> > I guess the tradeoff between the classic shrink-wrap epilogue scheme
> > and my duplicate tail idea is whether duplicating tail blocks adds
> > more code than duplicating epilogues.  From what I've seen, the
> > duplicate tails are generally very small.  I guess I should dump out
> > some info so we can get a better idea.
> 
> I suppose if one wanted to avoid inserting more than one epilogue for
> code-size reasons, one could make a new basic block containing the
> epilogue, and redirect edges as appropriate.

Suppose you have a function that returns r3=0 in one tail block and
r3=1 in another, and these blocks are reached both by paths needing
a prologue and by paths not needing a prologue.  Which seems a likely
common case.  I'm fairly certain that would require two copies of the
normal epilogue, or duplicating the tail blocks.  (But it's late here
and I'm ready to nod off so may not be thinking straight.)

-- 
Alan Modra
Australia Development Lab, IBM

[C++ Patch / RFC] PR 50870

2011-10-26 Thread Paolo Carlini


Hi,

I'm trying to fix this PR, ice on valid, which Daniel kindly filed while 
we were triaging PR50864. In short, in tsubst_copy_and_build, for 
COMPONENT_REF, we call tsubst_baselink with an object which in this case 
is an ARROW_EXPR, thus its TREE_TYPE is NULL_TREE. I'm trying to fix 
this be using the first operand instead, as we normally do for 
ARROW_EXPRs elsewhere. Note that, assuming this makes sense, we probably 
want to audit also case COMPONENT_REF in tsubst_copy...


What do you think?

(Tested already x86_64-linux)

Thanks,
Paolo.

//
/cp
2011-10-26  Paolo Carlini  

PR c++/50870
* pt.c (tsubst_copy_and_build): When object is an ARROW_EXPR
pass to tsubst_baselink its first operand.

/testsuite
2011-10-26  Paolo Carlini  

PR c++/50870
* g++.dg/cpp0x/decltype34.C: New.
Index: testsuite/g++.dg/cpp0x/decltype34.C
===
--- testsuite/g++.dg/cpp0x/decltype34.C (revision 0)
+++ testsuite/g++.dg/cpp0x/decltype34.C (revision 0)
@@ -0,0 +1,19 @@
+// PR c++/50870
+// { dg-options "-std=gnu++0x" }
+
+struct impl
+{
+  template  static T create();
+};
+
+template()->impl::create())>
+struct tester{};
+
+tester ti;
+
+template()->impl::create())>
+int test() { return 0; }
+
+int i = test();
Index: cp/pt.c
===
--- cp/pt.c (revision 180520)
+++ cp/pt.c (working copy)
@@ -13711,7 +13711,9 @@ tsubst_copy_and_build (tree t,
member = TREE_OPERAND (t, 1);
if (BASELINK_P (member))
  member = tsubst_baselink (member,
-   non_reference (TREE_TYPE (object)),
+   TREE_CODE (object) == ARROW_EXPR
+   ? TREE_OPERAND (object, 0)
+   : non_reference (TREE_TYPE (object)),
args, complain, in_decl);
else
  member = tsubst_copy (member, args, complain, in_decl);

Re: [PATCH 0/6] More vector permutation work

2011-10-26 Thread Ulrich Weigand

Richard Henderson wrote:

>   Fix typos in the names of vec_extract & vec_interleave tree codes.
>   Change vec_perm checking and expansion level.
>   Implement interleave via permutation.
>   spu: Remove vec_extract_even/odd and vec_interleave expanders.
>   rs6000: Remove some vec_extract_even/odd expanders.
>   i386: Delete the vec_extract_even/odd patterns.

This patch set causes a build failure on SPU:

/home/uweigand/fsf/gcc-head/libgfortran/generated/matmul_c4.c: In function 
'matmul_c4':
/home/uweigand/fsf/gcc-head/libgfortran/generated/matmul_c4.c:284:25: internal 
compiler error: in expand_vec_perm, at optabs.c:6998

  /* Multiply each element by its byte size.  */
  if (u == 2)
sel = expand_simple_binop (mode, PLUS, sel, sel, sel, 0, OPTAB_DIRECT);
  else
sel = expand_simple_binop (mode, ASHIFT, sel, GEN_INT (exact_log2 (u)),
   sel, 0, OPTAB_DIRECT);
  gcc_assert (sel != NULL);

This fails since for u == 4 and mode == V4SFmode it attempts to expand
a V4SFmode shift, which is unsupported.

Shouldn't this be using the mode of the selector rather than the mode
of the result in any case?

Bye,
Ulrich

-- 
  Dr. Ulrich Weigand
  GNU Toolchain for Linux on System z and Cell BE
  ulrich.weig...@de.ibm.com

Re: [patch tree-optimization]: Improve handling of conditional-branches on targets with high branch costs

2011-10-26 Thread Michael Matz

Hi,

On Wed, 26 Oct 2011, Kai Tietz wrote:

> well, if such a function is used as inline and we know for it that j has 
> value != 2, then we have here a big difference.  For your first example, 
> we still have to do the memory access to *i, even if we are not 
> interested in result.

Actually we don't have to preserve memory accesses.  The interesting case 
is if the pointer has an invalid value.  The behaviour of the access then 
is undefined, and it's okay to not do it at all.  In case the pointer does 
point to an object the access (if it's value isn't needed) also isn't 
necessary.  IOW: in "void f(int *p) { int i = *p; }" we can always remove 
the pointer read.  So, I still maintain that the transformation on the 
original example was okay.

Ciao,
Michael.

Re: [PATCH][ARM] Big Endian and Generic tuning

2011-10-26 Thread Richard Earnshaw

On 26/10/11 14:54, Andrew Stubbs wrote:
> On 25/10/11 15:16, Richard Earnshaw wrote:
>> The difficulty on R profile is that although BE-32 mode is obsolete,
>> there is a config bit that can be set to make the CPU byte-swap the
>> instruction stream to make it behave largely like it is running in BE-32
>> mode.
>>
>> I think really we should be pushing the R-profile code towards having
>> BE-8 as the default; but to do that we really need a compiler option
>> that can override this behaviour (probably -mbe-[8|32]).  Once we have
>> that option, we can fix the compiler to prefer BE-8 as there is then a
>> way to get back the legacy behaviour.
> 
> Here's an updated patch that makes no generalizations.
> 
> OK?
> 

Yep

R.

> Andrew
> 
> 
> be.patch
> 
> 
> 2011-10-26  Andrew Stubbs  
> 
>   gcc/
>   * config/arm/bpabi.h (BE8_LINK_SPEC): Recognize generic-armv7 tuning.
> 
> --- a/gcc/config/arm/bpabi.h
> +++ b/gcc/config/arm/bpabi.h
> @@ -58,6 +58,7 @@
>  #define BE8_LINK_SPEC \
>" %{mbig-endian:%{march=armv7-a|mcpu=cortex-a5 \
> |mcpu=cortex-a8|mcpu=cortex-a9|mcpu=cortex-a15\
> +   |mcpu=generic-armv7-a \
> |march=armv7-m|mcpu=cortex-m3 \
> |march=armv7e-m|mcpu=cortex-m4\
> |march=armv6-m|mcpu=cortex-m0 \

Re: PowerPC shrink-wrap support 3 of 3

2011-10-26 Thread Bernd Schmidt

On 10/26/11 15:54, Alan Modra wrote:
> On Wed, Oct 26, 2011 at 03:01:01PM +0200, Bernd Schmidt wrote:
>> On 10/26/11 14:27, Alan Modra wrote:
>>> Committed revision 180522.  It turns out that shrink-wrapping isn't as
>>> effective as it used to be with the 20110915 based sources I was using
>>> originally.  povray Ray_In_Bound no longer gets the benefit of shrink
>>> wrap, likely due to some cfg optimization.  We end up with a simple
>>> block that just does r3=1 then jumps to last_bb being reached from
>>> blocks that need a prologue as well as blocks that don't.  That's
>>> enough to kill our current shrink wrap implementation.  What we need
>>> is something to duplicate these tail blocks..
>>
>> Would it work to insert the epilogue on some edges to this R3=1 block,
>> and not on the others?
> 
> Wouldn't you need to modify all the target epilogue code?  Our
> epilogues return.

Not all of them at once - you could require that if a target has a
simple_return pattern, the epilogue does not return. But yes, these
kinds of complications are a reason why I went for a simple variant first.

> I guess the tradeoff between the classic shrink-wrap epilogue scheme
> and my duplicate tail idea is whether duplicating tail blocks adds
> more code than duplicating epilogues.  From what I've seen, the
> duplicate tails are generally very small.  I guess I should dump out
> some info so we can get a better idea.

I suppose if one wanted to avoid inserting more than one epilogue for
code-size reasons, one could make a new basic block containing the
epilogue, and redirect edges as appropriate.


Bernd

Re: [PATCH][ARM] Big Endian and Generic tuning

2011-10-26 Thread Andrew Stubbs


On 25/10/11 15:16, Richard Earnshaw wrote:

The difficulty on R profile is that although BE-32 mode is obsolete,
there is a config bit that can be set to make the CPU byte-swap the
instruction stream to make it behave largely like it is running in BE-32
mode.

I think really we should be pushing the R-profile code towards having
BE-8 as the default; but to do that we really need a compiler option
that can override this behaviour (probably -mbe-[8|32]).  Once we have
that option, we can fix the compiler to prefer BE-8 as there is then a
way to get back the legacy behaviour.


Here's an updated patch that makes no generalizations.

OK?

Andrew
2011-10-26  Andrew Stubbs  

	gcc/
	* config/arm/bpabi.h (BE8_LINK_SPEC): Recognize generic-armv7 tuning.

--- a/gcc/config/arm/bpabi.h
+++ b/gcc/config/arm/bpabi.h
@@ -58,6 +58,7 @@
 #define BE8_LINK_SPEC \
   " %{mbig-endian:%{march=armv7-a|mcpu=cortex-a5	\
|mcpu=cortex-a8|mcpu=cortex-a9|mcpu=cortex-a15	\
+   |mcpu=generic-armv7-a\
|march=armv7-m|mcpu=cortex-m3			\
|march=armv7e-m|mcpu=cortex-m4			\
|march=armv6-m|mcpu=cortex-m0			\

Re: PowerPC shrink-wrap support 3 of 3

2011-10-26 Thread Alan Modra

On Wed, Oct 26, 2011 at 03:01:01PM +0200, Bernd Schmidt wrote:
> On 10/26/11 14:27, Alan Modra wrote:
> > Committed revision 180522.  It turns out that shrink-wrapping isn't as
> > effective as it used to be with the 20110915 based sources I was using
> > originally.  povray Ray_In_Bound no longer gets the benefit of shrink
> > wrap, likely due to some cfg optimization.  We end up with a simple
> > block that just does r3=1 then jumps to last_bb being reached from
> > blocks that need a prologue as well as blocks that don't.  That's
> > enough to kill our current shrink wrap implementation.  What we need
> > is something to duplicate these tail blocks..
> 
> Would it work to insert the epilogue on some edges to this R3=1 block,
> and not on the others?

Wouldn't you need to modify all the target epilogue code?  Our
epilogues return.

> (How many edges of each kind are there?)
In the povray case there was one edge of each kind, but I have seen
other cases where there were 4 edges from blocks needing no prologue
and 2 edges from blocks needing a prologue.  I can't tell you what the
testcase was now;  It was something I looked at when ironing out bugs
in my code.  You wouldn't believe how many ways it is possible to
write buggy cfg manipulation code..

I guess the tradeoff between the classic shrink-wrap epilogue scheme
and my duplicate tail idea is whether duplicating tail blocks adds
more code than duplicating epilogues.  From what I've seen, the
duplicate tails are generally very small.  I guess I should dump out
some info so we can get a better idea.

-- 
Alan Modra
Australia Development Lab, IBM

Re: PR rtl-optimization/46603

2011-10-26 Thread Eric Botcazou

> Thanks!

You're welcome.  I have also installed the attached patch which makes minor 
tweaks and fixes various issues in comments which have bugged me for years, 
the most glaring one being:

 Similar issue for (SUBREG:M1 (REG:M2 ...) ...) for a hard register R where
 either M1 is not valid for R or M2 is wider than a word but we only
 need one word to store an M2-sized quantity in R.

which seems to be hinting at some relativistic effect on the size of M2...

No functional changes, tested on i586-suse-linux, applied on the mainline.


2011-10-26  Eric Botcazou  

* reload.c (reload_inner_reg_of_subreg): Change type of return value
and type of OUTPUT parameter to bool and adjust.  Document MODE and
OUTPUT parameters.  Use HARD_REGISTER_P.  Reorder final condition
and improve associated comment.
(push_reload): Clarify and update comments about reloading of subregs.
Adjust calls to reload_inner_reg_of_subreg.  Compute the class upfront
for the reloading of subregs in the out case as well.


-- 
Eric Botcazou
Index: reload.c
===
--- reload.c	(revision 180457)
+++ reload.c	(working copy)
@@ -256,7 +256,6 @@ static int push_secondary_reload (int, r
   enum insn_code *, secondary_reload_info *);
 static enum reg_class find_valid_class (enum machine_mode, enum machine_mode,
 	int, unsigned int);
-static int reload_inner_reg_of_subreg (rtx, enum machine_mode, int);
 static void push_replacement (rtx *, int, enum machine_mode);
 static void dup_replacements (rtx *, rtx *);
 static void combine_reloads (void);
@@ -791,39 +790,39 @@ find_reusable_reload (rtx *p_in, rtx out
   return n_reloads;
 }
 
-/* Return nonzero if X is a SUBREG which will require reloading of its
-   SUBREG_REG expression.  */
+/* Return true if X is a SUBREG that will need reloading of its SUBREG_REG
+   expression.  MODE is the mode that X will be used in.  OUTPUT is true if
+   the function is invoked for the output part of an enclosing reload.  */
 
-static int
-reload_inner_reg_of_subreg (rtx x, enum machine_mode mode, int output)
+static bool
+reload_inner_reg_of_subreg (rtx x, enum machine_mode mode, bool output)
 {
   rtx inner;
 
   /* Only SUBREGs are problematical.  */
   if (GET_CODE (x) != SUBREG)
-return 0;
+return false;
 
   inner = SUBREG_REG (x);
 
-  /* If INNER is a constant or PLUS, then INNER must be reloaded.  */
+  /* If INNER is a constant or PLUS, then INNER will need reloading.  */
   if (CONSTANT_P (inner) || GET_CODE (inner) == PLUS)
-return 1;
+return true;
 
-  /* If INNER is not a hard register, then INNER will not need to
- be reloaded.  */
-  if (!REG_P (inner)
-  || REGNO (inner) >= FIRST_PSEUDO_REGISTER)
-return 0;
+  /* If INNER is not a hard register, then INNER will not need reloading.  */
+  if (!(REG_P (inner) && HARD_REGISTER_P (inner)))
+return false;
 
   /* If INNER is not ok for MODE, then INNER will need reloading.  */
-  if (! HARD_REGNO_MODE_OK (subreg_regno (x), mode))
-return 1;
+  if (!HARD_REGNO_MODE_OK (subreg_regno (x), mode))
+return true;
 
-  /* If the outer part is a word or smaller, INNER larger than a
- word and the number of regs for INNER is not the same as the
- number of words in INNER, then INNER will need reloading.  */
-  return (GET_MODE_SIZE (mode) <= UNITS_PER_WORD
-	  && output
+  /* If this is for an output, and the outer part is a word or smaller,
+ INNER is larger than a word and the number of registers in INNER is
+ not the same as the number of words in INNER, then INNER will need
+ reloading (with an in-out reload).  */
+  return (output
+	  && GET_MODE_SIZE (mode) <= UNITS_PER_WORD
 	  && GET_MODE_SIZE (GET_MODE (inner)) > UNITS_PER_WORD
 	  && ((GET_MODE_SIZE (GET_MODE (inner)) / UNITS_PER_WORD)
 	  != (int) hard_regno_nregs[REGNO (inner)][GET_MODE (inner)]));
@@ -990,9 +989,9 @@ push_reload (rtx in, rtx out, rtx *inloc
  For machines that extend byte loads, do this for any SUBREG of a pseudo
  where both M1 and M2 are a word or smaller, M1 is wider than M2, and
  M2 is an integral mode that gets extended when loaded.
- Similar issue for (SUBREG:M1 (REG:M2 ...) ...) for a hard register R where
- either M1 is not valid for R or M2 is wider than a word but we only
- need one word to store an M2-sized quantity in R.
+ Similar issue for (SUBREG:M1 (REG:M2 ...) ...) for a hard register R
+ where either M1 is not valid for R or M2 is wider than a word but we
+ only need one register to store an M2-sized quantity in R.
  (However, if OUT is nonzero, we need to reload the reg *and*
  the subreg, so do nothing here, and let following statement handle it.)
 
@@ -1082,17 +1081,16 @@ push_reload (rtx in, rtx out, rtx *inloc
   inmode = GET_MODE (in);
 }
 
-  /* Similar issue for (SUBREG:M1 (REG:M2 ...) ...) for a hard regist

Re: [patch tree-optimization]: Improve handling of conditional-branches on targets with high branch costs

2011-10-26 Thread Kai Tietz

2011/10/26 Michael Matz :
> Hi,
>
> On Wed, 26 Oct 2011, Kai Tietz wrote:
>
>> >> > int f(char *i, int j)
>> >> > {
>> >> >        if (*i && j!=2)
>> >> >                return *i;
>> >> >        else
>> >> >                return j;
>> >> > }
>> >>
>>
>> the case can be produced quite easily.
>>
>> extern int global = 0;
>>
>> 
>>   if (*a && global) ...
>
> See?  You had to change the program to prove the transformation to be
> invalid.  But my point was that the function we discuss about was exactly
> as above.  It didn't have globals, or two loads, or a volatile, or
> anything else you can come up with where the transformation would be
> detectable (and hence invalid).  I claim that you can't construct a
> program that can distinguish between this function:
>
> int f(char *i, int j)
> {
>   if (*i && j!=2)
>     return *i;
>   else
>     return j;
> }
>
> and this one:
>
> int f(char *i, int j)
> {
>   if (*i & j!=2)
>     return *i;
>   else
>     return j;
> }
>
> And if you can't construct such a program, then the initial transformation
> before the fold-const.c change _for this specific situation_ was correct.
>
>
> Ciao,
> Michael.

well, if such a function is used as inline and we know for it that j
has value != 2, then we have here a big difference.  For your first
example, we still have to do the memory access to *i, even if we are
not interested in result.  See here point 4 of 5.1.2.3 of C-spec.
For your second sample we don't need to do that, as the & itself is no
sequence-point and so we can eliminate the *i access without breaking
anything.

Regards,
Kai

Re: Go patch committed: Implement new syscall package

2011-10-26 Thread Ian Lance Taylor

Rainer Orth  writes:

> There's one problem left: with Solaris nawk, building libcalls.go fails
> (from Solaris 8 to 11 inclusive):
>
> nawk -f /vol/gcc/src/hg/trunk/local/libgo/go/syscall/mksyscall.awk ${files} > 
> li
> bcalls.go.tmp
> nawk: syntax error at source line 47
>  context is
> if (match($0, "//sys(nb)?[  ]*([a-zA-Z0-9_]+)\\(([^()]*)\\) >>>  
> *(\
> \(([^()]+)\\))?", <<<  gosig) == 0) {
> nawk: illegal statement at source line 47
> nawk: syntax error at source line 58
> missing }
> make[4]: *** [s-libcalls] Error 2
>
> I don't yet see what's wrong with the pattern, and gawk does accept it.

Hmmm, I don't have a copy of nawk.  Is it possible that it doesn't like
" *" == *?  You could try "[ ]*".

Ian

Re: [patch tree-optimization]: Improve handling of conditional-branches on targets with high branch costs

2011-10-26 Thread Michael Matz

Hi,

On Wed, 26 Oct 2011, Kai Tietz wrote:

> >> > int f(char *i, int j)
> >> > {
> >> >        if (*i && j!=2)
> >> >                return *i;
> >> >        else
> >> >                return j;
> >> > }
> >>
> 
> the case can be produced quite easily.
> 
> extern int global = 0;
> 
> 
>   if (*a && global) ...

See?  You had to change the program to prove the transformation to be 
invalid.  But my point was that the function we discuss about was exactly 
as above.  It didn't have globals, or two loads, or a volatile, or 
anything else you can come up with where the transformation would be 
detectable (and hence invalid).  I claim that you can't construct a 
program that can distinguish between this function:

int f(char *i, int j)
{
  if (*i && j!=2)
    return *i;
  else
    return j;
}

and this one:

int f(char *i, int j)
{
  if (*i & j!=2)
    return *i;
  else
    return j;
}

And if you can't construct such a program, then the initial transformation 
before the fold-const.c change _for this specific situation_ was correct.

Ciao,
Michael.

[PATCH][RFC] Re-write LTO option merging

2011-10-26 Thread Richard Guenther


This completely rewrites LTO option merging.  At compile (uselessly
now at WPA?) time we now stream a COLLECT_GCC_OPTIONS like string
as it comes from argv of the compiler binary.  Those options are
read in by the LTO driver (lto-wrapper), merged into a single
set (very simple merge function right now ;)) and given a place to
complain about incompatible arguments.  The merged set is then
prepended to the arguments from the linker driver line
(what we get in COLLECT_GCC_OPTIONS for lto-wrapper), thus the
linker command-line may override what the compiler command-line(s)
provided.

One visible change is that no optimization option on the link line
no longer means -O0, unless you explicitly specify -O0 at link time.

There are probably more obscure differences, especially due to the
very simple merge and complain function ;))  But this is a RFC ...

If WPA partitioning at any point wants to do something clever with
a set of incompatible functions it can re-parse the options and
do that (we then have to arrange for lto-wrapper to let the options
slip through).

I'm LTO bootstrapping and testing this simple variant right now
(I believe we do not excercise funny option combinations right now).

I'll still implement a very simple merge/complain function.
Suggestions for that welcome (I'll probably simply compute the
intersection of options ... in the long run we'd want to annotate
our options as to whether they should be unioned/intersected).

Any comments?

Thanks,
Richard.

2011-10-26  Richard Guenther  

* lto-opts.c: Re-implement.
* lto-streamer.h (lto_register_user_option): Remove.
(lto_read_file_options): Likewise.
(lto_reissue_options): Likewise.
(lto_clear_user_options): Likewise.
(lto_clear_file_options): Likewise.
* opts-global.c (post_handling_callback): Remove LTO specific code.
(decode_options): Likewise.
* lto-wrapper.c (run_gcc): Read all input file options and
prepend a merged set before the linker driver options.

lto/
* lto-lang.c (lto_post_options): Do not read file options.
* lto.c (lto_read_all_file_options): Remove.
(lto_init): Call lto_set_in_hooks here.


Index: trunk/gcc/lto-opts.c
===
*** trunk.orig/gcc/lto-opts.c   2011-10-26 13:46:24.0 +0200
--- trunk/gcc/lto-opts.c2011-10-26 15:06:29.0 +0200
*** along with GCC; see the file COPYING3.
*** 33,280 
  #include "common/common-target.h"
  #include "diagnostic.h"
  #include "lto-streamer.h"
  
! /* When a file is initially compiled, the options used when generating
!the IL are not necessarily the same as those used when linking the
!objects into the final executable.  In general, most build systems
!will proceed with something along the lines of:
! 
!   $ gcc  -flto -c f1.c -o f1.o
!   $ gcc  -flto -c f2.c -o f2.o
!   ...
!   $ gcc  -flto -c fN.c -o fN.o
! 
!And the final link may or may not include the same  used
!to generate the initial object files:
! 
!   $ gcc  -flto -o prog f1.o ... fN.o
! 
!Since we will be generating final code during the link step, some
!of the flags used during the compile step need to be re-applied
!during the link step.  For instance, flags in the -m family.
! 
!The idea is to save a selected set of  in a special
!section of the initial object files.  This section is then read
!during linking and the options re-applied.
! 
!FIXME lto.  Currently the scheme is limited in that only the
!options saved on the first object file (f1.o) are read back during
!the link step.  This means that the options used to compile f1.o
!will be applied to ALL the object files in the final link step.
!More work needs to be done to implement a merging and validation
!mechanism, as this will not be enough for all cases.  */
! 
! /* Saved options hold the type of the option (currently CL_TARGET or
!CL_COMMON), and the code, argument, and value.  */
! 
! typedef struct GTY(()) opt_d
! {
!   unsigned int type;
!   size_t code;
!   char *arg;
!   int value;
! } opt_t;
! 
! DEF_VEC_O (opt_t);
! DEF_VEC_ALLOC_O (opt_t, heap);
! 
! 
! /* Options are held in two vectors, one for those registered by
!command line handling code, and the other for those read in from
!any LTO IL input.  */
! static VEC(opt_t, heap) *user_options = NULL;
! static VEC(opt_t, heap) *file_options = NULL;
! 
! /* Iterate FROM in reverse, writing option codes not yet in CODES into *TO.
!Mark each new option code encountered in CODES.  */
  
  static void
! reverse_iterate_options (VEC(opt_t, heap) *from, VEC(opt_t, heap) **to,
!bitmap codes)
! {
!   int i;
! 
!   for (i = VEC_length (opt_t, from); i > 0; i--)
! {
!   const opt_t *const o = VEC_index (opt_t, from, i - 1);
! 
!   if (bitmap_set_bit (codes, o->code))
!

Re: [cxx-mem-model] Generic atomic functions

2011-10-26 Thread Andrew MacLeod


On 10/26/2011 08:29 AM, Michael Matz wrote:

Hi,

On Tue, 25 Oct 2011, Andrew MacLeod wrote:


and new generic routines are provided as follows:

void __atomic_load (T* object, T* return_value, memory_order m)
void __atomic_store (T* object, T* new_value, memory_order m)
void __atomic_exchange (T* object, T* new_value, T* return_value, memory_order
m)
void __atomic_compare_exchange (T* object, T* expected_value, T* new_value,
bool weak, memory_order success, memory_order fail)

When a generic routine is called with an object whose size maps to one
of the type specific built-ins, (ie sizeof (T) == 1,2,4,8 or 16) the
generic call is translated into the appropriate direct call.

Are the generic routines really generic?  In particular do they accept
objects that aren't naturally aligned and are supposed to still work?  In
that case you can't rewrite them into the type specific builtins after
only checking the size, you also need to check alignment.




yes, ive been considering that.   At the moment, the claim is it is well 
defined for properly aligned objects, undefined otherwise.


I've been considering what would be involved in defining it. And I not 
sure if that should wait until the next release, or whether we should 
try now.


  - The 5 size specific routines *must* be properly aligned. (ie, and N 
byte object must have the same alignment as the N byte integral)


for the generic routines:
  - if it can be determined at compile time that the object being 
pointed to is properly aligned, then it can be mapped into one of the 5 
size specific routines.  Otherwise it must remain a call into the 
runtime library.
  - The runtime library can look at the pointer value and determine if 
things are properly aligned based on the pointer value.  If properly 
aligned, then it could invoke one of the runtime lock-free routines.
 - If an improperly aligned pointer is passed in, the fall back would 
be to resort to what will probably be a locked implementation for the 
object.


  - the __atomic_is_lock_free(size) routine will have to have an 
additional object pointer parameter. If its NULL, it will return a value 
based on properly aligned objects of SIZE.  Otherwise, it will also 
check the value of the pointer to see if it is properly aligned and then 
return true or false based on whether one of the lock-free routines will 
be invoked.


So it would mean that an 8 byte object which is properly aligned could 
be lock free, while and unaligned 8 byte object would likely have a 
locked implementation.  They would all work though, just requiring the 
library.


Thats my thoughts...  I'm tempted to implement the alignment 
checking...  It seems more complete, but it is more complex. so I waffle :-)


any other opinions?

Re: PowerPC shrink-wrap support 3 of 3

2011-10-26 Thread Bernd Schmidt

On 10/26/11 14:27, Alan Modra wrote:
> Committed revision 180522.  It turns out that shrink-wrapping isn't as
> effective as it used to be with the 20110915 based sources I was using
> originally.  povray Ray_In_Bound no longer gets the benefit of shrink
> wrap, likely due to some cfg optimization.  We end up with a simple
> block that just does r3=1 then jumps to last_bb being reached from
> blocks that need a prologue as well as blocks that don't.  That's
> enough to kill our current shrink wrap implementation.  What we need
> is something to duplicate these tail blocks..

Would it work to insert the epilogue on some edges to this R3=1 block,
and not on the others? (How many edges of each kind are there?)

Now that we have an initial patch in the tree and it mostly seems to
work, we can think about making it a little stronger - the initial
implementation is really quite conservative.

Bernd

Re: [patch tree-optimization]: Improve handling of conditional-branches on targets with high branch costs

2011-10-26 Thread Kai Tietz

I describe the sample more closely here

extern int global = 0;
extern int *a = NULL;

void catchSigSegV( int sig )
{
  a = &global;
 }

int foo (int j)
{
 signal (SIGSEGV, catchSigSegV);
 if (*a && global) return 2;
 return 0;
}

I admit that in most cases such a scenario is not common.  This sample
seems to be a valid C program.  So the conditions in IF shall be
evaluted strict in order of sequence-points, as first argument might
trap.
It doesn't matter if second argument have side-effects or none.  The
point is the first and so it has to be separated from other
conditions.

Regards,
Kai

Re: [patch tree-optimization]: Improve handling of conditional-branches on targets with high branch costs

2011-10-26 Thread Kai Tietz

2011/10/26 Michael Matz :
> Hi,
>
> On Wed, 26 Oct 2011, Kai Tietz wrote:
>
>> > Yes, this part introduced different behavior for this small case,
>> >
>> > int f(char *i, int j)
>> > {
>> >        if (*i && j!=2)
>> >                return *i;
>> >        else
>> >                return j;
>> > }
>>
>> Well, as far as I understand C specification and sequence points, it
>> makes indeed a difference to do branch-cost optimization on instructions
>> might cause a signal traps. As signal could be handled in specifc
>> handlers. You need to consider here that short-circuit optimization
>> assumes associative law on operands.  So right-hand side might be
>> evaluaded before the left-hand-side.  And this indeed breaks IMHO the
>> sequences as mentioned in C specification about conditions.
>
> I'm not sure in this specific case.  If it had been:
>
> if (*a && *b) ...
>
> the you'd be right.  The side-effect of dereferencing a must happen before
> *b, and hence transforming this into (*a!=0)&(*b!=0) would be wrong.  But
> in the case at hand we only have one potentially problematic (aka
> detectable) side-effect (*i), so I don't think you could construct a
> program that detects the difference.  It would entail detecting that
> "j!=2" was already evaluated, when (say) the segfault happens, but you
> can't as the variable is non-volatile.  It also can't happen that the
> side-effect *i does not occur (which also would be a problem).  So, I
> think there wasn't an actual problem before, and it had fewer jumps.
>
>
> Ciao,
> Michael.

the case can be produced quite easily.

Eg:

extern int global = 0;


  if (*a && global) ...
...

the issue is that by C-specification see here 5.1.2.2.3 about
program-termination.The important point is here::

Evaluation of an expression may produce side effects. At certain
specified points in the execution sequence called sequence points, all
side effects of previous evaluations shall be complete *** and no side
effects of subsequent evaluations shall have taken place ***

Annex C describes sequence-points as

1 The following are the sequence points described in 5.1.2.3:
— The call to a function, after the arguments have been evaluated (6.5.2.2).
— The end of the first operand of the following operators: logical AND
&& (6.5.13); logical OR || (6.5.14); conditional ? (6.5.15); comma ,
(6.5.17).
...

Regards,
Kai

Re: [cxx-mem-model] Generic atomic functions

2011-10-26 Thread Michael Matz

Hi,

On Tue, 25 Oct 2011, Andrew MacLeod wrote:

> and new generic routines are provided as follows:
> 
> void __atomic_load (T* object, T* return_value, memory_order m)
> void __atomic_store (T* object, T* new_value, memory_order m)
> void __atomic_exchange (T* object, T* new_value, T* return_value, memory_order
> m)
> void __atomic_compare_exchange (T* object, T* expected_value, T* new_value,
> bool weak, memory_order success, memory_order fail)
> 
> When a generic routine is called with an object whose size maps to one 
> of the type specific built-ins, (ie sizeof (T) == 1,2,4,8 or 16) the 
> generic call is translated into the appropriate direct call.

Are the generic routines really generic?  In particular do they accept 
objects that aren't naturally aligned and are supposed to still work?  In 
that case you can't rewrite them into the type specific builtins after 
only checking the size, you also need to check alignment.

Ciao,
Michael.

Re: PowerPC shrink-wrap support 3 of 3

2011-10-26 Thread Alan Modra

On Sun, Oct 16, 2011 at 02:51:01PM -0400, David Edelsohn wrote:
> The patch is okay, although I am not thrilled about the need to change
> the register allocation order.

Committed revision 180522.  It turns out that shrink-wrapping isn't as
effective as it used to be with the 20110915 based sources I was using
originally.  povray Ray_In_Bound no longer gets the benefit of shrink
wrap, likely due to some cfg optimization.  We end up with a simple
block that just does r3=1 then jumps to last_bb being reached from
blocks that need a prologue as well as blocks that don't.  That's
enough to kill our current shrink wrap implementation.  What we need
is something to duplicate these tail blocks..

Patch here for comment.  I haven't yet run benchmarks, but this passes
bootstrap and regression test (on rev 180286, current virgin mainline
fails bootstrap on powerpc-linux).

* function.c (thread_prologue_and_epilogue_insns): Delete
shadowing variables.  Don't do prologue register clobber tests
when shrink wrapping already failed.  Compute tail block
candidates for duplicating exit path.  Remove these from
antic set.  Duplicate tails when reached from both blocks
needing a prologue/epilogue and blocks not needing such.

Index: gcc/function.c
===
--- gcc/function.c  (revision 180467)
+++ gcc/function.c  (working copy)
@@ -5697,11 +5697,11 @@ thread_prologue_and_epilogue_insns (void
   HARD_REG_SET prologue_clobbered, prologue_used, live_on_edge;
   HARD_REG_SET set_up_by_prologue;
   rtx p_insn;
-
   VEC(basic_block, heap) *vec;
   basic_block bb;
   bitmap_head bb_antic_flags;
   bitmap_head bb_on_list;
+  bitmap_head bb_tail;
 
   if (dump_file)
fprintf (dump_file, "Attempting shrink-wrapping optimization.\n");
@@ -5732,6 +5732,7 @@ thread_prologue_and_epilogue_insns (void
 
   bitmap_initialize (&bb_antic_flags, &bitmap_default_obstack);
   bitmap_initialize (&bb_on_list, &bitmap_default_obstack);
+  bitmap_initialize (&bb_tail, &bitmap_default_obstack);
 
   /* Find the set of basic blocks that require a stack frame.  */
 
@@ -5774,19 +5775,22 @@ thread_prologue_and_epilogue_insns (void
}
}
 
+  /* Save a copy of blocks that really need a prologue.  */
+  bitmap_copy (&bb_antic_flags, &bb_flags);
+
   /* For every basic block that needs a prologue, mark all blocks
 reachable from it, so as to ensure they are also seen as
 requiring a prologue.  */
   while (!VEC_empty (basic_block, vec))
{
  basic_block tmp_bb = VEC_pop (basic_block, vec);
- edge e;
- edge_iterator ei;
+
  FOR_EACH_EDGE (e, ei, tmp_bb->succs)
if (e->dest != EXIT_BLOCK_PTR
&& bitmap_set_bit (&bb_flags, e->dest->index))
  VEC_quick_push (basic_block, vec, e->dest);
}
+
   /* If the last basic block contains only a label, we'll be able
 to convert jumps to it to (potentially conditional) return
 insns later.  This means we don't necessarily need a prologue
@@ -5799,14 +5803,29 @@ thread_prologue_and_epilogue_insns (void
goto fail_shrinkwrap;
}
 
+  /* Find the set of basic blocks that need no prologue and only
+go to the exit.  */
+  bitmap_set_bit (&bb_tail, EXIT_BLOCK_PTR->index);
+  VEC_quick_push (basic_block, vec, EXIT_BLOCK_PTR);
+  while (!VEC_empty (basic_block, vec))
+   {
+ basic_block tmp_bb = VEC_pop (basic_block, vec);
+
+ FOR_EACH_EDGE (e, ei, tmp_bb->preds)
+   if (single_succ_p (e->src)
+   && !bitmap_bit_p (&bb_antic_flags, e->src->index)
+   && bitmap_set_bit (&bb_tail, e->src->index))
+ VEC_quick_push (basic_block, vec, e->src);
+   }
+
   /* Now walk backwards from every block that is marked as needing
-a prologue to compute the bb_antic_flags bitmap.  */
-  bitmap_copy (&bb_antic_flags, &bb_flags);
+a prologue to compute the bb_antic_flags bitmap.  Exclude
+tail blocks; They can be duplicated to be used on paths not
+needing a prologue.  */
+  bitmap_and_compl (&bb_antic_flags, &bb_flags, &bb_tail);
   FOR_EACH_BB (bb)
{
- edge e;
- edge_iterator ei;
- if (!bitmap_bit_p (&bb_flags, bb->index))
+ if (!bitmap_bit_p (&bb_antic_flags, bb->index))
continue;
  FOR_EACH_EDGE (e, ei, bb->preds)
if (!bitmap_bit_p (&bb_antic_flags, e->src->index)
@@ -5816,8 +5835,6 @@ thread_prologue_and_epilogue_insns (void
   while (!VEC_empty (basic_block, vec))
{
  basic_block tmp_bb = VEC_pop (basic_block, vec);
- edge e;
- edge_iterator ei;
  bool all_set = true;
 
  bitmap_clear_bit (&bb_on_list, tmp_bb->index);
@@ -5862,28 +5879,172

Re: Go patch committed: Implement new syscall package

2011-10-26 Thread Rainer Orth

Ian,

> I committed this patch to mainline to try to fix these problems.  Thanks
> for testing.  For this patch I bootstrapped and ran Go testsuite on
> x86_64-unknown-linux-gnu.

with this patch, go and libgo results on Solaris 10 and 11/x86 are back
to normal, and Solaris 10 and 11/SPARC bootstraps are currently
running.  Thanks.

There's one problem left: with Solaris nawk, building libcalls.go fails
(from Solaris 8 to 11 inclusive):

nawk -f /vol/gcc/src/hg/trunk/local/libgo/go/syscall/mksyscall.awk ${files} > li
bcalls.go.tmp
nawk: syntax error at source line 47
 context is
if (match($0, "//sys(nb)?[  ]*([a-zA-Z0-9_]+)\\(([^()]*)\\) >>>  *(\
\(([^()]+)\\))?", <<<  gosig) == 0) {
nawk: illegal statement at source line 47
nawk: syntax error at source line 58
missing }
make[4]: *** [s-libcalls] Error 2

I don't yet see what's wrong with the pattern, and gawk does accept it.

Rainer

-- 
-
Rainer Orth, Center for Biotechnology, Bielefeld University

Re: [patch tree-optimization]: Improve handling of conditional-branches on targets with high branch costs

2011-10-26 Thread Michael Matz

Hi,

On Wed, 26 Oct 2011, Kai Tietz wrote:

> > Yes, this part introduced different behavior for this small case,
> >
> > int f(char *i, int j)
> > {
> >        if (*i && j!=2)
> >                return *i;
> >        else
> >                return j;
> > }
> 
> Well, as far as I understand C specification and sequence points, it 
> makes indeed a difference to do branch-cost optimization on instructions 
> might cause a signal traps. As signal could be handled in specifc 
> handlers. You need to consider here that short-circuit optimization 
> assumes associative law on operands.  So right-hand side might be 
> evaluaded before the left-hand-side.  And this indeed breaks IMHO the 
> sequences as mentioned in C specification about conditions.

I'm not sure in this specific case.  If it had been:

if (*a && *b) ...

the you'd be right.  The side-effect of dereferencing a must happen before 
*b, and hence transforming this into (*a!=0)&(*b!=0) would be wrong.  But 
in the case at hand we only have one potentially problematic (aka 
detectable) side-effect (*i), so I don't think you could construct a 
program that detects the difference.  It would entail detecting that 
"j!=2" was already evaluated, when (say) the segfault happens, but you 
can't as the variable is non-volatile.  It also can't happen that the 
side-effect *i does not occur (which also would be a problem).  So, I 
think there wasn't an actual problem before, and it had fewer jumps.

Ciao,
Michael.

RE: [patch tree-optimization]: Improve handling of conditional-branches on targets with high branch costs

2011-10-26 Thread Michael Matz

Hi,

On Wed, 26 Oct 2011, Jiangning Liu wrote:

> > > >> -
> > > >> -  if (LOGICAL_OP_NON_SHORT_CIRCUIT)
> > > >> - {
> > > >> -   if (code != orig_code || lhs != orig_lhs || rhs !=
> > orig_rhs)
> > > >> - return build2_loc (loc, code, truth_type, lhs, rhs);
> > > >> -   return NULL_TREE;
> > > >> - }
> > > >
> > > > Why do you remove this hunk?  Shouldn't you instead move the hunk
> > you
> > > > added to fold_truth_andor() here.  I realize this needs some TLC to
> > > > fold_truth_andor_1, because right now it early-outs for non-
> > comparisons,
> > > > but it seems the better place.  I.e. somehow move the below code
> > into the
> > > > above branch, with the associated diddling on fold_truth_andor_1
> > that it
> > > > gets called.
> > >
> > > This hunk is removed, as it is vain to do here.
> > 
> > There is a fallthrough now, that wasn't there before.  I don't know if
> > it's harmless, I just wanted to mention it.
> > 
> 
> Yes, this part introduced different behavior for this small case,
> 
>   D.4710 = *i;
>   D.4711 = D.4710 != 0;
>   D.4712 = j != 2;
>   D.4713 = D.4711 & D.4712;
>   if (D.4713 != 0) goto ; else goto ;
> 
> After the fix, we have
> 
>   D.4711 = *i;
>   if (D.4711 != 0) goto ; else goto ;
>   :
>   if (j != 2) goto ; else goto ;

So, we have one more jump than originally, when the point of the patch 
was to emit less on targets with high branch costs.  So, as speculated, 
the hunk was not useless.  (It's nice that it caused a benchmark to 
improve significantly, but that should be done via a proper analysis and 
patch, not as a side effect of a supposed non-change).


Ciao,
Michael.

[PATCH] Fix lto-wrapper obstack handling

2011-10-26 Thread Richard Guenther


Seems only in some cases we fault freeing an uninitialized obstack.

Bootstrapped and tested on x86_64-unknown-linux-gnu, applied.

Richard.

2011-10-26  Richard Guenther  

* lto-wrapper.c (run_gcc): Properly init/free obstack.

Index: gcc/lto-wrapper.c
===
*** gcc/lto-wrapper.c   (revision 180520)
--- gcc/lto-wrapper.c   (working copy)
*** run_gcc (unsigned argc, char *argv[])
*** 339,345 
char *list_option_full = NULL;
const char *linker_output = NULL;
const char *collect_gcc, *collect_gcc_options;
-   struct obstack env_obstack;
int parallel = 0;
int jobserver = 0;
bool no_partition = false;
--- 339,344 
*** run_gcc (unsigned argc, char *argv[])
*** 517,527 
--- 516,528 
  {
FILE *stream = fopen (ltrans_output_file, "r");
FILE *mstream = NULL;
+   struct obstack env_obstack;
  
if (!stream)
fatal_perror ("fopen: %s", ltrans_output_file);
  
/* Parse the list of LTRANS inputs from the WPA stage.  */
+   obstack_init (&env_obstack);
nr = 0;
for (;;)
{
*** cont:
*** 574,580 
  
  /* Replace the .o suffix with a .ltrans.o suffix and write
 the resulting name to the LTRANS output list.  */
- obstack_init (&env_obstack);
  obstack_grow (&env_obstack, input_name, strlen (input_name) - 2);
  obstack_grow (&env_obstack, ".ltrans.o", sizeof (".ltrans.o"));
  output_name = XOBFINISH (&env_obstack, char *);
--- 575,580 
*** cont:
*** 654,662 
free (output_names);
free (input_names);
free (list_option_full);
  }
- 
-   obstack_free (&env_obstack, NULL);
  }
  
  
--- 654,661 
free (output_names);
free (input_names);
free (list_option_full);
+   obstack_free (&env_obstack, NULL);
  }
  }

Re: [PATCH 0/6] More vector permutation work

2011-10-26 Thread Iain Sandoe


Hi Richard,

On 25 Oct 2011, at 01:17, Richard Henderson wrote:


The Idea with this patch set is to re-arrange vector permutation
so that it can be used to implement other patterns automatically.

In particular, Altivec, SPU currently have (and Sparc VIS would need)
a large amount of boilerplate code that transforms several higher
level tree codes into vector permutation inside the backend.  This
seems a bit unnecessary.

It's arguable that we could simply make VEC_PERM_EXPR the only code
that gets generated at the vectorizer level, and drop some of these
other patterns entirely.  I'm not 100% sure we should go that far,
but even if we did, I still think this is a good cleanup.

Bootstrapped and tested on x86_64-linux only.  I've only cross-tested
the ppc/spu patches.


unfortunately, the series breaks bootstrap for powerpc-darwin9:

http://gcc.gnu.org/ml/gcc-regression/2011-10/msg00539.html

the failure is in building libgfortran:

libtool: compile:  /GCC/gcc-4-7-trunk-build/./gcc/xgcc -B/GCC/gcc-4-7- 
trunk-build/./gcc/ -B/GCC/gcc-4-7-install/powerpc-apple-darwin9/bin/ - 
B/GCC/gcc-4-7-install/powerpc-apple-darwin9/lib/ -isystem /GCC/gcc-4-7- 
install/powerpc-apple-darwin9/include -isystem /GCC/gcc-4-7-install/ 
powerpc-apple-darwin9/sys-include -DHAVE_CONFIG_H -I. -I/GCC/gcc-live- 
trunk/libgfortran -iquote/GCC/gcc-live-trunk/libgfortran/io -I/GCC/gcc- 
live-trunk/libgfortran/../gcc -I/GCC/gcc-live-trunk/libgfortran/../gcc/ 
config -I../.././gcc -std=gnu99 -Wall -Wstrict-prototypes -Wmissing- 
prototypes -Wold-style-definition -Wextra -Wwrite-strings -fcx-fortran- 
rules -ffunction-sections -fdata-sections -ftree-vectorize -funroll- 
loops -g -O2 -MT matmul_c4.lo -MD -MP -MF .deps/matmul_c4.Tpo -c /GCC/ 
gcc-live-trunk/libgfortran/generated/matmul_c4.c  -fno-common -DPIC - 
o .libs/matmul_c4.o
/GCC/gcc-live-trunk/libgfortran/generated/matmul_c4.c: In function  
‘matmul_c4’:
/GCC/gcc-live-trunk/libgfortran/generated/matmul_c4.c:284:25: internal  
compiler error: in expand_vec_perm, at optabs.c:6998

Please submit a full bug report,
with preprocessed source if appropriate.

it is reproducible with a stage-1 compiler - so presumably off a cross  
too.


cheers
Iain

Re: [PATCH, PR43814] Assume function arguments of pointer type are aligned.

2011-10-26 Thread Richard Guenther

On Tue, Oct 25, 2011 at 2:22 PM, Tom de Vries  wrote:
> On 09/24/2011 01:42 PM, Richard Guenther wrote:
>> On Sat, Sep 24, 2011 at 11:40 AM, Jakub Jelinek  wrote:
>>> On Sat, Sep 24, 2011 at 11:31:25AM +0200, Richard Guenther wrote:
 In the end I'd probably say the patch is ok without the option (thus
 turned on by default), but if LC_GLOBAL_LOCALE is part of the
 glibc ABI then we clearly can't do this.
>>>
>>> Yes, LC_GLOBAL_LOCALE is part of glibc ABI.  I guess we could only assume
>>> the alignment if the pointer is actually dereferenced on the statement
>>> that checks the ABI or in some stmt that dominates the spot where you want
>>> to check the alignment.  It is IMHO quite common to pass arbitrary values
>>> in pointer types, then cast them back or just compare.
>>
>> Yeah (even if technically invoking undefined behavior in C).  Checking if
>> there is a dereference post-dominating function entry with sth like
>>
>>   FOR_EACH_IMM_USE_STMT (... ptr ...)
>>      if (stmt_post_dominates_entry && contains derefrence of ptr)
>>        alignment = TYPE_ALIGN (...);
>>
>> and otherwise not assuming anything about parameter alignment might work.
>> Be careful to check the alignment of the dereference though,
>>
>> typedef int int_unaligned __attribute__((aligned(1)));
>> int foo (int *p)
>> {
>>   int_unaligned *q = p;
>>   return *q;
>> }
>>
>> will be MEM[p] but with (well, hopefully ;)) TYPE_ALIGN of TREE_TYPE (MEM[p])
>> being 1.  And yes, you'd have to look into handled-components as well.  I 
>> guess
>> you'll face similar problems as we do with tree-sra.c
>> tree_non_mode_aligned_mem_p
>> (you need to assume eventually misaligned accesses the same way expansion
>> does for the dereference, otherwise you'll run into issues on
>> strict-align targets).
>>
>> As that de-refrence thing doesn't really fit the CCP propagation you
>> won't be able
>> to handle
>>
>> int foo (int *p)
>> {
>>   int *q = (char *)p + 3;
>>   return *q;
>> }
>>
>> and assume q is aligned (and p is misaligned by 1).
>>
>> That is, if the definition of a pointer is post-dominated by a derefrence
>> we could assume proper alignment for that pointer (as opposed to just
>> special-casing its default definition).  Would be certainly interesting to
>> see what kind of fallout we would get from that ;)
>>
>
> I gave this a try in deduce_alignment_from_dereferences.
>
> The fall-out I got from this were unaligned dereferenced pointers in
> gcc.c-torture/unsorted/*{cmp,set}.c.
>
> Bootstrapped and reg-tested on x86_64. Build and reg-tested on MIPS and ARM.
>
> Ok for trunk?

Can you not do the get_value_from_alignment split (it doesn't look
necessary to me) and drop the

@@ -541,10 +550,18 @@ get_value_for_expr (tree expr, bool for_
   if (TREE_CODE (expr) == SSA_NAME)
 {
   val = *get_value (expr);
-  if (for_bits_p
- && val.lattice_val == CONSTANT
+  if (!for_bits_p)
+   return val;
+
+  if (val.lattice_val == CONSTANT
  && TREE_CODE (val.value) == ADDR_EXPR)
val = get_value_from_alignment (val.value);
+  else if (val.lattice_val == VARYING
+  && SSA_NAME_PTR_INFO (expr) != NULL
+  && SSA_NAME_PTR_INFO (expr)->align > 1
+  && SSA_NAME_PTR_INFO (expr)->misalign == 0)
+   val = get_align_value (SSA_NAME_PTR_INFO (expr)->align * BITS_PER_UNIT,
+  TREE_TYPE (expr), 0);
 }

hunk?  I'm not sure why it is necessary at all - CCP is the only pass
computing alignment, so it should simply re-compute the info?

Anyway, it looks unrelated to the purpose of the patch in general.

The error reporting in deduce_alignment_from_dereferences is bogus,
the programs are undefined only at runtime, so you can at most
issue a warning.

+  /* Needs to be the successor of entry, for CDI_POST_DOMINATORS.  */
+  entry = single_succ (ENTRY_BLOCK_PTR);
+
+  FOR_EACH_BB (bb)
+{
+  gimple_stmt_iterator i;
+
+  if (!dominated_by_p (CDI_POST_DOMINATORS, entry, bb))
+   continue;

if you only consider post-dominators of the entry block then just walk
them directly (first_dom_son / next_dom_son).

+ align = TYPE_ALIGN (TREE_TYPE (memref)) / BITS_PER_UNIT;
+ if (align == 1)
+   continue;

I think you want to match what expand thinks of the alignment of this
memory reference, not just what TYPE_ALIGN says (and yes, that
needs to be split out somehow, SRA would need this as well).

+ while (TREE_CODE (ptr) == SSA_NAME)
+   {
+ pi = get_ptr_info (ptr);
+ if (pi->misalign != 0)
+   {
+ error ("misaligned pointer dereferenced");
+ break;
+   }

simply looking at pi->misalign is wrong.  pi->align may be bigger
than the align that you computed above, so pi->misalign % align != 0
would be the right check.

+ if (pi->align >= align)
+   break;
+ pi->align = align;

1 2 >

1 - 100 of 107 matches

Mail list logo