[PATCH] Fix i?86 mem += reg + comparison peephole (PR target/52086)

2012-02-02 Thread Jakub Jelinek
Hi!

This peephole, as shown on the testcase, happily transforms a QImode
memory load into a register, followed by SImode addition of that reg and
%ebp, followed by QImode store of that back into the same memory and
QImode comparison of that with zero into a QImode addition of the register
to the memory with setting flags instead of clobbering them.  The problem
with that is that for -m32 %ebp can't be used in QImode instructions.
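
To illustrate the constraint, here is a minimal sketch of the kind of check the patch adds.  The register numbering and both function names are hypothetical and chosen only for this example; GCC's real predicate is q_regs_operand.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical register numbering, for illustration only.  */
enum { REG_EAX, REG_EDX, REG_ECX, REG_EBX, REG_ESI, REG_EDI, REG_EBP, REG_ESP };

/* In 32-bit mode only %eax, %ebx, %ecx and %edx have byte-sized
   low parts (%al, %bl, %cl, %dl).  */
static bool
has_byte_subreg_32 (int regno)
{
  assert (regno >= REG_EAX && regno <= REG_ESP);
  return regno == REG_EAX || regno == REG_EBX
	 || regno == REG_ECX || regno == REG_EDX;
}

/* Models the added guard: a QImode rewrite is only acceptable when the
   added operand is an immediate or a byte-addressable register.  */
static bool
qi_add_operand_ok (bool is_immediate, int regno)
{
  return is_immediate || has_byte_subreg_32 (regno);
}
```

With this model, an addend in %ebp is rejected for the QImode form, while an immediate addend is always accepted.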

Fixed thusly, bootstrapped/regtested on x86_64-linux and i686-linux,
ok for trunk?

2012-02-02  Jakub Jelinek  

PR target/52086
* config/i386/i386.md (*addqi_2 peephole with SImode addition): Check
that operands[2] is either immediate, or q_regs_operand.

* gcc.dg/pr52086.c: New test.

--- gcc/config/i386/i386.md.jj  2012-01-13 21:47:34.0 +0100
+++ gcc/config/i386/i386.md 2012-02-01 22:42:57.805296347 +0100
@@ -17232,6 +17232,9 @@ (define_peephole2
&& REG_P (operands[0]) && REG_P (operands[4])
&& REGNO (operands[0]) == REGNO (operands[4])
&& peep2_reg_dead_p (4, operands[0])
+   && (mode != QImode
+   || immediate_operand (operands[2], SImode)
+   || q_regs_operand (operands[2], SImode))
&& !reg_overlap_mentioned_p (operands[0], operands[1])
&& ix86_match_ccmode (peep2_next_insn (3),
 (GET_CODE (operands[3]) == PLUS
--- gcc/testsuite/gcc.dg/pr52086.c.jj   2012-02-01 22:56:54.520519545 +0100
+++ gcc/testsuite/gcc.dg/pr52086.c  2012-02-01 22:56:35.0 +0100
@@ -0,0 +1,21 @@
+/* PR target/52086 */
+/* { dg-do compile } */
+/* { dg-options "-O2" } */
+/* { dg-additional-options "-fpic" { target fpic } } */
+
+struct S { char a; char b[100]; };
+int bar (void);
+int baz (int);
+
+void
+foo (struct S *x)
+{
+  if (bar () & 1)
+{
+  char c = bar ();
+  baz (4);
+  x->a += c;
+  while (x->a)
+   x->b[c] = bar ();
+}
+}

Jakub


Re: [PATCH] Don't ICE in vectorizer when testing if a pattern stmt is used by another pattern stmt (PR tree-optimization/52073)

2012-02-02 Thread Richard Guenther
On Wed, 1 Feb 2012, Jakub Jelinek wrote:

> Hi!
> 
> vinfo_for_stmt can't be used on stmts outside of the current loop.
> Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?

Ok.

Thanks,
Richard.

> 2012-02-01  Jakub Jelinek  
> 
>   PR tree-optimization/52073
>   * tree-vect-stmts.c (vect_mark_relevant): When checking uses of
>   a pattern stmt for pattern uses, ignore uses outside of the loop.
> 
>   * gcc.c-torture/compile/pr52073.c: New test.
> 
> --- gcc/tree-vect-stmts.c.jj  2012-01-22 16:02:10.0 +0100
> +++ gcc/tree-vect-stmts.c 2012-02-01 10:33:58.847815421 +0100
> @@ -150,6 +150,8 @@ vect_mark_relevant (VEC(gimple,heap) **w
>use_operand_p use_p;
>gimple use_stmt;
>tree lhs;
> +   loop_vec_info loop_vinfo = STMT_VINFO_LOOP_VINFO (stmt_info);
> +   struct loop *loop = LOOP_VINFO_LOOP (loop_vinfo);
>  
>if (is_gimple_assign (stmt))
>  lhs = gimple_assign_lhs (stmt);
> @@ -166,6 +168,9 @@ vect_mark_relevant (VEC(gimple,heap) **w
> continue;
>   use_stmt = USE_STMT (use_p);
>  
> + if (!flow_bb_inside_loop_p (loop, gimple_bb (use_stmt)))
> +   continue;
> +
>   if (vinfo_for_stmt (use_stmt)
>   && STMT_VINFO_IN_PATTERN_P (vinfo_for_stmt (use_stmt)))
> {
> --- gcc/testsuite/gcc.c-torture/compile/pr52073.c.jj  2012-02-01 10:39:13.041003562 +0100
> +++ gcc/testsuite/gcc.c-torture/compile/pr52073.c 2012-02-01 10:38:51.0 +0100
> @@ -0,0 +1,28 @@
> +/* PR tree-optimization/52073 */
> +
> +int a, b, c, d, e, f;
> +
> +void
> +foo (int x)
> +{
> +  e = 1;
> +  for (;;)
> +{
> +  int g = c;
> +  if (x)
> + {
> +   if (e)
> + continue;
> +   while (a)
> + --f;
> + }
> +  else
> + for (b = 5; b; b--)
> +   {
> + d = g;
> + g = 0 == d;
> +   }
> +  if (!g)
> + x = 0;
> +}
> +}
> 
>   Jakub
> 
> 

-- 
Richard Guenther 
SUSE / SUSE Labs
SUSE LINUX Products GmbH - Nuernberg - AG Nuernberg - HRB 16746
GF: Jeff Hawn, Jennifer Guild, Felix Imendörffer

Re: [PATCH] Fix i?86 mem += reg + comparison peephole (PR target/52086)

2012-02-02 Thread Uros Bizjak
On Thu, Feb 2, 2012 at 9:15 AM, Jakub Jelinek  wrote:

> This peephole, as shown on the testcase, happily transforms a QImode
> memory load into a register, followed by SImode addition of that reg and
> %ebp, followed by QImode store of that back into the same memory and
> QImode comparison of that with zero into a QImode addition of the register
> to the memory with setting flags instead of clobbering them.  The problem
> with that is that for -m32 %ebp can't be used in QImode instructions.
>
> Fixed thusly, bootstrapped/regtested on x86_64-linux and i686-linux,
> ok for trunk?
>
> 2012-02-02  Jakub Jelinek  
>
>        PR target/52086
>        * config/i386/i386.md (*addqi_2 peephole with SImode addition): Check
>        that operands[2] is either immediate, or q_regs_operand.
>
>        * gcc.dg/pr52086.c: New test.

OK, probably we also need to backport this fix to other release branches.

Thanks,
Uros.


Re: [PATCH] Fix i?86 mem += reg + comparison peephole (PR target/52086)

2012-02-02 Thread Jakub Jelinek
On Thu, Feb 02, 2012 at 10:02:09AM +0100, Uros Bizjak wrote:
> OK, 

Thanks.

> probably we also need to backport this fix to other release branches.

No, while I didn't remember it, svn blame revealed that these peepholes
were my fault, added for PR49095 just for 4.7+.

Jakub


Re: [trans-mem, PATCH] do not dereference node if null in expand_call_tm (PR middle-end/52047)

2012-02-02 Thread Richard Guenther
On Wed, Feb 1, 2012 at 10:19 PM, Patrick Marlier
 wrote:
> On 02/01/2012 03:59 AM, Richard Guenther wrote:
>>
>> The patch looks ok, but I'm not sure why you do not get a cgraph node
>> here - cgraph doesn't really care for builtins as far as I can see.
>>  Honza?
>
> I cannot help on this...
>
>
>> Don't you maybe want to handle builtins in a special way here?
>
> Indeed, I think my patch is wrong. __builtin_prefetch should have the
> transaction_pure attribute. I don't know how this is usually done, but
> what about adding a gcc_assert before dereferencing node (potentially
> NULL)?
>
> How does the attached patch look now?
> (Tested on i686)
>
>
>> At least those that are const/pure?
> About const/pure, we cannot consider those functions as transaction_pure
> because they can read global and shared variables.

Well, const functions cannot access global memory, they can only inspect
their arguments.
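
As a small illustration of the distinction, using GCC's const and pure function attributes (the function and variable names are made up for this sketch):

```c
#include <assert.h>

static int shared_counter = 10;

/* const: the result depends only on the argument values; the function
   reads no global memory.  */
static int square (int) __attribute__ ((const));

/* pure: no side effects, but the function may read global memory, so
   its result can change when that memory changes.  */
static int scaled (int) __attribute__ ((pure));

static int
square (int x)
{
  return x * x;
}

static int
scaled (int x)
{
  return x * shared_counter;
}
```

Here scaled (2) returns a different value once shared_counter is modified, which is why a pure function is not automatically safe to treat like a const one.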

Of course __builtin_prefetch seems to be special in some way.  Note that
users can explicitly call it, so setting the attribute from the prefetching
pass isn't the correct thing to do.

Note that __builtin_prefetch has the 'no vops' attribute - I think you should
simply consider all 'no vops' builtins as transaction pure instead or
explicitly consider a set of builtins as transaction pure (that's more
scalable than sticking the attribute onto random builtins - see how we handle
builtins in the alias machinery, we avoid sticking the fnspec attribute onto
each builtin but simply have special handling for them).

Thanks,
Richard.

> BTW, I will post a PR (and probably a patch) about this.
>
> Thanks for your comment!
>
> Patrick.
>
>        PR middle-end/52047
>        * trans-mem.c (expand_call_tm): Add an assertion.
>        * tree-ssa-loop-prefetch.c (issue_prefetch_ref): Add transaction_pure
>        attribute to __builtin_prefetch.
>        (tree_ssa_prefetch_arrays): Likewise.


[patch, libgo] define TIOCNOTTY and TIOCSCTTY constants for sparc-linux-gnu

2012-02-02 Thread Matthias Klose

On sparc-linux, the TIOCNOTTY and TIOCSCTTY constants are defined as

#define TIOCNOTTY _IO('t', 113)

which cannot be parsed by mksysinfo.sh. Just define these the same way
TIOCGWINSZ is defined.  This lets the libgo build succeed, but I see the same
failures as reported in PR52084 for powerpc-linux-gnu.
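
The enum trick can be sketched in isolation as follows.  DEMO_IO is a simplified, hypothetical stand-in for the kernel's _IO macro (the real sparc encoding differs), and all DEMO_ names are invented for this example.

```c
#include <assert.h>

/* Simplified, hypothetical stand-in for the kernel's _IO macro.  */
#define DEMO_IO(type, nr) (((type) << 8) | (nr))
#define DEMO_TIOCNOTTY DEMO_IO ('t', 113)

/* The compiler folds the macro expression into a plain integer
   constant, so a script only has to deal with "name = number".  */
enum
{
  DEMO_TIOCNOTTY_val = DEMO_TIOCNOTTY
};

static int
demo_tiocnotty (void)
{
  return DEMO_TIOCNOTTY_val;
}
```

The script then only needs to find the resulting const _..._val line and re-export it under the public name, as the second hunk of the patch does.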


  Matthias


Index: libgo/mksysinfo.sh
===
--- libgo/mksysinfo.sh  (revision 183830)
+++ libgo/mksysinfo.sh  (working copy)
@@ -101,6 +101,12 @@
 #ifdef TIOCGWINSZ
   TIOCGWINSZ_val = TIOCGWINSZ,
 #endif
+#ifdef TIOCNOTTY
+  TIOCNOTTY_val = TIOCNOTTY,
+#endif
+#ifdef TIOCSCTTY
+  TIOCSCTTY_val = TIOCSCTTY,
+#endif
 };
 EOF
 
@@ -615,6 +621,16 @@
 echo 'const TIOCGWINSZ = _TIOCGWINSZ_val' >> ${OUT}
   fi
 fi
+if ! grep '^const TIOCNOTTY' ${OUT} >/dev/null 2>&1; then
+  if grep '^const _TIOCNOTTY_val' ${OUT} >/dev/null 2>&1; then
+echo 'const TIOCNOTTY = _TIOCNOTTY_val' >> ${OUT}
+  fi
+fi
+if ! grep '^const TIOCSCTTY' ${OUT} >/dev/null 2>&1; then
+  if grep '^const _TIOCSCTTY_val' ${OUT} >/dev/null 2>&1; then
+echo 'const TIOCSCTTY = _TIOCSCTTY_val' >> ${OUT}
+  fi
+fi
 
 # The ioctl flags for terminal control
 grep '^const _TC[GS]ET' gen-sysinfo.go | \


[PATCH] Fix for PR52081 - Missed tail merging with pure calls

2012-02-02 Thread Tom de Vries
Richard,

this patch fixes PR52081.

Consider test-case pr51879-12.c:
...
__attribute__((pure)) int bar (int);
__attribute__((pure)) int bar2 (int);
void baz (int);

int x, z;

void
foo (int y)
{
  int a = 0;
  if (y == 6)
{
  a += bar (7);
  a += bar2 (6);
}
  else
{
  a += bar2 (6);
  a += bar (7);
}
  baz (a);
}
...

When compiling at -O2, pr51879-12.c.094t.pre looks like this:
...
  # BLOCK 3 freq:1991
  # PRED: 2 [19.9%]  (true,exec)
  # VUSE <.MEMD.1722_12(D)>
  # USE = nonlocal escaped
  D.1717_4 = barD.1703 (7);
  # VUSE <.MEMD.1722_12(D)>
  # USE = nonlocal escaped
  D.1718_6 = bar2D.1705 (6);
  aD.1713_7 = D.1717_4 + D.1718_6;
  goto ;
  # SUCC: 5 [100.0%]  (fallthru,exec)

  # BLOCK 4 freq:8009
  # PRED: 2 [80.1%]  (false,exec)
  # VUSE <.MEMD.1722_12(D)>
  # USE = nonlocal escaped
  D.1720_8 = bar2D.1705 (6);
  # VUSE <.MEMD.1722_12(D)>
  # USE = nonlocal escaped
  D.1721_10 = barD.1703 (7);
  aD.1713_11 = D.1720_8 + D.1721_10;
  # SUCC: 5 [100.0%]  (fallthru,exec)

  # BLOCK 5 freq:1
  # PRED: 3 [100.0%]  (fallthru,exec) 4 [100.0%]  (fallthru,exec)
  # aD.1713_1 = PHI 
  # .MEMD.1722_13 = VDEF <.MEMD.1722_12(D)>
  # USE = nonlocal
  # CLB = nonlocal
  bazD.1707 (aD.1713_1);
  # VUSE <.MEMD.1722_13>
  return;
...
Blocks 3 and 4 can be tail-merged.

Value numbering numbers the two phi arguments a_7 and a_11 the same so the
problem is not in value numbering:
...
Setting value number of a_11 to a_7 (changed)
...

There are 2 reasons that tail_merge_optimize doesn't optimize this:

1.
The clause
  is_gimple_assign (stmt) && local_def (gimple_get_lhs (stmt))
  && !gimple_has_side_effects (stmt)
used in both same_succ_hash and gsi_advance_bw_nondebug_nonlocal evaluates to
false for pure calls.
This is fixed by replacing is_gimple_assign with gimple_has_lhs.

2.
In same_succ_equal we check gimples from the 2 bbs side-by-side:
...
  gsi1 = gsi_start_nondebug_bb (bb1);
  gsi2 = gsi_start_nondebug_bb (bb2);
  while (!(gsi_end_p (gsi1) || gsi_end_p (gsi2)))
{
  s1 = gsi_stmt (gsi1);
  s2 = gsi_stmt (gsi2);
  if (gimple_code (s1) != gimple_code (s2))
return 0;
  if (is_gimple_call (s1) && !gimple_call_same_target_p (s1, s2))
return 0;
  gsi_next_nondebug (&gsi1);
  gsi_next_nondebug (&gsi2);
}
...
and we'll be comparing 'bar (7)' and 'bar2 (6)', and gimple_call_same_target_p
will return false.
This is fixed by ignoring local defs in this comparison, by using
gsi_advance_fw_nondebug_nonlocal on the iterators.
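
The effect of fix 2 can be modelled outside of GCC roughly as below.  This is only a sketch under simplifying assumptions: statements are reduced to a hypothetical struct demo_stmt, and the final check that both sequences are exhausted is my addition, not something taken from same_succ_equal.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

enum { DEMO_CALL, DEMO_ASSIGN };

/* Hypothetical, heavily simplified model of a statement.  */
struct demo_stmt
{
  int code;		/* DEMO_CALL or DEMO_ASSIGN.  */
  int target;		/* Identifies the callee for calls.  */
  bool local_def;	/* Only defines a locally used value.  */
};

/* Advance I past statements that are local defs.  */
static size_t
skip_local_defs (const struct demo_stmt *s, size_t i, size_t n)
{
  while (i < n && s[i].local_def)
    i++;
  return i;
}

/* Compare two blocks side by side, ignoring local defs.  */
static bool
same_nonlocal_stmts (const struct demo_stmt *a, size_t na,
		     const struct demo_stmt *b, size_t nb)
{
  size_t i = skip_local_defs (a, 0, na);
  size_t j = skip_local_defs (b, 0, nb);

  while (i < na && j < nb)
    {
      if (a[i].code != b[j].code || a[i].target != b[j].target)
	return false;
      i = skip_local_defs (a, i + 1, na);
      j = skip_local_defs (b, j + 1, nb);
    }
  return i == na && j == nb;
}
```

In the test case every statement of blocks 3 and 4 is a local def, so once those are skipped there is nothing left to compare and the differing call order no longer matters.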

bootstrapped and reg-tested on x86_64.

ok for stage1?

Thanks,
- Tom

2012-02-02  Tom de Vries  

* tree-ssa-tail-merge.c (local_def): Move up.
(stmt_local_def): New function, factored out of same_succ_hash.  Use
gimple_has_lhs instead of is_gimple_assign.
(gsi_advance_nondebug_nonlocal): New function, factored out of
gsi_advance_bw_nondebug_nonlocal.  Use stmt_local_def.
(gsi_advance_fw_nondebug_nonlocal): New function.
(gsi_advance_bw_nondebug_nonlocal): Use gsi_advance_nondebug_nonlocal.
Move up.
(same_succ_hash): Use stmt_local_def.
(same_succ_equal): Use gsi_advance_fw_nondebug_nonlocal.

* gcc.dg/pr51879-12.c: New test.
Index: gcc/tree-ssa-tail-merge.c
===
--- gcc/tree-ssa-tail-merge.c (revision 183325)
+++ gcc/tree-ssa-tail-merge.c (working copy)
@@ -269,6 +269,88 @@ struct aux_bb_info
 #define BB_VOP_AT_EXIT(bb) (((struct aux_bb_info *)bb->aux)->vop_at_exit)
 #define BB_DEP_BB(bb) (((struct aux_bb_info *)bb->aux)->dep_bb)
 
+/* Returns whether VAL is used in the same bb as in which it is defined, or
+   in the phi of a successor bb.  */
+
+static bool
+local_def (tree val)
+{
+  gimple stmt, def_stmt;
+  basic_block bb, def_bb;
+  imm_use_iterator iter;
+  bool res;
+
+  if (TREE_CODE (val) != SSA_NAME)
+return false;
+  def_stmt = SSA_NAME_DEF_STMT (val);
+  def_bb = gimple_bb (def_stmt);
+
+  res = true;
+  FOR_EACH_IMM_USE_STMT (stmt, iter, val)
+{
+  bb = gimple_bb (stmt);
+  if (bb == def_bb)
+	continue;
+  if (gimple_code (stmt) == GIMPLE_PHI
+	  && find_edge (def_bb, bb))
+	continue;
+  res = false;
+  BREAK_FROM_IMM_USE_STMT (iter);
+}
+  return res;
+}
+
+/* Returns true if the only effect a statement STMT has, is to define a locally
+   used SSA_NAME.  */
+
+static bool
+stmt_local_def (gimple stmt)
+{
+  if (gimple_has_lhs (stmt)
+  && local_def (gimple_get_lhs (stmt))
+  && !gimple_has_side_effects (stmt))
+return true;
+
+  return false;
+} 
+
+/* Let GSI skip over local defs, forwards if FORWARD, backwards otherwise.  */
+
+static void
+gsi_advance_nondebug_nonlocal (gimple_stmt_iterator *gsi, bool forward)
+{
+  gimple stmt;
+
+  while (true)
+{
+  if (gsi_end_p (*gsi))
+	return;
+  stmt = gsi_stmt (*gsi);
+  if (!stmt_local_def (stmt))
+	return;
+  if (forward)
+	gsi_next_nondeb

Re: [PATCH][RFC] Fix PR48124

2012-02-02 Thread Richard Guenther
On Wed, 1 Feb 2012, Richard Guenther wrote:

> 
> This fixes PR48124 where a bitfield store leaks into adjacent
> decls if the decl we store to has bigger alignment than what
> its type requires (thus there is another decl in that "padding").

[...]

> Bootstrapped on x86_64-unknown-linux-gnu, testing in progress.

So testing didn't go too well (which makes me suspicious about
the adjust_address the strict-volatile-bitfield path does ...).

The following patch instead tries to make us honor mode1 as
maximum sized mode to be used for accesses to the bitfield
(mode1 as that returned from get_inner_reference, so in theory
that would cover the strict-volatile-bitfield cases already).

It does so by passing down that mode to store_fixed_bit_field
and use it as max-mode argument to get_best_mode there.  The
patch also checks that the HAVE_insv paths would not use
bigger stores than that mode (hopefully I've covered all cases
where we would do that).

Currently all bitfields (that are not volatile) get VOIDmode
from get_inner_reference and the patch tries hard to not
change things in that case.  The expr.c hunk contains one
possible fix for 48124 by computing mode1 based on MEM_SIZE
(probably not the best way - any good ideas welcome).
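
For readers unfamiliar with the idiom, the expression MEM_SIZE (to_rtx) & -MEM_SIZE (to_rtx) in the expr.c hunk isolates the lowest set bit of the size, that is, the largest power of two that divides it.  A standalone sketch (the function name is made up):

```c
#include <assert.h>

/* Return the largest power of two dividing SIZE, using the
   "x & -x" lowest-set-bit idiom.  SIZE must be nonzero.  */
static unsigned long
largest_pow2_divisor (unsigned long size)
{
  assert (size != 0);
  return size & -size;
}
```

For an object of 12 bytes this gives 4, so the widest access considered would be 4 bytes rather than 8 or 16.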

The patch should also open up the way for fixing PR52080
(that bitfield-store-clobbers-adjacent-fields bug ...)
by simply making get_inner_reference go the
strict-volatile-bitfield path for all bitfield accesses
(and possibly even the !DECL_BIT_FIELD pieces of it).
Thus, do what people^WLinus would expect for "sane" C
(thus non-mixed base-type bitfields).

I'm looking for comments on both pieces of the patch
(is the strict-volatile-bitfields approach using
adjust-address really correct?  is passing down mode1
as forced maximum-size mode correct, or will it pessimize
code too much?)

Bootstrapped and tested on x86_64-unknown-linux-gnu.

I don't see that we can fix 52080 in full for 4.7 but it
would be nice to at least fix 48124 with something not
too invasive (suggestions for some narrower cases to
adjust mode1 welcome).  Other than MEM_SIZE we could
simply use TYPE_ALIGN if that is less than MEM_ALIGN
and compute a maximum size mode based on that.
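
A sketch of the kind of situation in question, with invented names; whether another object really ends up in the tail padding depends on the target and the linker, so this only shows the shape of the problem, not a reproducer.

```c
#include <assert.h>

struct demo_bits
{
  unsigned lo : 12;
  unsigned hi : 20;
};

/* The object is aligned more strictly than its type requires, so the
   bytes between its end and the next 8-byte boundary are padding that
   may be occupied by another variable.  */
static struct demo_bits demo_obj __attribute__ ((aligned (8)));
static int demo_neighbour = 1;

/* A bitfield store that must not touch anything outside demo_obj.  */
static void
demo_store_hi (unsigned v)
{
  demo_obj.hi = v & 0xfffff;
}

static int
demo_neighbour_value (void)
{
  return demo_neighbour;
}
```

A store wider than the type's own size, chosen because of the larger object alignment, is what would clobber demo_neighbour if it happened to be placed in that padding.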

Thanks,
Richard.

2012-01-02  Richard Guenther  

* expmed.c (store_bit_field_1): Restrict HAVE_insv handling
to cases where we can honor an access of at most fieldmode size.
Pass down fieldmode to store_fixed_bit_field.
(store_fixed_bit_field): Take maximum mode parameter.  Use
it instead of word_mode as maximum mode passed to get_best_mode.
(store_split_bit_field): Call store_fixed_bit_field with
word_mode as maximum mode.

PR middle-end/48124
* expr.c (expand_assignment): Adjust mode1 for bitfields
to not possibly access tail alignment padding of an object.

* gcc.dg/torture/pr48124-1.c: New testcase.
* gcc.dg/torture/pr48124-2.c: Likewise.
* gcc.dg/torture/pr48124-3.c: Likewise.
* gcc.dg/torture/pr48124-4.c: Likewise.

Index: gcc/expr.c
===
--- gcc/expr.c  (revision 183791)
+++ gcc/expr.c  (working copy)
@@ -4705,6 +4705,17 @@ expand_assignment (tree to, tree from, b
to_rtx = adjust_address (to_rtx, mode1, 0);
  else if (GET_MODE (to_rtx) == VOIDmode)
to_rtx = adjust_address (to_rtx, BLKmode, 0);
+ else if (mode1 == VOIDmode
+  && MEM_SIZE_KNOWN_P (to_rtx))
+   {
+ mode1 = mode_for_size ((MEM_SIZE (to_rtx) & -MEM_SIZE (to_rtx))
+* BITS_PER_UNIT, MODE_INT, 0);
+ if (mode1 == BLKmode
+ || GET_MODE_SIZE (mode1) > GET_MODE_SIZE (word_mode)
+ || (GET_MODE_SIZE (mode1)
+ < GET_MODE_SIZE (DECL_MODE (TREE_OPERAND (to, 1)
+   mode1 = VOIDmode;
+   }
}
  
   if (offset != 0)
Index: gcc/expmed.c
===
--- gcc/expmed.c(revision 183791)
+++ gcc/expmed.c(working copy)
@@ -50,7 +50,7 @@ static void store_fixed_bit_field (rtx,
   unsigned HOST_WIDE_INT,
   unsigned HOST_WIDE_INT,
   unsigned HOST_WIDE_INT,
-  rtx);
+  rtx, enum machine_mode);
 static void store_split_bit_field (rtx, unsigned HOST_WIDE_INT,
   unsigned HOST_WIDE_INT,
   unsigned HOST_WIDE_INT,
@@ -632,6 +632,8 @@ store_bit_field_1 (rtx str_rtx, unsigned
   && GET_MODE (value) != BLKmode
   && bitsize > 0
   && GET_MODE_BITSIZE (op_mode) >= bitsize
+  && (fieldmode == VOIDmode
+ || GET_MODE_BITSIZE (fieldmode) >= GET_MODE_BITSIZE (op_mode))
   /* Do not use insv for volatile bitfi

[patch boehm-gc]: Fix PR/48514

2012-02-02 Thread Kai Tietz
Hello,

this patch fixes an issue in the boehm-gc tree currently used in gcc.
The issue is that the _DLL macro is used incorrectly for Windows targets.  It
is assumed to mean that a shared object (DLL) is being built, but it just
indicates that the built DLL/executable uses the shared msvcrt.dll
runtime (see MSDN on the meaning of the _DLL macro).

ChangeLog

* include/gc_config_macros.h (GC_DLL): For mingw targets, define it
only if we are actually in boehm-gc's build and the DLL_EXPORT macro
is defined.

Tested for i686-w64-mingw32, i686-pc-mingw32, and x86_64-w64-mingw32
(including building of libjava with test).  OK to apply?

Regards,
Kai

Index: include/gc_config_macros.h
===
--- include/gc_config_macros.h  (revision 183833)
+++ include/gc_config_macros.h  (working copy)
@@ -81,7 +81,9 @@
 typedef long ptrdiff_t;/* ptrdiff_t is not defined */
 # endif

-#if defined(_DLL) && !defined(GC_NOT_DLL) && !defined(GC_DLL)
+#if ((defined(_DLL) && !defined (__MINGW32__)) \
+ || (defined (DLL_EXPORT) && defined (GC_BUILD))) \
+&& !defined(GC_DLL)
 # define GC_DLL
 #endif


address review comments (issue5610048)

2012-02-02 Thread Dmitriy Vyukov
Index: gcc/doc/invoke.texi
===
--- gcc/doc/invoke.texi (revision 183833)
+++ gcc/doc/invoke.texi (working copy)
@@ -306,6 +306,7 @@
 -fdump-tree-ssa@r{[}-@var{n}@r{]} -fdump-tree-pre@r{[}-@var{n}@r{]} @gol
 -fdump-tree-ccp@r{[}-@var{n}@r{]} -fdump-tree-dce@r{[}-@var{n}@r{]} @gol
 -fdump-tree-gimple@r{[}-raw@r{]} -fdump-tree-mudflap@r{[}-@var{n}@r{]} @gol
+-fdump-tree-tsan@r{[}-@var{n}@r{]} @gol
 -fdump-tree-dom@r{[}-@var{n}@r{]} @gol
 -fdump-tree-dse@r{[}-@var{n}@r{]} @gol
 -fdump-tree-phiprop@r{[}-@var{n}@r{]} @gol
@@ -376,8 +377,8 @@
 -floop-parallelize-all -flto -flto-compression-level
 -flto-partition=@var{alg} -flto-report -fmerge-all-constants @gol
 -fmerge-constants -fmodulo-sched -fmodulo-sched-allow-regmoves @gol
--fmove-loop-invariants fmudflap -fmudflapir -fmudflapth -fno-branch-count-reg @gol
--fno-default-inline @gol
+-fmove-loop-invariants -fmudflap -fmudflapir -fmudflapth -fno-branch-count-reg @gol
+-ftsan -ftsan-ignore -fno-default-inline @gol
 -fno-defer-pop -fno-function-cse -fno-guess-branch-probability @gol
 -fno-inline -fno-math-errno -fno-peephole -fno-peephole2 @gol
 -fno-sched-interblock -fno-sched-spec -fno-signed-zeros @gol
@@ -5812,6 +5813,11 @@
 Dump each function after adding mudflap instrumentation.  The file name is
 made by appending @file{.mudflap} to the source file name.
 
+@item tsan
+@opindex fdump-tree-tsan
+Dump each function after adding ThreadSanitizer instrumentation.  The file name is
+made by appending @file{.tsan} to the source file name.
+
 @item sra
 @opindex fdump-tree-sra
 Dump each function after performing scalar replacement of aggregates.  The
@@ -6589,6 +6595,12 @@
 some protection against outright memory corrupting writes, but allows
 erroneously read data to propagate within a program.
 
+@item -ftsan -ftsan-ignore
+@opindex ftsan
+@opindex ftsan-ignore
+Add ThreadSanitizer instrumentation. Use @option{-ftsan-ignore} to specify
+an ignore file. Refer to http://go/tsan for details.
+
 @item -fthread-jumps
 @opindex fthread-jumps
 Perform optimizations where we check to see if a jump branches to a
Index: gcc/tree-tsan.c
===
--- gcc/tree-tsan.c (revision 0)
+++ gcc/tree-tsan.c (revision 0)
@@ -0,0 +1,1109 @@
+/* ThreadSanitizer, a data race detector.
+   Copyright (C) 2011 Free Software Foundation, Inc.
+   Contributed by Dmitry Vyukov 
+
+This file is part of GCC.
+
+GCC is free software; you can redistribute it and/or modify
+it under the terms of the GNU General Public License as published by
+the Free Software Foundation; either version 3, or (at your option)
+any later version.
+
+GCC is distributed in the hope that it will be useful,
+but WITHOUT ANY WARRANTY; without even the implied warranty of
+MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+GNU General Public License for more details.
+
+You should have received a copy of the GNU General Public License
+along with GCC; see the file COPYING3.  If not see
+.  */
+
+#include "config.h"
+#include "system.h"
+#include "coretypes.h"
+#include "tree.h"
+#include "intl.h"
+#include "tm.h"
+#include "basic-block.h"
+#include "gimple.h"
+#include "function.h"
+#include "tree-flow.h"
+#include "tree-pass.h"
+#include "tree-tsan.h"
+#include "tree-iterator.h"
+#include "cfghooks.h"
+#include "langhooks.h"
+#include "output.h"
+#include "options.h"
+#include "target.h"
+#include "cgraph.h"
+#include "diagnostic.h"
+
+/* ThreadSanitizer is a data race detector for C/C++ programs.
+   http://code.google.com/p/data-race-test/wiki/ThreadSanitizer
+
+   The tool consists of two parts:
+   instrumentation module (this file) and a run-time library.
+   The instrumentation module maintains shadow call stacks
+   and intercepts interesting memory accesses.
+   The instrumentation is enabled with -ftsan flag.
+
+   Instrumentation for shadow stack maintenance is as follows:
+   void somefunc ()
+   {
+ __tsan_shadow_stack [-1] = __builtin_return_address (0);
+ __tsan_shadow_stack++;
+ // function body
+ __tsan_shadow_stack--;
+   }
+
+   Interception for memory access interception is as follows:
+   *addr = 1;
+   __tsan_handle_mop (addr, flags);
+   where flags are (is_sblock | (is_store << 1) | ((sizeof (*addr) - 1) << 2).
+   is_sblock is used merely for optimization purposes and can always
+   be set to 1, see comments in instrument_mops function.
+
+   Ignore files can be used to selectively non instrument some functions.
+   Ignore file is specified with -ftsan-ignore=filename flag.
+   There are 3 types of ignores: (1) do not instrument memory accesses
+   in the function, (2) do not create sblocks in the function
+   and (3) recursively ignore memory accesses in the function.
+   That last ignore type requires additional instrumentation of the form:
+   void somefunc ()
+   {
+ __tsan_thread_ignore++;
+ //

Re: [PATCH][RFC] Fix PR48124

2012-02-02 Thread Richard Guenther
On Thu, 2 Feb 2012, Richard Guenther wrote:

> On Wed, 1 Feb 2012, Richard Guenther wrote:
> 
> > 
> > This fixes PR48124 where a bitfield store leaks into adjacent
> > decls if the decl we store to has bigger alignment than what
> > its type requires (thus there is another decl in that "padding").
> 
> [...]
> 
> > Bootstrapped on x86_64-unknown-linux-gnu, testing in progress.
> 
> So testing didn't go too well (which makes me suspicious about
> the adjust_address the strict-volatile-bitfield path does ...).
> 
> The following patch instead tries to make us honor mode1 as
> maximum sized mode to be used for accesses to the bitfield
> (mode1 as that returned from get_inner_reference, so in theory
> that would cover the strict-volatile-bitfield cases already).
> 
> It does so by passing down that mode to store_fixed_bit_field
> and use it as max-mode argument to get_best_mode there.  The
> patch also checks that the HAVE_insv paths would not use
> bigger stores than that mode (hopefully I've covered all cases
> where we would do that).
> 
> Currently all bitfields (that are not volatile) get VOIDmode
> from get_inner_reference and the patch tries hard to not
> change things in that case.  The expr.c hunk contains one
> possible fix for 48124 by computing mode1 based on MEM_SIZE
> (probably not the best way - any good ideas welcome).
> 
> The patch should also open up the way for fixing PR52080
> (that bitfield-store-clobbers-adjacent-fields bug ...)
> by simply making get_inner_reference go the
> strict-volatile-bitfield path for all bitfield accesses
> (and possibly even the !DECL_BIT_FIELD pieces of it).
> Thus, do what people^WLinus would expect for "sane" C
> (thus non-mixed base-type bitfields).
> 
> I'm looking for comments on both pieces of the patch
> (is the strict-volatile-bitfields approach using
> adjust-address really correct?  is passing down mode1
> as forced maximum-size mode correct, or will it pessimize
> code too much?)
> 
> Bootstrapped and tested on x86_64-unknown-linux-gnu.
> 
> I don't see that we can fix 52080 in full for 4.7 but it
> would be nice to at least fix 48124 with something not
> too invasive (suggestions for some narrower cases to
> adjust mode1 welcome).  Other than MEM_SIZE we could
> simply use TYPE_ALIGN if that is less than MEM_ALIGN
> and compute a maximum size mode based on that.

The following variant also bootstrapped and tested ok.

Richard.


Index: gcc/expr.c
===
--- gcc/expr.c  (revision 183833)
+++ gcc/expr.c  (working copy)
@@ -4705,6 +4705,23 @@ expand_assignment (tree to, tree from, b
to_rtx = adjust_address (to_rtx, mode1, 0);
  else if (GET_MODE (to_rtx) == VOIDmode)
to_rtx = adjust_address (to_rtx, BLKmode, 0);
+ /* If we have a bitfield access and the alignment of the
+accessed object is larger than what its type would require
+restrict the mode we use for accesses to avoid touching
+the tail alignment-padding.  See PR48124.  */
+ else if (mode1 == VOIDmode
+  && TREE_CODE (to) == COMPONENT_REF
+  && TYPE_ALIGN (TREE_TYPE (tem)) < MEM_ALIGN (to_rtx))
+   {
+ mode1 = mode_for_size (TYPE_ALIGN (TREE_TYPE (tem)), MODE_INT, 1);
+ if (mode1 == BLKmode
+ /* Not larger than word_mode.  */
+ || GET_MODE_SIZE (mode1) > GET_MODE_SIZE (word_mode)
+ /* Nor smaller than the fields mode.  */
+ || (GET_MODE_SIZE (mode1)
+ < GET_MODE_SIZE (DECL_MODE (TREE_OPERAND (to, 1)
+   mode1 = VOIDmode;
+   }
}
  
   if (offset != 0)



Cyclic aliases hang GCC

2012-02-02 Thread Jan Hubicka
Hi,
this patch adds logic to detect cycles in cgraph and varpool aliases.
Previously we just output them into the asm file and let gas complain, but now
we get caught in an infinite loop.  I assumed that this was already tested in
varasm, but it is not.
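
The check amounts to walking the alias chain from the target and seeing whether it leads back to the starting node.  A standalone sketch, with a hypothetical struct demo_node and an explicit hop limit added so that the walk also terminates on cycles that do not include the starting node (the patch itself stops at nodes that are not yet analyzed instead):

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a cgraph or varpool node.  */
struct demo_node
{
  struct demo_node *alias_of;	/* NULL if not an alias.  */
};

/* Return true if following the alias chain from NODE leads back to
   NODE within MAX_HOPS steps.  */
static bool
part_of_alias_cycle (const struct demo_node *node, size_t max_hops)
{
  const struct demo_node *n;
  size_t hops;

  assert (node != NULL);
  n = node->alias_of;
  for (hops = 0; n != NULL && hops < max_hops; hops++)
    {
      if (n == node)
	return true;
      n = n->alias_of;
    }
  return false;
}
```

This covers both new test cases: f aliasing itself, and f and g aliasing each other.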

Bootstrapped/regtested x86_64-linux, will commit it shortly.
Jan Hubicka  
Tom de Vries  
PR middle-end/51998
* cgraphunit.c (cgraph_analyze_function): Break cyclic aliases.
* varpool.c (varpool_analyze_pending_decls): Likewise.

* testsuite/gcc.dg/alias-12.c: New testcase.
* testsuite/gcc.dg/alias-13.c: New testcase.
Index: cgraphunit.c
===
--- cgraphunit.c(revision 183757)
+++ cgraphunit.c(working copy)
@@ -836,6 +836,16 @@ cgraph_analyze_function (struct cgraph_n
   if (node->alias && node->thunk.alias)
 {
   struct cgraph_node *tgt = cgraph_get_node (node->thunk.alias);
+  struct cgraph_node *n;
+
+  for (n = tgt; n && n->alias;
+  n = n->analyzed ? cgraph_alias_aliased_node (n) : NULL)
+   if (n == node)
+ {
+   error ("function %q+D part of alias cycle", node->decl);
+   node->alias = false;
+   return;
+ }
   if (!VEC_length (ipa_ref_t, node->ref_list.references))
 ipa_record_reference (node, NULL, tgt, NULL, IPA_REF_ALIAS, NULL);
   if (node->same_body_alias)
Index: testsuite/gcc.dg/alias-12.c
===
--- testsuite/gcc.dg/alias-12.c (revision 0)
+++ testsuite/gcc.dg/alias-12.c (revision 0)
@@ -0,0 +1,9 @@
+/* { dg-do compile } */
+/* { dg-require-alias "" } */
+/* { dg-options "-O2" } */
+static void f (void) __attribute__((alias("f"))); // { dg-error "part of alias cycle" "" }
+
+void g ()
+{
+  f ();
+}
Index: testsuite/gcc.dg/alias-13.c
===
--- testsuite/gcc.dg/alias-13.c (revision 0)
+++ testsuite/gcc.dg/alias-13.c (revision 0)
@@ -0,0 +1,9 @@
+/* { dg-do compile } */
+/* { dg-require-alias "" } */
+/* { dg-options "-O2" } */
+static void f (void) __attribute__((alias("g"))); static void g (void) __attribute__((alias("f"))); // { dg-error "part of alias cycle" "" }
+
+void h ()
+{
+  f ();
+}
Index: varpool.c
===
--- varpool.c   (revision 183757)
+++ varpool.c   (working copy)
@@ -477,6 +477,16 @@ varpool_analyze_pending_decls (void)
   if (node->alias && node->alias_of)
{
  struct varpool_node *tgt = varpool_node (node->alias_of);
+  struct varpool_node *n;
+
+ for (n = tgt; n && n->alias;
+  n = n->analyzed ? varpool_alias_aliased_node (n) : NULL)
+   if (n == node)
+ {
+   error ("variable %q+D part of alias cycle", node->decl);
+   node->alias = false;
+   continue;
+ }
  if (!VEC_length (ipa_ref_t, node->ref_list.references))
ipa_record_reference (NULL, node, NULL, tgt, IPA_REF_ALIAS, NULL);
  /* C++ FE sometimes change linkage flags after producing same body aliases.  */


Re: [google] Backport ThreadSanitizer instrumentation pass from google/main to google/gcc-4_6 (issue 5610048)

2012-02-02 Thread dvyukov

On 2012/02/01 18:05:15, Diego Novillo wrote:

On 2/1/12 3:16 AM, Dmitriy Vyukov wrote:
> This is for google/gcc-4_6 branch.
> Backport ThreadSanitizer (tsan) instrumentation pass from google/main.


Have you used the validator script to test this patch?  Your patch
should not affect regular builds, but you want to make sure that the
tsan tests don't produce failures in any configuration.


Hi Diego,

Yes, I've tested it with the script with crosstool tester.

PTAL, I've addressed all comments except 2:


> +
> +#include
> +#include



These includes are not needed.  They are handled by system.h (GCC defers
all/most system includes to these files).




> +  /* Check if a user has defined it for testing.  */
> +  id = get_identifier (name);
> +  var = varpool_node_for_asm (id);
> +  if (var != NULL)
> +{
> +  decl = var->decl;
> +  gcc_assert (TREE_CODE (decl) == VAR_DECL);
> +  return decl;
> +}
> +
> +  decl = build_decl (UNKNOWN_LOCATION, VAR_DECL, id, typ);



It would be slightly more useful to give this decl a known location.  How
about passing it in from the caller?  DECL_SOURCE_LOCATION
(current_function_decl) may work when you have no other location (like a
statement or another symbol).

At worst, you pass UNKNOWN_LOCATION when you don't even have that in the
caller.


Here I create a declaration for a var that is defined in our run-time
library. If I use some real location, then the declaration will have
different irrelevant locations in each TU (irrelevant, because it will
be somewhere near the beginning of the first instrumented function), plus
there will be the definition with a correct location (inside of our run-time
library). Does it make sense? The current situation (a lot of
declarations with unknown location + definition with correct location)
looks OK IMVHO.




> +get_thread_ignore_decl (void)
> +{
> +  static tree decl;
> +
> +  if (decl == NULL)
> +decl = build_var_decl (integer_type_node, TSAN_IGNORE);
> +  return decl;
> +}
> +
> +/* Returns a definition of a runtime functione with type TYP and name NAME.  */



s/functione/function/



> +  static tree decl;
> +
> +  if (decl != NULL)
> +return decl;
> +
> +  typ = build_function_type_list (void_type_node, NULL_TREE);
> +  decl = build_func_decl (typ, TSAN_INIT);
> +  return decl;
> +}
> +
> +/* Adds new ignore definition to the global list.
> +   TYPE is the ignore type (see tsan_ignore_type).



The code uses TYPE and TYP.  Please change to always use TYPE.



> +}
> +
> +/* Checks as to whether identifier STR matches template TEMPL.
> +   Templates can only contain '*', e.g. 'std*string*insert'.
> +   Templates implicitly start and end with '*'
> +   since they are matched against mangled names.
> +   Returns non-zero if STR is matched against TEMPL.  */
> +
> +static int



Use 'bool' instead of 'int'.
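
For reference, the matching rule described in the quoted comment (only '*'
wildcards, with an implicit '*' at both ends) could be sketched as below.
This is not the code from the patch: the name match_templ_sketch is
hypothetical, and the greedy leftmost search is just one straightforward way
to get those semantics while returning bool as requested.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Hypothetical sketch, not the function from the patch.  Returns true if
   STR matches TEMPL, where TEMPL may contain '*' and is treated as if it
   started and ended with '*'.  */
static bool
match_templ_sketch (const char *str, const char *templ)
{
  assert (str != NULL && templ != NULL);

  while (*templ != '\0')
    {
      size_t len;
      const char *hit;

      /* A '*' matches any run of characters, so just step over it.  */
      if (*templ == '*')
        {
          templ++;
          continue;
        }

      /* Length of the literal segment up to the next '*' or the end.  */
      len = strcspn (templ, "*");

      /* Find the leftmost occurrence of that segment in the rest of STR.  */
      hit = NULL;
      for (const char *p = str; *p != '\0'; p++)
        if (strncmp (p, templ, len) == 0)
          {
            hit = p;
            break;
          }
      if (hit == NULL)
        return false;

      /* Continue after the matched segment.  */
      str = hit + len;
      templ += len;
    }
  return true;
}
```

Under these assumptions, "std*string*insert" matches any name in which
"std", "string" and "insert" appear in that order, and an empty template
matches everything.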



> +  char buf [PATH_MAX];
> +
> +  if(getenv("GCCTSAN_PAUSE"))
> +{
> +  int res;
> +  printf("ATTACH A DEBUGGER AND PRESS ENTER\n");
> +  res = scanf("%s", buf);
> +  (void)res;
> +}



No debugging getenv(), please.  Either use a flag or remove this code.



> +}
> +}
> +  if (f == NULL)
> +{
> +  error ("failed to open ignore file '%s'\n", flag_tsan_ignore);


This should be a fatal_error, not error (error is for problems with the
input code, not the flags).



> +  if (do_dec)
> +{
> +  size_val = -size_val;
> +  size_valhi = -1;
> +}
> +  op_size = build_int_cst_wide (sizetype, size_val, size_valhi);
> +  sstack_decl = get_shadow_stack_decl ();
> +  op_expr = build2 (POINTER_PLUS_EXPR, ptr_type_node, sstack_decl, op_size);


fold_build2_loc()  (likewise in other places).  Get the location from
the sequence you are given.


I set the location for each inserted gimple statement separately in:
static void
set_location (gimple_seq seq, location_t loc)
{
  gimple_seq_node n;

  for (n = gimple_seq_first (seq); n != NULL; n = n->next)
    gimple_set_location (n->stmt, loc);
}
It works after the fact: I obtain the caller's pc from the runtime function
and then look up the debug info for that pc.  I have never seen it give bad
results.

Does it make sense to set location for all tree's? I can't extract
location from the sequence, because it's NULL (however, of course, I
still can pass all the way down).





> +  addr_expr = force_gimple_operand (expr_ptr, gseq, true, NULL_TREE);

> +  expr_type = TREE_TYPE (expr);
> +  while (TREE_CODE (expr_type) == ARRAY_TYPE)
> +expr_type = TREE_TYPE (expr_type);
> +  expr_size = TYPE_SIZE (expr_type);
> +  size = tree_to_double_int (expr_size);
> +  gcc_assert (size.high == 0&&  size.low != 0);
> +  if (size.low>  128)
> +size.low = 128;



Could you add a comment here?  I'm not sure what this 128 means.



> +/* Checks as to whether EXPR refers to constant var/field/param.
> +   Don't bother to instrument them.  */
> +
> +static int
> +is_load_of_const (tree expr, int is_store)
> +{



s/int/bool/



> +static void
> +handle_gimpl

[Patch, fortran] Fix ICE with class array references.

2012-02-02 Thread Mikael Morin

Hello,

this patch, extracted with some modifications from PR50981 comment #28
[http://gcc.gnu.org/bugzilla/show_bug.cgi?id=50981#c28],
(which has accumulated a lot of things) fixes an ICE noticed in several
PRs with an error like:

internal compiler error: in gfc_conv_descriptor_data_get,
at fortran/trans-array.c:147

internal compiler error: in gfc_conv_descriptor_offset, at
fortran/trans-array.c:210

the problem is a missing "_data" reference (to escape the class 
container) when trying to access a subobject of a class object.


The solution proposed is to replace the call to 
gfc_add_component_ref(expr, "_data") with a call to a new, more general, 
function gfc_fix_class_refs which takes care of adding the "_data" 
component in all references (not only the last one) where it is missing.

Thus, it works
- in the scalar case: class%array_comp(1), class%scalar_comp
- with multiple levels of components: class%comp%subclass%sub_comp
- in the array case (but this was working before): class%array_comp(:)
- in any mix of the above cases.

I have chosen to make it a separate function instead of fixing 
gfc_add_component_ref, so that it can be reused later (maybe...) even if 
we don't want to add a "_data", or "_vptr" or ... component.
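
The chain walk described above can be illustrated with a much simplified
model.  Everything in this sketch is hypothetical (struct ref_sketch and both
functions are stand-ins, not gfortran's gfc_ref or the functions in the
patch), it only models component references, and it leaves out the array
reference case and the class_41.f03 special case:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical, much simplified stand-in for a component reference.  */
struct ref_sketch
{
  const char *name;        /* Component being accessed.  */
  bool result_is_class;    /* Does this component yield a class container?  */
  struct ref_sketch *next; /* Next reference in the chain.  */
};

/* Allocate one node and link it in front of NEXT.  */
static struct ref_sketch *
new_ref_sketch (const char *name, bool result_is_class,
                struct ref_sketch *next)
{
  struct ref_sketch *r = malloc (sizeof *r);
  assert (r != NULL);
  r->name = name;
  r->result_is_class = result_is_class;
  r->next = next;
  return r;
}

/* Walk the whole chain and insert a "_data" node wherever a class container
   is accessed through a component that is not one of its own '_' fields.
   BASE_IS_CLASS says whether the variable at the head of the chain is a
   class container.  */
static void
fix_class_refs_sketch (bool base_is_class, struct ref_sketch **ref)
{
  for (; *ref != NULL; ref = &(*ref)->next)
    {
      if (base_is_class && (*ref)->name[0] != '_')
        *ref = new_ref_sketch ("_data", false, *ref);
      /* The next reference applies to whatever this one yields.  */
      base_is_class = (*ref)->result_is_class;
    }
}
```

With a class variable and the chain comp -> sub_comp, where comp itself
yields a class container, the walk produces _data -> comp -> _data ->
sub_comp, i.e. the missing component is added at every level and not only at
the end.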



W.R.T. the code itself, I think it is rather straightforward.  There is 
an odd thing to prevent a regression in class_41.f03.  See the big 
comment in class_data_ref_missing.



Regression tested on x86_64-unknown-freebsd9.0.  OK for trunk?

Mikael



2012-02-02  Mikael Morin  

PR fortran/41587
PR fortran/46356
PR fortran/51754
PR fortran/50981
* class.c (insert_component_ref, class_data_ref_missing,
gfc_fix_class_refs): New functions.
* gfortran.h (gfc_fix_class_refs): New prototype.
* trans-expr.c (gfc_conv_expr): Remove special case handling and call
gfc_fix_class_refs instead.

diff --git a/class.c b/class.c
index 52c5a61..24e06d2 100644
--- a/class.c
+++ b/class.c
@@ -52,6 +52,129 @@ along with GCC; see the file COPYING3.  If not see
 #include "constructor.h"
 
 
+/* Inserts a derived type component reference in a data reference chain.
+TS: base type of the ref chain so far, in which we will pick the component
+REF: the address of the GFC_REF pointer to update
+NAME: name of the component to insert
+   Note that component insertion makes sense only if we are at the end of
+   the chain (*REF == NULL) or if we are adding a missing "_data" component
+   to access the actual contents of a class object.  */
+
+static void
+insert_component_ref (gfc_typespec *ts, gfc_ref **ref, const char * const name)
+{
+  gfc_symbol *type_sym;
+  gfc_ref *new_ref;
+
+  gcc_assert (ts->type == BT_DERIVED || ts->type == BT_CLASS);
+  type_sym = ts->u.derived;
+
+  new_ref = gfc_get_ref ();
+  new_ref->type = REF_COMPONENT;
+  new_ref->next = *ref;
+  new_ref->u.c.sym = type_sym;
+  new_ref->u.c.component = gfc_find_component (type_sym, name, true, true);
+  gcc_assert (new_ref->u.c.component);
+
+  if (new_ref->next)
+{
+  gfc_ref *next = NULL;
+
+  /* We need to update the base type in the trailing reference chain to
+that of the new component.  */
+
+  gcc_assert (strcmp (name, "_data") == 0);
+
+  if (new_ref->next->type == REF_COMPONENT)
+   next = new_ref->next;
+  else if (new_ref->next->type == REF_ARRAY
+  && new_ref->next->next
+  && new_ref->next->next->type == REF_COMPONENT)
+   next = new_ref->next->next;
+
+  if (next != NULL)
+   {
+ gcc_assert (new_ref->u.c.component->ts.type == BT_CLASS
+ || new_ref->u.c.component->ts.type == BT_DERIVED);
+ next->u.c.sym = new_ref->u.c.component->ts.u.derived;
+   }
+}
+
+  *ref = new_ref;
+}
+
+
+/* Tells whether we need to add a "_data" reference to access REF subobject
+   from an object of type TS.  If FIRST_REF_IN_CHAIN is set, then the base
+   object accessed by REF is a variable; in other words it is a full object,
+   not a subobject.  */
+
+static bool
+class_data_ref_missing (gfc_typespec *ts, gfc_ref *ref, bool first_ref_in_chain)
+{
+  /* Only class containers may need the "_data" reference.  */
+  if (ts->type != BT_CLASS)
+return false;
+
+  /* Accessing a class container with an array reference is certainly wrong.  */
+  if (ref->type != REF_COMPONENT)
+return true;
+
+  /* Accessing the class container's fields is fine.  */
+  if (ref->u.c.component->name[0] == '_')
+return false;
+
+  /* At this point we have a class container with a non class container's field
+ component reference.  We don't want to add the "_data" component if we are
+ at the first reference and the symbol's type is an extended derived type.
+ In that case, conv_parent_component_references will do the right thing so
+ it is not absolutely necessary.  Omitting it prevents a regression (see
+ class_41.f03) in th

Re: [Patch, fortran] Fix ICE with class array references.

2012-02-02 Thread Paul Richard Thomas
Dear Mikael,

This...

> I have chosen to make it a separate function instead of fixing
> gfc_add_component_ref, so that it can be reused later (maybe...) even if we
> don't want to add a "_data", or "_vptr" or ... component.

...is exactly what I had a mind to do, once clear of regression
fixing.  Indeed, it is exactly in the spirit of the comment that you
have now eliminated in trans-expr.c

> W.R.T. the code itself, I think it is rather straightforward.  There is an
> odd thing to prevent a regression in class_41.f03.  See the big comment in
> class_data_ref_missing.
>
>
> Regression tested on x86_64-unknown-freebsd9.0.  OK for trunk?

Yes, indeed! OK for trunk.

Many thanks for the patch.  This makes for a rather complete
implementation of OOP.  I guess that the next step should be the
implementation of unlimited polymorphic objects.

Cheers

Paul


Re: [wwwdocs] Add section on diagnostics conventions

2012-02-02 Thread Gabriel Dos Reis
On Wed, Feb 1, 2012 at 9:07 PM, Diego Novillo  wrote:
>
> Thanks.  I've incorporated your feedback in the attached patch.  OK for
> wwwdocs?
yes; thanks!

-- Gaby


Re: [google] Backport ThreadSanitizer instrumentation pass from google/main to google/gcc-4_6 (issue 5610048)

2012-02-02 Thread Diego Novillo
On Thu, Feb 2, 2012 at 06:01,   wrote:

> Here I create a declaration for a var that is defined in our run-time
> library. If I use some real location, then the declaration will have
> different irrelevant locations in each TU (irrelevant, because it will
> be somewhere near begin of a first instrumented function), + there will
> be the definition with a correct location (inside of our run-time
> library). Does it make sense? The current situation (a lot of
> declarations with unknown location + definition with correct location)
> looks OK IMVHO.

Ah, OK.  In that case, that's fine.

> Does it make sense to set location for all tree's? I can't extract
> location from the sequence, because it's NULL (however, of course, I
> still can pass all the way down).

Just the statements.  I had missed the location setting you do at the
end.  That is fine.


Thanks.  Diego.


Re: Too much memory in chan/select2.go used

2012-02-02 Thread Uros Bizjak
On Wed, Feb 1, 2012 at 10:59 PM, Uros Bizjak  wrote:
> On Wed, Feb 1, 2012 at 10:27 PM, Ian Lance Taylor  wrote:
>
>>> (BTW: Do you have any idea on what to do with excessive memory usage
>>> in chan/select2.go? )
>>
>> At this point I don't.  It's sort of peculiar.  Sending an int on a
>> channel should not use any memory.  The test is careful to only measure
>> the memory allocated for sending and receiving, and as far as I can see
>> nothing else should be allocated during the test.  You reported that the
>> test was allocating 2098576 bytes.  When I run it I see it allocating
>> 1408 bytes on x86_64, 640 bytes on i386.  2098576 is much larger than
>> either number.  What is allocating that memory?
>>
>> In other words, there appears to be a real bug here.  You can probably
>> track it down by setting a breakpoint on runtime_mallocgc after the line
>>        runtime.MemStats.Alloc = 0
>> What is calling runtime_mallocgc?

Some more debugging reveals the difference between alpha and x86_64.
Alpha does not implement split stacks, so soon after
"runtime.MemStats.Alloc = 0" line, we allocate exactly 2MB fake stack
via:

Breakpoint 5, runtime_mallocgc (size=2097152, flag=6, dogc=0,
zeroed=0) at ../../../gcc-svn/trunk/libgo/runtime/malloc.goc:41
41  m = runtime_m();
(gdb) bt
#0  runtime_mallocgc (size=2097152, flag=6, dogc=0, zeroed=0) at
../../../gcc-svn/trunk/libgo/runtime/malloc.goc:41
#1  0x0250d4b0 in runtime_malg (stacksize=,
ret_stack=0xf840205f70, ret_stacksize=0xf840205f68)
at ../../../gcc-svn/trunk/libgo/runtime/proc.c:1166
#2  0x0250e3b8 in __go_go (fn=0x1200016b0 ,
arg=0xf841f0) at ../../../gcc-svn/trunk/libgo/runtime/proc.c:1218
#3  0x000120001968 in main.main () at select2.go:46
#4  0x0250e800 in runtime_main () at
../../../gcc-svn/trunk/libgo/runtime/proc.c:371

So, short of XFAILing the test on non-split stack targets, I have no
other idea how to handle this testcase failure.
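
As a quick sanity check of the numbers in this thread: subtracting the
2097152-byte stack allocation seen in the backtrace from the 2098576 bytes
reported earlier leaves 1424 bytes, which is close to the 1408 bytes measured
on x86_64.  A throwaway sketch of that arithmetic (the helper name is made
up):

```c
#include <assert.h>

/* Hypothetical helper: bytes left once the fake stack is discounted.  */
static long
alloc_without_stack_sketch (long total_bytes, long stack_bytes)
{
  assert (total_bytes >= stack_bytes);
  return total_bytes - stack_bytes;
}
```

Calling alloc_without_stack_sketch (2098576, 2097152) gives 1424.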

Uros.


Re: Out-of-order update of new_spill_reg_store[]

2012-02-02 Thread Ulrich Weigand
Richard Sandiford wrote:
> "Ulrich Weigand"  writes:
> > Richard Sandiford wrote:
> >>  Either way, the idea is that new_spill_reg_store[R] is only valid
> >>  for reload registers R that reach the end of the reload sequence.
> >>  We really should check that H1 reaches the end before using
> >>  new_spill_reg_store[H1].
> >
> > I'm a bit confused why it is necessary to check that the register reaches
> > the end of the reload sequence *both* when *setting* new_spill_reg_store,
> > and again when *using* the contents of new_spill_reg_store ...  Wouldn't
> > the latter be redundant if we already guarantee the former?
> 
> The idea was to handle the case where we use the same register for
> two reloads.  One of them reaches the end but the other one doesn't.

Ah, now I get it.  Thanks for the explanation!

> > In any case, it seems strange to use some random SET_SRC register for
> > spill_reg_store purposes.  I'd have expected this array to only track
> > spill registers.  In fact, the comment says:
> >   /* If this is an optional reload, try to find the source reg
> >  from an input reload.  */
> > So at least the comment seems to indicate that only case [B] was ever
> > intended to be handled here, not case [A].
> 
> In that case though, I don't really see what the src_reg assignment:
> 
> if (set && SET_DEST (set) == rld[r].out)
>   {
> int k;
> 
> src_reg = SET_SRC (set);
> 
> is doing, since it is only used if we can't find an input reload.

Oh, I agree -- I was just pointing out that the code doesn't match the
comment here.  I'd assume the intent was to catch even more options for
reload inheritance; I'm just not really sure doing it this way is safe.
(I'm also not sure how much this actually buys us; it would be interesting
to experiment with disabling this code path and see what if any differences
in code generation on various platforms result ...)

> > Maybe the conservative fix would be to not handle case [A], and neither
> > case [B] when the register does not reach the end.  Not sure if this
> > would show up as performance regression somewhere, but it seems unlikely
> > to me.
> 
> I'd hope so too, but since that first src_reg was presumably added for a
> reason, I'm a bit reluctant to do something so daring at this stage. :-)

Agreed, this looks like something for later.

> block.  Since that part wasn't really directly related to the bug
> I was trying to fix, more of a misguided attempt to be complete,
> my preference would be to go for the second patch, which leaves
> the block untouched.

Makes sense to me.

I'd still like to give Bernd a chance to weigh in (since he already looked
at the patch before), but if he doesn't within the next couple of days,
your (second) patch is OK for mainline.

Thanks,
Ulrich

-- 
  Dr. Ulrich Weigand
  GNU Toolchain for Linux on System z and Cell BE
  ulrich.weig...@de.ibm.com



Re: [Patch, fortran] PR52012 - [4.6/4.7 Regression] Wrong-code with realloc on assignment and RESHAPE w/ ORDER=

2012-02-02 Thread Paul Richard Thomas
Dear Tobias,

Following our exchanges with Dominique, I think that the attached
patch will have to do for now.

Bootstrapped and regtested on FC9/x86_64 - OK for trunk?

Cheers

Paul

2012-02-02  Paul Thomas  

PR fortran/52012
* trans-expr.c (fcncall_realloc_result): If variable shape is
correct, retain the bounds, whatever they are.

2012-02-02  Paul Thomas  

PR fortran/52012
* gfortran.dg/realloc_on_assign_11.f90: New test.
Index: gcc/fortran/trans-expr.c
===
*** gcc/fortran/trans-expr.c	(revision 183757)
--- gcc/fortran/trans-expr.c	(working copy)
*** realloc_lhs_loop_for_fcn_call (gfc_se *s
*** 6276,6282 
  }
  
  
! /* For Assignment to a reallocatable lhs from intrinsic functions,
 replace the se.expr (ie. the result) with a temporary descriptor.
 Null the data field so that the library allocates space for the
 result. Free the data of the original descriptor after the function,
--- 6276,6282 
  }
  
  
! /* For assignment to a reallocatable lhs from intrinsic functions,
 replace the se.expr (ie. the result) with a temporary descriptor.
 Null the data field so that the library allocates space for the
 result. Free the data of the original descriptor after the function,
*** fcncall_realloc_result (gfc_se *se, int 
*** 6290,6333 
tree res_desc;
tree tmp;
tree offset;
int n;
  
/* Use the allocation done by the library.  Substitute the lhs
   descriptor with a copy, whose data field is nulled.*/
desc = build_fold_indirect_ref_loc (input_location, se->expr);
/* Unallocated, the descriptor does not have a dtype.  */
tmp = gfc_conv_descriptor_dtype (desc);
gfc_add_modify (&se->pre, tmp, gfc_get_dtype (TREE_TYPE (desc)));
res_desc = gfc_evaluate_now (desc, &se->pre);
gfc_conv_descriptor_data_set (&se->pre, res_desc, null_pointer_node);
se->expr = gfc_build_addr_expr (TREE_TYPE (se->expr), res_desc);
  
!   /* Free the lhs after the function call and copy the result to
   the lhs descriptor.  */
tmp = gfc_conv_descriptor_data_get (desc);
tmp = gfc_call_free (fold_convert (pvoid_type_node, tmp));
gfc_add_expr_to_block (&se->post, tmp);
-   gfc_add_modify (&se->post, desc, res_desc);
  
!   offset = gfc_index_zero_node;
  
!   /* Now reset the bounds from zero based to unity based and set the
!  offset accordingly.  */
for (n = 0 ; n < rank; n++)
  {
!   tmp = gfc_conv_descriptor_ubound_get (desc, gfc_rank_cst[n]);
tmp = fold_build2_loc (input_location, PLUS_EXPR,
! 			 gfc_array_index_type,
! 			 tmp, gfc_index_one_node);
gfc_conv_descriptor_lbound_set (&se->post, desc,
!   gfc_rank_cst[n],
!   gfc_index_one_node);
gfc_conv_descriptor_ubound_set (&se->post, desc,
    gfc_rank_cst[n], tmp);
  
!   /* Accumulate the offset.  Since all lbounds are unity, offset
! 	 is just minus the sum of the strides.  */
tmp = gfc_conv_descriptor_stride_get (desc, gfc_rank_cst[n]);
offset = fold_build2_loc (input_location, MINUS_EXPR,
  gfc_array_index_type,
  offset, tmp);
--- 6290,6377 
tree res_desc;
tree tmp;
tree offset;
+   tree zero_cond;
int n;
  
/* Use the allocation done by the library.  Substitute the lhs
   descriptor with a copy, whose data field is nulled.*/
desc = build_fold_indirect_ref_loc (input_location, se->expr);
+ 
/* Unallocated, the descriptor does not have a dtype.  */
tmp = gfc_conv_descriptor_dtype (desc);
gfc_add_modify (&se->pre, tmp, gfc_get_dtype (TREE_TYPE (desc)));
+ 
res_desc = gfc_evaluate_now (desc, &se->pre);
gfc_conv_descriptor_data_set (&se->pre, res_desc, null_pointer_node);
se->expr = gfc_build_addr_expr (TREE_TYPE (se->expr), res_desc);
  
!   /* Free the lhs after the function call and copy the result data to
   the lhs descriptor.  */
tmp = gfc_conv_descriptor_data_get (desc);
+   zero_cond = fold_build2_loc (input_location, EQ_EXPR,
+ 			   boolean_type_node, tmp,
+ 			   build_int_cst (TREE_TYPE (tmp), 0));
+   zero_cond = gfc_evaluate_now (zero_cond, &se->post);
tmp = gfc_call_free (fold_convert (pvoid_type_node, tmp));
gfc_add_expr_to_block (&se->post, tmp);
  
!   tmp = gfc_conv_descriptor_data_get (res_desc);
!   gfc_conv_descriptor_data_set (&se->post, desc, tmp);
  
!   /* Check that the shapes are the same between lhs and expression.  */
!   for (n = 0 ; n < rank; n++)
! {
!   tree tmp1;
!   tmp = gfc_conv_descriptor_lbound_get (desc, gfc_rank_cst[n]);
!   tmp1 = gfc_conv_descriptor_lbound_get (res_desc, gfc_rank_cst[n]);
!   tmp = fold_build2_loc (input_location, MINUS_EXPR,
! 			 gfc_array_index_type, tmp, tmp1);
!   tmp1 = gfc_conv_descriptor_ubound_get (desc, gfc_rank_cst[n]);
!   tmp = fold_build2_loc (input_location, MINUS_EXPR,
!

Go patch committed: Permit importing a method to a type being defined

2012-02-02 Thread Ian Lance Taylor
In a slightly complex case (I am adding a test case to the master Go
testsuite) it is possible for the importer to see a method for a type
which is in the process of being imported and is therefore not fully
defined.  This was causing the compiler to complain about an undefined
type during the import.  This patch fixes the problem.  Bootstrapped and
ran Go testsuite on x86_64-unknown-linux-gnu.  Committed to mainline.

Ian

diff -r 59d85958d284 go/gogo.cc
--- a/go/gogo.cc	Wed Feb 01 22:38:55 2012 -0800
+++ b/go/gogo.cc	Thu Feb 02 10:25:58 2012 -0800
@@ -880,7 +880,7 @@
   else if (rtype->forward_declaration_type() != NULL)
 	{
 	  Forward_declaration_type* ftype = rtype->forward_declaration_type();
-	  return ftype->add_method_declaration(name, type, location);
+	  return ftype->add_method_declaration(name, NULL, type, location);
 	}
   else
 	go_unreachable();
@@ -4325,11 +4325,12 @@
 
 Named_object*
 Type_declaration::add_method_declaration(const std::string&  name,
+	 Package* package,
 	 Function_type* type,
 	 Location location)
 {
-  Named_object* ret = Named_object::make_function_declaration(name, NULL, type,
-			  location);
+  Named_object* ret = Named_object::make_function_declaration(name, package,
+			  type, location);
   this->methods_.push_back(ret);
   return ret;
 }
diff -r 59d85958d284 go/gogo.h
--- a/go/gogo.h	Wed Feb 01 22:38:55 2012 -0800
+++ b/go/gogo.h	Thu Feb 02 10:25:58 2012 -0800
@@ -1621,8 +1621,8 @@
 
   // Add a method declaration to this type.
   Named_object*
-  add_method_declaration(const std::string& name, Function_type* type,
-			 Location location);
+  add_method_declaration(const std::string& name, Package*,
+			 Function_type* type, Location location);
 
   // Return whether any methods were defined.
   bool
diff -r 59d85958d284 go/import.cc
--- a/go/import.cc	Wed Feb 01 22:38:55 2012 -0800
+++ b/go/import.cc	Thu Feb 02 10:25:58 2012 -0800
@@ -441,12 +441,29 @@
   Named_object* no;
   if (fntype->is_method())
 {
-  Type* rtype = receiver->type()->deref();
+  Type* rtype = receiver->type();
+
+  // We may still be reading the definition of RTYPE, so we have
+  // to be careful to avoid calling base or convert.  If RTYPE is
+  // a named type or a forward declaration, then we know that it
+  // is not a pointer, because we are reading a method on RTYPE
+  // and named pointers can't have methods.
+
+  if (rtype->classification() == Type::TYPE_POINTER)
+	rtype = rtype->points_to();
+
   if (rtype->is_error_type())
 	return NULL;
-  Named_type* named_rtype = rtype->named_type();
-  go_assert(named_rtype != NULL);
-  no = named_rtype->add_method_declaration(name, package, fntype, loc);
+  else if (rtype->named_type() != NULL)
+	no = rtype->named_type()->add_method_declaration(name, package, fntype,
+			 loc);
+  else if (rtype->forward_declaration_type() != NULL)
+	no = rtype->forward_declaration_type()->add_method_declaration(name,
+   package,
+   fntype,
+   loc);
+  else
+	go_unreachable();
 }
   else
 {
@@ -647,8 +664,8 @@
 	{
 	  // We have seen this type before.  FIXME: it would be a good
 	  // idea to check that the two imported types are identical,
-	  // but we have not finalized the methds yet, which means
-	  // that we can nt reliably compare interface types.
+	  // but we have not finalized the methods yet, which means
+	  // that we can not reliably compare interface types.
 	  type = no->type_value();
 
 	  // Don't change the visibility of the existing type.
diff -r 59d85958d284 go/types.cc
--- a/go/types.cc	Wed Feb 01 22:38:55 2012 -0800
+++ b/go/types.cc	Thu Feb 02 10:25:58 2012 -0800
@@ -9115,6 +9115,7 @@
 
 Named_object*
 Forward_declaration_type::add_method_declaration(const std::string& name,
+		 Package* package,
 		 Function_type* type,
 		 Location location)
 {
@@ -9122,7 +9123,7 @@
   if (no->is_unknown())
 no->declare_as_type();
   Type_declaration* td = no->type_declaration_value();
-  return td->add_method_declaration(name, type, location);
+  return td->add_method_declaration(name, package, type, location);
 }
 
 // Traversal.
diff -r 59d85958d284 go/types.h
--- a/go/types.h	Wed Feb 01 22:38:55 2012 -0800
+++ b/go/types.h	Thu Feb 02 10:25:58 2012 -0800
@@ -2937,7 +2937,7 @@
 
   // Add a method declaration to this type.
   Named_object*
-  add_method_declaration(const std::string& name, Function_type*,
+  add_method_declaration(const std::string& name, Package*, Function_type*,
 			 Location);
 
  protected:


[patch libjava]: PR 48512 Avoid crtmt.o in startfile-spec for w64 based mingw toolchains

2012-02-02 Thread Kai Tietz
Hello,

ChangeLog

2012-02-02  Kai Tietz  

PR libjava/48512
* configure.ac (THREADSTARTFILESPEC): Don't add crtmt.o file for
w64 windows targets.
* configure: Regenerated.

Tested for i686-w64-mingw32, and i686-pc-mingw32.  OK to apply?

Regards,
Kai

Index: gcc/libjava/configure.ac
===
--- gcc.orig/libjava/configure.ac
+++ gcc/libjava/configure.ac
@@ -1150,8 +1150,13 @@ case "$THREADS" in
 # FIXME: In Java we are able to detect thread death at the end of
 # Thread.run() so we should be able to clean up the exception handling
 # contexts ourselves.
-THREADSTARTFILESPEC='crtmt%O%s'
-;;
+case "$host" in
+*-w64-mingw*)
+  ;;
+*)
+  THREADSTARTFILESPEC='crtmt%O%s'
+  ;;
+esac

  none)
 THREADH=no-threads.h


Re: [PATCH] [MIPS] fix mips_prepend insn.

2012-02-02 Thread Richard Sandiford
Liu  writes:
> diff --git a/gcc/config/mips/mips-dspr2.md b/gcc/config/mips/mips-dspr2.md
> index 5ae902f..6853b9d 100644
> --- a/gcc/config/mips/mips-dspr2.md
> +++ b/gcc/config/mips/mips-dspr2.md
> @@ -345,7 +345,7 @@
> (set_attr "mode"  "SI")])
>  
>  (define_insn "mips_prepend"
> -  [(set (match_operand:SI 0 "register_operand" "=d")
> +  [(set (match_operand:DI 0 "register_operand" "=d")
>   (unspec:SI [(match_operand:SI 1 "register_operand" "0")
>   (match_operand:SI 2 "reg_or_0_operand" "dJ")
>   (match_operand:SI 3 "const_int_operand" "n")]

This pattern maps directly to __builtin_mips_prepend, which is defined
to take and return an SI type, so :SI is the correct choice here.

I agree it might be nice to have a function that operates on 64-bit
values for 64-bit targets though.  For compatibility reasons,
we'd probably have to define both a new function and a new pattern
(or at least an iterator version of the current pattern).
There's currently no way of generating PREPENDD or PREPENDW either.

I consider this API to be owned by MTI, so we'd need to coordinate
with them.  Chao-Ying, what do you think?  Do you already have
something like this internally?

> @@ -353,7 +353,7 @@
>"ISA_HAS_DSPR2"
>  {
>if (INTVAL (operands[3]) & ~(unsigned HOST_WIDE_INT) 31)
> -operands[2] = GEN_INT (INTVAL (operands[2]) & 31);
> +operands[3] = GEN_INT (INTVAL (operands[3]) & 31);
>return "prepend\t%0,%z2,%3";
>  }
>[(set_attr "type"  "arith")

This part's obviously correct though, thanks.  Applied as below.

(Since it isn't a regression, I suppose it shouldn't strictly speaking
be applied during stage 4.  It's very isolated and fixes an obvious typo
though, so I hope no-one minds that I'm making an exception.)

Richard


gcc/
2012-02-02  Jia Liu  

* config/mips/mips-dspr2.md (mips_prepend): Mask operand 3 rather
than operand 2.

gcc/testsuite/
* gcc.target/mips/mips-prepend-1.c: New test.

Index: gcc/config/mips/mips-dspr2.md
===
--- gcc/config/mips/mips-dspr2.md   2012-02-02 18:32:00.0 +
+++ gcc/config/mips/mips-dspr2.md   2012-02-02 18:32:11.0 +
@@ -353,7 +353,7 @@ (define_insn "mips_prepend"
   "ISA_HAS_DSPR2"
 {
   if (INTVAL (operands[3]) & ~(unsigned HOST_WIDE_INT) 31)
-operands[2] = GEN_INT (INTVAL (operands[2]) & 31);
+operands[3] = GEN_INT (INTVAL (operands[3]) & 31);
   return "prepend\t%0,%z2,%3";
 }
   [(set_attr "type""arith")
Index: gcc/testsuite/gcc.target/mips/mips-prepend-1.c
===
--- /dev/null   2012-02-02 18:24:25.322640797 +
+++ gcc/testsuite/gcc.target/mips/mips-prepend-1.c  2012-02-02 18:31:04.0 +
@@ -0,0 +1,8 @@
+/* { dg-options "-mdspr2" } */
+/* { dg-final { scan-assembler "prepend\[^\n\]*,10" } } */
+
+NOMIPS16 int
+foo (int x, int y)
+{
+  return __builtin_mips_prepend (x, y, 42);
+}
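
The ",10" in the scan-assembler pattern of the new test follows from the
masking in the fixed pattern: the test passes 42, and 42 & 31 is 10.  A
standalone sketch of that arithmetic (the helper name is made up, and plain
long stands in for HOST_WIDE_INT):

```c
#include <assert.h>

/* Hypothetical helper mirroring the fixed masking: keep only the low five
   bits of the immediate.  */
static long
mask_prepend_amount_sketch (long amount)
{
  long masked = amount & 31;
  assert (masked >= 0 && masked <= 31);
  return masked;
}
```

Before the fix the mask was applied to operand 2 instead, so an out-of-range
operand 3 such as 42 was never reduced.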


Merge from trunk to gccgo branch

2012-02-02 Thread Ian Lance Taylor
I have once again merged trunk to gccgo branch, revision 183840 this
time.

Ian


Re: [trans-mem, PATCH] do not dereference node if null in expand_call_tm (PR middle-end/52047)

2012-02-02 Thread Patrick Marlier

On 02/02/2012 04:22 AM, Richard Guenther wrote:

On Wed, Feb 1, 2012 at 10:19 PM, Patrick Marlier
  wrote:

On 02/01/2012 03:59 AM, Richard Guenther wrote:


The patch looks ok, but I'm not sure why you do not get a cgraph node
here - cgraph doesn't really care for builtins as far as I can see.
  Honza?


I cannot help on this...



Don't you maybe want to handle builtins in a special way here?


Indeed, I think my patch is wrong. __builtin_prefetch should have the
transaction_pure attribute. I don't know how usually it should be done but
what about adding a gcc_assert before to dereference node (potentially
NULL)?

How the attached patch looks like now?
(Tested on i686)



At least those that are const/pure?

About const/pure, we cannot consider those functions as transaction_pure
because they can read global and shared variable.


Well, const functions cannot access global memory, they can only inspect
their arguments.


Actually, in this example, GCC does not complain at all about those
const functions which read global memory...  Should it?  Or is the user
allowed to declare a function const even when it is not?


static int a = 0;

int f (int* a) __attribute__((const));
int f (int* a)
{
  return *a+1;
}

int g () __attribute__((const));
int g ()
{
  return a+1;
}

int main()
{
  int tmp = 0;
  __transaction_atomic {
tmp += f(&a);
tmp += g();
  }

  return tmp;
}

In this test, the transaction is completely removed since no 
transactional operation is detected.
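
For comparison, the distinction discussed here can be written down with
GCC's function attributes.  In this sketch (all names are made up), the const
function looks only at its argument value, while the function that reads a
global is declared pure instead, which is the attribute GCC documents for
side-effect-free functions that may read global memory.  It assumes GCC or a
compatible compiler, and it says nothing about transaction_pure:

```c
#include <assert.h>

static int shared_sketch = 41;

/* Result depends only on the argument value: 'const' is appropriate.  */
__attribute__ ((const)) static int
add_one_const_sketch (int x)
{
  return x + 1;
}

/* Reads a global, so 'pure' rather than 'const'.  */
__attribute__ ((pure)) static int
read_shared_pure_sketch (void)
{
  return shared_sketch + 1;
}
```

The compiler takes such attributes as promises from the user, which is
consistent with the lack of a diagnostic in the example above.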



Of course __builtin_prefetch seems to be special in some way.  Note that
users can explicitly call it, so setting the attribute from the prefetching
pass isn't the correct thing to do.

If the user calls it explicitly, at least the compiler doesn't ICE.
Ex: error: unsafe function call ‘__builtin_prefetch’ within atomic 
transaction.
So what's your proposal?  To add the transaction_pure attribute directly in
builtins.def for the functions that need it?



Note that __builtin_prefetch has the 'no vops' attribute - I think you should
simply consider all 'no vops' builtins as transaction pure instead or
explicitly consider a set of builtins as transaction pure (that's more
scalable than sticking the attribute onto random builtins - see how we handle
builtins in the alias machinery, we avoid sticking the fnspec attribute onto
each builtin but simply have special handling for them).


'no vops' attribute description is quite vague. "may have arbitrary side 
effects" scares me ;)
I will have a look at this alias machinery (any files particularly? 
tree.c seems to have a lot of builtins things in it).


In this patch, I handle 'no vops' the same way as const.
Attached the new version where we add ECF_TM_PURE to NOVOPS functions.
Tested on x86_64-unknown-linux-gnu.
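
The flag test in the calls.c hunk can be read as "const or no-vops implies
TM pure".  A minimal sketch with made-up bit values (the real ECF_* constants
are defined in GCC's own headers, and the is_tm_builtin and transaction_pure
attribute branches are left out here):

```c
#include <assert.h>

/* Hypothetical bit values, chosen only for this sketch.  */
enum
{
  CONST_SKETCH = 1 << 0,
  NOVOPS_SKETCH = 1 << 1,
  TM_PURE_SKETCH = 1 << 2
};

/* Set the TM-pure bit when either the const or the no-vops bit is present.  */
static int
add_tm_pure_sketch (int flags)
{
  assert (flags >= 0);
  if ((flags & (CONST_SKETCH | NOVOPS_SKETCH)) != 0)
    flags |= TM_PURE_SKETCH;
  return flags;
}
```

Masking with the OR of both bits lets a single comparison cover either flag.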

Thanks!
Patrick.

2012-02-02  Patrick Marlier  

PR middle-end/52047
* trans-mem.c (expand_call_tm): Add an assertion.
* calls.c (flags_from_decl_or_type): Add ECF_TM_PURE to 'no vops'
functions.



Thanks,
Richard.


BTW, I will post a PR (and probably a patch) about this.

Thanks for your comment!

Patrick.

PR middle-end/52047
* trans-mem.c (expand_call_tm): Add an assertion.
* tree-ssa-loop-prefetch.c (issue_prefetch_ref): Add transaction_pure
attribute to __builtin_prefetch.
(tree_ssa_prefetch_arrays): Likewise.


Index: calls.c
===
--- calls.c	(revision 183837)
+++ calls.c	(working copy)
@@ -716,7 +716,7 @@ flags_from_decl_or_type (const_tree exp)
 	{
 	  if (is_tm_builtin (exp))
 	flags |= ECF_TM_BUILTIN;
-	  else if ((flags & ECF_CONST) != 0
+	  else if ((flags & (ECF_CONST|ECF_NOVOPS)) != 0
 		   || lookup_attribute ("transaction_pure",
 	TYPE_ATTRIBUTES (TREE_TYPE (exp))))
 	flags |= ECF_TM_PURE;
Index: trans-mem.c
===
--- trans-mem.c	(revision 183837)
+++ trans-mem.c	(working copy)
@@ -1,5 +1,5 @@
 /* Passes for transactional memory support.
-   Copyright (C) 2008, 2009, 2010, 2011 Free Software Foundation, Inc.
+   Copyright (C) 2008, 2009, 2010, 2011, 2012 Free Software Foundation, Inc.
 
This file is part of GCC.
 
@@ -2267,6 +2267,8 @@ expand_call_tm (struct tm_region *region,
 }
 
   node = cgraph_get_node (fn_decl);
+  /* All calls should have cgraph here. */
+  gcc_assert (node);
   if (node->local.tm_may_enter_irr)
 transaction_subcode_ior (region, GTMA_MAY_ENTER_IRREVOCABLE);
 
Index: testsuite/gcc.dg/tm/pr52047.c
===
--- testsuite/gcc.dg/tm/pr52047.c	(revision 0)
+++ testsuite/gcc.dg/tm/pr52047.c	(revision 0)
@@ -0,0 +1,26 @@
+/* { dg-do compile } */
+/* { dg-options "-O -fgnu-tm -fprefetch-loop-arrays -w" } */
+
+int test2 (int x[])
+{
+  return x[12];
+}
+
+int test1 (void)
+{
+  int x[1000], i;
+  for (i = 0; i < 1000; i++)
+x[i] = i;
+  return test2 (x);
+}
+
+int
+main ()
+{
+  __t

Ping: Fix MIPS va_arg regression

2012-02-02 Thread Richard Sandiford
Ping for:

http://gcc.gnu.org/ml/gcc-patches/2012-01/msg01564.html

which fixes a MIPS va_arg regression (admittedly a long-standing one)
on zero-sized types.  There are no functional changes to other targets
and I'm as confident as I can be that it's safe for MIPS.

(It hasn't been a full week, but I'm nervous about missing the stage 4
cut-off.)

Thanks,
Richard



Re: [Patch, fortran] PR 51808 Heap allocate binding labels

2012-02-02 Thread Janne Blomqvist
On Tue, Jan 31, 2012 at 02:46, Gerald Pfeifer  wrote:
> On Sun, 29 Jan 2012, Janne Blomqvist wrote:
>>> .../gcc-HEAD/gcc/fortran/decl.c:5820:23: error: invalid conversion from 
>>> 'const char*' to 'char*' [-fpermissive]
>>> gmake[3]: *** [fortran/decl.o] Error 1
>> Have you tried r183679, which should fix this?
>
> Yes, I now tried that update (my daily tester yesterday picked a state
> of the tree without that changeset) and bootstrap passed just fine.  I
> had actually looked for follow-up patches yesterday, but did not see
> any on the list (possibly missed it).

Ah, it seems that in my hurry I forgot to send it to the list. Here it
is for completeness:

2012-01-29  Janne Blomqvist  

PR fortran/51808
* decl.c (set_binding_label): Make binding_label argument const.
(curr_binding_label): Constify.
* gfortran.h (gfc_symbol): Constify binding_label.
(gfc_common_head): Likewise.
(get_iso_c_sym): Likewise.
* match.c (gfc_match_name_C): Constify buffer argument.
* match.h (gfc_match_name_C): Likewise.
* resolve.c (set_name_and_label): Constify binding_label argument.
(gfc_iso_c_sub_interface): Constify binding_label variable.
* symbol.c (get_iso_c_sym): Constify binding_label argument.




-- 
Janne Blomqvist
diff --git a/gcc/fortran/decl.c b/gcc/fortran/decl.c
index 0cfb0ef..c87fc1b 100644
--- a/gcc/fortran/decl.c
+++ b/gcc/fortran/decl.c
@@ -35,7 +35,7 @@ along with GCC; see the file COPYING3.  If not see
 #define gfc_get_data() XCNEW (gfc_data)
 
 
-static gfc_try set_binding_label (char **, const char *, int);
+static gfc_try set_binding_label (const char **, const char *, int);
 
 
 /* This flag is set if an old-style length selector is matched
@@ -55,7 +55,7 @@ static gfc_array_spec *current_as;
 static int colon_seen;
 
 /* The current binding label (if any).  */
-static char* curr_binding_label;
+static const char* curr_binding_label;
 /* Need to know how many identifiers are on the current data declaration
line in case we're given the BIND(C) attribute with a NAME= specifier.  */
 static int num_idents_on_line;
@@ -3808,7 +3808,8 @@ cleanup:
there is more than one argument (num_idents), it is an error.  */
 
 static gfc_try
-set_binding_label (char **dest_label, const char *sym_name, int num_idents)
+set_binding_label (const char **dest_label, const char *sym_name, 
+		   int num_idents)
 {
   if (num_idents > 1 && has_name_equals)
 {
@@ -5713,7 +5714,7 @@ match
 gfc_match_bind_c (gfc_symbol *sym, bool allow_binding_name)
 {
   /* binding label, if exists */   
-  char* binding_label = NULL;
+  const char* binding_label = NULL;
   match double_quote;
   match single_quote;
 
diff --git a/gcc/fortran/gfortran.h b/gcc/fortran/gfortran.h
index bf9a1f9..6f49d61 100644
--- a/gcc/fortran/gfortran.h
+++ b/gcc/fortran/gfortran.h
@@ -1237,7 +1237,7 @@ typedef struct gfc_symbol
 
   /* This may be repetitive, since the typespec now has a binding
  label field.  */
-  char* binding_label;
+  const char* binding_label;
   /* Store a reference to the common_block, if this symbol is in one.  */
   struct gfc_common_head *common_block;
 
@@ -1254,7 +1254,7 @@ typedef struct gfc_common_head
   char use_assoc, saved, threadprivate;
   char name[GFC_MAX_SYMBOL_LEN + 1];
   struct gfc_symbol *head;
-  char* binding_label;
+  const char* binding_label;
   int is_bind_c;
 }
 gfc_common_head;
@@ -2595,7 +2595,7 @@ gfc_try verify_bind_c_sym (gfc_symbol *, gfc_typespec *, int, gfc_common_head *)
 gfc_try verify_bind_c_derived_type (gfc_symbol *);
 gfc_try verify_com_block_vars_c_interop (gfc_common_head *);
 void generate_isocbinding_symbol (const char *, iso_c_binding_symbol, const char *);
-gfc_symbol *get_iso_c_sym (gfc_symbol *, char *, char *, int);
+gfc_symbol *get_iso_c_sym (gfc_symbol *, char *, const char *, int);
 int gfc_get_sym_tree (const char *, gfc_namespace *, gfc_symtree **, bool);
 int gfc_get_ha_symbol (const char *, gfc_symbol **);
 int gfc_get_ha_sym_tree (const char *, gfc_symtree **);
diff --git a/gcc/fortran/match.c b/gcc/fortran/match.c
index 3024cc7..89b59bc 100644
--- a/gcc/fortran/match.c
+++ b/gcc/fortran/match.c
@@ -581,7 +581,7 @@ gfc_match_name (char *buffer)
we successfully match a C name.  */
 
 match
-gfc_match_name_C (char **buffer)
+gfc_match_name_C (const char **buffer)
 {
   locus old_loc;
   size_t i = 0;
diff --git a/gcc/fortran/match.h b/gcc/fortran/match.h
index 642c437..029faf7 100644
--- a/gcc/fortran/match.h
+++ b/gcc/fortran/match.h
@@ -52,7 +52,7 @@ match gfc_match_label (void);
 match gfc_match_small_int (int *);
 match gfc_match_small_int_expr (int *, gfc_expr **);
 match gfc_match_name (char *);
-match gfc_match_name_C (char **buffer);
+match gfc_match_name_C (const char **buffer);
 match gfc_match_symbol (gfc_symbol **, int);
 match gfc_match_sym_tree (gfc_symtree **, int);
 match gfc_match_intrinsic_op (gfc_intrinsic_op *);
diff --git a/gcc/fortran/resolve.c b/gcc/

[PATCH] Fix RTL sharing bug in loop unswitching (PR rtl-optimization/52092)

2012-02-02 Thread Jakub Jelinek
Hi!

It seems loop-iv.c happily creates shared RTL in lots of places.
loop-unroll.c solves that by unsharing everything it adds; this patch is
an attempt to do the same in unswitching, by unsharing everything iv_analyze
came up with.
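
The idea of "copy if shared" can be illustrated with a toy expression tree. This is a minimal sketch, not GCC's copy_rtx_if_shared: `struct expr`, `make_expr` and `copy_if_shared` are hypothetical names, and real RTL has many more node kinds plus codes that are allowed to be shared.

```c
#include <stdlib.h>

/* Hypothetical miniature expression node; real RTL is far richer.  */
struct expr
{
  int code;            /* operator or leaf kind */
  int value;           /* payload for leaves */
  struct expr *op[2];  /* operands, NULL when absent */
  int used;            /* set once the node has been placed somewhere */
};

struct expr *
make_expr (int code, int value, struct expr *a, struct expr *b)
{
  struct expr *e = calloc (1, sizeof *e);
  if (e == NULL)
    abort ();
  e->code = code;
  e->value = value;
  e->op[0] = a;
  e->op[1] = b;
  return e;
}

/* Return E itself the first time it is seen, and a deep copy on every
   later request, so that no node ends up referenced from two places.  */
struct expr *
copy_if_shared (struct expr *e)
{
  if (e == NULL)
    return NULL;
  if (e->used)
    {
      struct expr *copy = make_expr (e->code, e->value, NULL, NULL);
      copy->used = 1;
      copy->op[0] = copy_if_shared (e->op[0]);
      copy->op[1] = copy_if_shared (e->op[1]);
      return copy;
    }
  e->used = 1;
  e->op[0] = copy_if_shared (e->op[0]);
  e->op[1] = copy_if_shared (e->op[1]);
  return e;
}
```

In this model, a node that appears twice under the same parent is kept in the first slot and duplicated in the second, which is the property the unswitching code wants for the values it gets back from the IV analysis.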

Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?

2012-02-02  Jakub Jelinek  

PR rtl-optimization/52092
* loop-unswitch.c (may_unswitch_on): Call copy_rtx_if_shared
on get_iv_value result.

* gcc.c-torture/compile/pr52092.c: New test.

--- gcc/loop-unswitch.c.jj  2010-06-28 15:36:30.0 +0200
+++ gcc/loop-unswitch.c 2012-02-02 14:00:55.127749909 +0100
@@ -204,6 +204,7 @@ may_unswitch_on (basic_block bb, struct
return NULL_RTX;
 
   op[i] = get_iv_value (&iv, const0_rtx);
+  op[i] = copy_rtx_if_shared (op[i]);
 }
 
   mode = GET_MODE (op[0]);
--- gcc/testsuite/gcc.c-torture/compile/pr52092.c.jj  2012-02-02 14:11:37.722031008 +0100
+++ gcc/testsuite/gcc.c-torture/compile/pr52092.c   2012-02-02 14:11:14.0 +0100
@@ -0,0 +1,25 @@
+/* PR rtl-optimization/52092 */
+
+int a, b, c, d, e, f, g;
+
+void
+foo (void)
+{
+  for (;;)
+{
+  int *h = 0;
+  int i = 3;
+  int **j = &h;
+  if (e)
+   {
+ c = d || a;
+ g = c ? a : b;
+ if ((char) (i * g))
+   {
+ h = &f;
+ *h = 0;
+   }
+ **j = 0;
+   }
+}
+}

Jakub


[PATCH] Don't print extra newline after message that warnings are treated as errors (PR middle-end/48071)

2012-02-02 Thread Jakub Jelinek
Hi!

pp_flush already adds a newline, so we shouldn't add one in pp_verbatim
as well; doing so results in a blank line being printed after this message.
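
The effect is easy to reproduce with a toy printer. The sketch below is hypothetical (`struct printer`, `printer_verbatim` and `printer_flush` are not the pretty-printer API); it only mimics the one property that matters here, namely that flushing terminates the current line.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical miniature printer, not GCC's pretty-printer.  */
struct printer
{
  char pending[128];  /* text formatted but not yet flushed */
  char out[256];      /* everything that has been "printed" */
};

/* Append TEXT to the pending buffer without interpreting it.  */
void
printer_verbatim (struct printer *p, const char *text)
{
  strncat (p->pending, text, sizeof p->pending - strlen (p->pending) - 1);
}

/* Emit the pending text followed by a newline, then reset the buffer.  */
void
printer_flush (struct printer *p)
{
  strncat (p->out, p->pending, sizeof p->out - strlen (p->out) - 1);
  strncat (p->out, "\n", sizeof p->out - strlen (p->out) - 1);
  p->pending[0] = '\0';
}

/* Count the empty lines in S, i.e. newlines that start a line.  */
size_t
count_blank_lines (const char *s)
{
  size_t blank = 0;
  for (size_t i = 0; s[i] != '\0'; i++)
    if (s[i] == '\n' && (i == 0 || s[i - 1] == '\n'))
      blank++;
  return blank;
}
```

Passing a message that already ends in "\n" and then flushing leaves one blank line in the output; dropping the trailing "\n" from the message leaves none.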

Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?

2012-02-02  Jakub Jelinek  

PR middle-end/48071
* diagnostic.c (diagnostic_finish): Remove trailing newlines.

--- gcc/diagnostic.c.jj 2011-10-17 22:27:42.0 +0200
+++ gcc/diagnostic.c    2012-02-02 15:05:51.578520230 +0100
@@ -133,12 +133,12 @@ diagnostic_finish (diagnostic_context *c
   /* -Werror was given.  */
   if (context->warning_as_error_requested)
pp_verbatim (context->printer,
-_("%s: all warnings being treated as errors\n"),
+_("%s: all warnings being treated as errors"),
 progname);
   /* At least one -Werror= was given.  */
   else
pp_verbatim (context->printer,
-_("%s: some warnings being treated as errors\n"),
+_("%s: some warnings being treated as errors"),
 progname);
   pp_flush (context->printer);
 }

Jakub


patch for PR49800

2012-02-02 Thread Vladimir Makarov
The following patch solves PR49800 which is described on 
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=49800.


The patch was successfully bootstrapped on x86/x86-64.

Committed as rev. 183843.

2012-02-02  Vladimir Makarov 

PR rtl-optimization/49800
* haifa-sched.c (sched_init): Call regstat_init_n_sets_and_refs.
(sched_finish): Call regstat_free_n_sets_and_refs.


Index: haifa-sched.c
===
--- haifa-sched.c   (revision 183839)
+++ haifa-sched.c   (working copy)
@@ -4835,6 +4835,10 @@ sched_init (void)
 {
   int i, max_regno = max_reg_num ();
 
+  if (sched_dump != NULL)
+   /* We need info about pseudos for rtl dumps about pseudo
+  classes and costs.  */
+   regstat_init_n_sets_and_refs ();
   ira_set_pseudo_classes (sched_verbose ? sched_dump : NULL);
   sched_regno_pressure_class
= (enum reg_class *) xmalloc (max_regno * sizeof (enum reg_class));
@@ -4946,6 +4950,8 @@ sched_finish (void)
   haifa_finish_h_i_d ();
   if (sched_pressure_p)
 {
+  if (regstat_n_sets_and_refs != NULL)
+   regstat_free_n_sets_and_refs ();
   free (sched_regno_pressure_class);
   BITMAP_FREE (region_ref_regs);
   BITMAP_FREE (saved_reg_live);


[v3] libstdc++/52068

2012-02-02 Thread Benjamin Kosnik

Remove install weirdness from 49829 fix.

tested x86/linux

-benjamin

2012-02-02  Benjamin Kosnik

	PR libstdc++/52068
	* src/c++11/Makefile.am (toolexeclib_LTLIBRARIES,
	libc__11_la_SOURCES): Remove.
	* src/c++11/Makefile.in: Regenerate.
	* src/c++98/Makefile.am (toolexeclib_LTLIBRARIES,
	libc__98_la_SOURCES): Remove.
	* src/c++98/Makefile.in: Regenerate.

diff --git a/libstdc++-v3/src/c++11/Makefile.am b/libstdc++-v3/src/c++11/Makefile.am
index cc454bb..395af5c 100644
--- a/libstdc++-v3/src/c++11/Makefile.am
+++ b/libstdc++-v3/src/c++11/Makefile.am
@@ -25,7 +25,6 @@
 include $(top_srcdir)/fragment.am
 
 # Convenience library for C++11 runtime.
-toolexeclib_LTLIBRARIES = libc++11.la
 noinst_LTLIBRARIES = libc++11convenience.la
 
 headers =
@@ -63,7 +62,6 @@ endif
 vpath % $(top_srcdir)/src/c++11
 vpath % $(top_srcdir)
 
-libc__11_la_SOURCES = $(sources) $(inst_sources)
 libc__11convenience_la_SOURCES = $(sources)  $(inst_sources)
 
 # AM_CXXFLAGS needs to be in each subdirectory so that it can be
diff --git a/libstdc++-v3/src/c++98/Makefile.am b/libstdc++-v3/src/c++98/Makefile.am
index d5d39d1..e960d94 100644
--- a/libstdc++-v3/src/c++98/Makefile.am
+++ b/libstdc++-v3/src/c++98/Makefile.am
@@ -25,7 +25,6 @@
 include $(top_srcdir)/fragment.am
 
 # Convenience library for C++98 runtime.
-toolexeclib_LTLIBRARIES = libc++98.la
 noinst_LTLIBRARIES = libc++98convenience.la
 
 headers =
@@ -156,7 +155,6 @@ sources = \
 vpath % $(top_srcdir)/src/c++98
 vpath % $(top_srcdir)
 
-libc__98_la_SOURCES = $(sources)
 libc__98convenience_la_SOURCES = $(sources)
 
 # Use special rules for the deprecated source files so that they find


Re: [PATCH] Don't print extra newline after message that warnings are treated as errors (PR middle-end/48071)

2012-02-02 Thread Gabriel Dos Reis
On Thu, Feb 2, 2012 at 1:14 PM, Jakub Jelinek  wrote:
> Hi!
>
> pp_flush already adds a newline, so we shouldn't add it in pp_verbatim
> as well, that results in printing a blank newline after this message.
>
> Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?

Yes! Thanks for catching this.

-- Gaby


Re: [Patch, fortran] PR52012 - [4.6/4.7 Regression] Wrong-code with realloc on assignment and RESHAPE w/ ORDER=

2012-02-02 Thread Tobias Burnus

Dear Paul,

Paul Richard Thomas wrote:

Following our exchanges with Dominique, I think that the attached
patch will have to do for now.
Bootstrapped and regtested on FC9/x86_64 - OK for trunk?


The patch looks fine. Thanks. Can you also back-port to 4.6?

Tobias


2012-02-02  Paul Thomas

PR fortran/52012
* trans-expr.c (fcncall_realloc_result): If variable shape is
correct, retain the bounds, whatever they are.

2012-02-02  Paul Thomas

PR fortran/52012
* gfortran.dg/realloc_on_assign_11.f90: New test.


[patch windows]: PR target/40068

2012-02-02 Thread Kai Tietz
Hi,

the following patch (Dave, it is a variant derived from one of your
patches for a similar issue) takes care that for dllexport'ed classes
the typeinfo base declaration gets the dllexport attribute, too.

ChangeLog

2012-02-02  Kai Tietz  
  Dave Korn

* config/i386/winnt-cxx.c (i386_pe_adjust_class_at_definition):
Take care that typeinfo gets dllexport-attribute.

Regression tested for i686-w64-mingw32 and x86_64-w64-mingw32.  If
there is no objection within the next 4 days, I will apply this patch.

Regards,
Kai

Index: gcc/gcc/config/i386/winnt-cxx.c
===
--- gcc.orig/gcc/config/i386/winnt-cxx.c
+++ gcc/gcc/config/i386/winnt-cxx.c
@@ -97,6 +97,20 @@ i386_pe_adjust_class_at_definition (tree

   if (lookup_attribute ("dllexport", TYPE_ATTRIBUTES (t)) != NULL_TREE)
 {
+  tree tmv = TYPE_MAIN_VARIANT (t);
+
+  /* Make sure that we set dllexport attribute to typeinfo's
+base declaration, as otherwise it would fail to be exported as
+it isn't a class-member.  */
+  if (tmv != NULL_TREE
+ && CLASSTYPE_TYPEINFO_VAR (tmv) != NULL_TREE)
+   {
+ tree na, ti_decl = CLASSTYPE_TYPEINFO_VAR (tmv);
+ na = tree_cons (get_identifier ("dllexport"), NULL_TREE,
+ NULL_TREE);
+ decl_attributes (&ti_decl, na, 0);
+   }
+
   /* Check static VAR_DECL's.  */
   for (member = TYPE_FIELDS (t); member; member = DECL_CHAIN (member))
if (TREE_CODE (member) == VAR_DECL)


[Patch, Fortran] PR 52093 - fix simplification of SIZE((x))

2012-02-02 Thread Tobias Burnus

Dear all,

I have committed the attached patch as obvious (Rev.183848) - after 
building and regtesting on x86-64-linux.


I intend to backport it to 4.6 soon. (It's a regression.)

Tobias
2012-02-02  Tobias Burnus  

	PR fortran/52093
	* simplify.c (gfc_simplify_size): Handle INTRINSIC_PARENTHESES.

2012-02-02  Tobias Burnus  

	PR fortran/52093
	* gfortran.dg/shape_7.f90: New.

Index: gcc/fortran/simplify.c
===
--- gcc/fortran/simplify.c	(revision 183846)
+++ gcc/fortran/simplify.c	(working copy)
@@ -5541,6 +5541,7 @@ gfc_simplify_size (gfc_expr *array, gfc_expr *dim,
 	  case INTRINSIC_NOT:
 	  case INTRINSIC_UPLUS:
 	  case INTRINSIC_UMINUS:
+	  case INTRINSIC_PARENTHESES:
 	replacement = array->value.op.op1;
 	break;
 
Index: gcc/testsuite/gfortran.dg/shape_7.f90
===
--- gcc/testsuite/gfortran.dg/shape_7.f90	(revision 0)
+++ gcc/testsuite/gfortran.dg/shape_7.f90	(working copy)
@@ -0,0 +1,32 @@
+! { dg-do compile }
+! { dg-options "-fdump-tree-original" }
+!
+! PR fortran/52093
+!
+! Contributed by Mohammad Rahmani
+!
+
+Program Main
+ Implicit None
+ Integer:: X(2,2)
+ Integer:: X2(7:11,8:9)
+
+ if (size((X)) /= 4) call abort ()
+ if (any (Shape((X))  /= [2,2])) call abort ()
+ if (any (lbound((X)) /= [1,1])) call abort ()
+ if (any (ubound((X)) /= [2,2])) call abort ()
+
+ if (size(X2) /= 10) call abort ()
+ if (any (Shape(X2)  /= [5,2])) call abort ()
+ if (any (lbound(X2) /= [7,8]))  call abort ()
+ if (any (ubound(X2) /= [11,9])) call abort ()
+
+ if (size((X2)) /= 10) call abort ()
+ if (any (Shape((X2))  /= [5,2])) call abort ()
+ if (any (lbound((X2)) /= [1,1])) call abort ()
+ if (any (ubound((X2)) /= [5,2])) call abort ()
+End Program Main
+
+! { dg-final { scan-tree-dump-times "abort" 0 "original" } }
+! { dg-final { cleanup-tree-dump "original" } }
+


Go patch committed: Compare slice start and end with cap, not len

2012-02-02 Thread Ian Lance Taylor
The Go spec says that in a slice expression, the start and end indexes
are compared with the capacity of the slice value being sliced, not its
length.  Somehow gccgo was doing OK getting that wrong.  This patch
fixes it.  Bootstrapped and ran Go testsuite on
x86_64-unknown-linux-gnu.  Committed to mainline.
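
As a rough illustration of the rule, here is a C sketch of a slice expression with both indexes given. The `struct slice` layout and `slice_expr` are hypothetical and are not what gccgo generates; the sketch also ignores the case where the end index is omitted, in which the length is the upper bound.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical C view of a Go slice header.  */
struct slice
{
  int *data;
  ptrdiff_t len;
  ptrdiff_t cap;
};

/* Compute S[low:high] into *RESULT.  Both indexes are checked against
   the capacity, not the length.  Returns false when the indexes are out
   of range, where Go itself would panic.  */
bool
slice_expr (struct slice s, ptrdiff_t low, ptrdiff_t high,
            struct slice *result)
{
  if (low < 0 || low > high || high > s.cap)
    return false;
  result->data = s.data + low;
  result->len = high - low;
  result->cap = s.cap - low;
  return true;
}
```

So for a slice with length 3 and capacity 8, s[2:6] is valid and yields length 4, even though 6 is past the length.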

Ian

diff -r ce6462aab400 go/expressions.cc
--- a/go/expressions.cc	Thu Feb 02 10:30:35 2012 -0800
+++ b/go/expressions.cc	Thu Feb 02 13:15:49 2012 -0800
@@ -10649,11 +10649,28 @@
 
   if (array_type->length() == NULL && !DECL_P(array_tree))
 array_tree = save_expr(array_tree);
-  tree length_tree = array_type->length_tree(gogo, array_tree);
-  if (length_tree == error_mark_node)
-return error_mark_node;
-  length_tree = save_expr(length_tree);
-  tree length_type = TREE_TYPE(length_tree);
+
+  tree length_tree = NULL_TREE;
+  if (this->end_ == NULL || this->end_->is_nil_expression())
+{
+  length_tree = array_type->length_tree(gogo, array_tree);
+  if (length_tree == error_mark_node)
+	return error_mark_node;
+  length_tree = save_expr(length_tree);
+}
+
+  tree capacity_tree = NULL_TREE;
+  if (this->end_ != NULL)
+{
+  capacity_tree = array_type->capacity_tree(gogo, array_tree);
+  if (capacity_tree == error_mark_node)
+	return error_mark_node;
+  capacity_tree = save_expr(capacity_tree);
+}
+
+  tree length_type = (length_tree != NULL_TREE
+		  ? TREE_TYPE(length_tree)
+		  : TREE_TYPE(capacity_tree));
 
   tree bad_index = boolean_false_node;
 
@@ -10676,7 +10693,9 @@
 	   ? GE_EXPR
 	   : GT_EXPR),
 	  boolean_type_node, start_tree,
-	  length_tree));
+	  (this->end_ == NULL
+	   ? length_tree
+	   : capacity_tree)));
 
   int code = (array_type->length() != NULL
 	  ? (this->end_ == NULL
@@ -10723,12 +10742,6 @@
 
   // Array slice.
 
-  tree capacity_tree = array_type->capacity_tree(gogo, array_tree);
-  if (capacity_tree == error_mark_node)
-return error_mark_node;
-  capacity_tree = fold_convert_loc(loc.gcc_location(), length_type,
-   capacity_tree);
-
   tree end_tree;
   if (this->end_->is_nil_expression())
 end_tree = length_tree;
@@ -10747,7 +10760,6 @@
 
   end_tree = fold_convert_loc(loc.gcc_location(), length_type, end_tree);
 
-  capacity_tree = save_expr(capacity_tree);
   tree bad_end = fold_build2_loc(loc.gcc_location(), TRUTH_OR_EXPR,
  boolean_type_node,
  fold_build2_loc(loc.gcc_location(),
diff -r ce6462aab400 go/types.cc
--- a/go/types.cc	Thu Feb 02 10:30:35 2012 -0800
+++ b/go/types.cc	Thu Feb 02 13:15:49 2012 -0800
@@ -5416,7 +5416,8 @@
 Array_type::capacity_tree(Gogo* gogo, tree array)
 {
   if (this->length_ != NULL)
-return omit_one_operand(sizetype, this->get_length_tree(gogo), array);
+return omit_one_operand(integer_type_node, this->get_length_tree(gogo),
+			array);
 
   // This is an open array.  We need to read the capacity field.
 


[google][4.6] curb the counter scaling factor in inline transform (issue5622052)

2012-02-02 Thread Rong Xu
Hi,

This patch curbs the counter scaling factor that was causing counter
overflow in the inline transformation. The negative counter triggers
an assertion in a later pass.
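
The clamping idea, reduced to its arithmetic, looks roughly like the sketch below. It is an illustration under assumptions, not the patch itself: `clamped_count_scale` and `PROB_BASE` are hypothetical names, counts are assumed non-negative, and the bound holds only up to double rounding.

```c
#include <stdint.h>

/* Stand-in for REG_BR_PROB_BASE; the value is an assumption of this
   sketch.  */
#define PROB_BASE 10000

/* Return the factor, in units of 1/PROB_BASE, used to scale callee
   counts for a call site executed COUNT times, when the callee entry
   ran ENTRY_COUNT times and its hottest block ran MAX_BB_COUNT times.
   The factor is reduced when needed so that
   MAX_BB_COUNT * factor / PROB_BASE stays within int64_t.  */
int64_t
clamped_count_scale (int64_t count, int64_t entry_count, int64_t max_bb_count)
{
  if (entry_count <= 0)
    return PROB_BASE;

  /* The hottest block is at least as hot as the entry block.  */
  if (max_bb_count < entry_count)
    max_bb_count = entry_count;

  double scale = (double) PROB_BASE * (double) count / (double) entry_count;
  double limit = (double) INT64_MAX * PROB_BASE / (double) max_bb_count - 1.0;
  if (scale > limit)
    scale = limit;

  /* (double) INT64_MAX rounds up to 2^63, which no longer fits.  */
  if (scale >= (double) INT64_MAX)
    return INT64_MAX;
  return (int64_t) scale;
}
```

For ordinary profiles the limit is far above the computed factor and nothing changes; the clamp only kicks in when the call-site count is much larger than the callee's own entry count, as with split comdat counters.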

Tested: internal performance benchmarks.

-Rong

2012-02-02   Rong Xu  

* tree-inline.c (copy_cfg_body): Curb the scaling factor to
avoid counter overflow.

Index: tree-inline.c
===
--- tree-inline.c   (revision 183768)
+++ tree-inline.c   (working copy)
@@ -2198,8 +2198,49 @@
   gcov_type incoming_count = 0;
 
   if (ENTRY_BLOCK_PTR_FOR_FUNCTION (src_cfun)->count)
-count_scale = (REG_BR_PROB_BASE * (double)count
-  / ENTRY_BLOCK_PTR_FOR_FUNCTION (src_cfun)->count);
+{
+  struct cgraph_node *node = cgraph_node (callee_fndecl);
+  double f_max;
+  gcov_type max_count_scale;
+  gcov_type callee_max_bb_cnt;
+  gcov_type max_value = ((gcov_type) 1 << ((sizeof(gcov_type) * 8) - 1));
+  max_value = ~max_value;
+  count_scale = (REG_BR_PROB_BASE * (double)count
+  / ENTRY_BLOCK_PTR_FOR_FUNCTION (src_cfun)->count);
+
+  /* Reducing the scaling factor when it can cause counter overflow.
+ This can happen for comdat functions where the counters are split.
+ It's more likely for recursive inlines.  */
+  gcc_assert (node);
+  callee_max_bb_cnt = node->max_bb_count;
+
+  if (callee_max_bb_cnt == 0)
+{
+  gcov_type c = ENTRY_BLOCK_PTR_FOR_FUNCTION (src_cfun)->count;
+
+  FOR_EACH_BB_FN (bb, src_cfun)
+if (bb->count > node->max_bb_count)
+  callee_max_bb_cnt = node->max_bb_count;
+
+  if (c > callee_max_bb_cnt)
+callee_max_bb_cnt = c;
+
+  node->max_bb_count = callee_max_bb_cnt;
+}
+
+  f_max = (double) max_value * REG_BR_PROB_BASE / callee_max_bb_cnt - 1;
+  if (f_max > max_value)
+max_count_scale = max_value;
+  else
+max_count_scale = f_max;
+
+  if (count_scale > max_count_scale)
+{
+  if (flag_opt_info >= OPT_INFO_MED)
+warning (0, "Reducing scaling factor to avoid counter overflow.");
+  count_scale = max_count_scale;
+}
+}
   else
 count_scale = REG_BR_PROB_BASE;
 
@@ -2221,7 +2262,7 @@
incoming_frequency += EDGE_FREQUENCY (e);
incoming_count += e->count;
  }
-  incoming_count = incoming_count * count_scale / REG_BR_PROB_BASE;
+  incoming_count = ((double) incoming_count) * count_scale / REG_BR_PROB_BASE;
   incoming_frequency
= incoming_frequency * frequency_scale / REG_BR_PROB_BASE;
   ENTRY_BLOCK_PTR->count = incoming_count;

--
This patch is available for review at http://codereview.appspot.com/5622052


Re: [google][4.6] curb the counter scaling factor in inline transform (issue5622052)

2012-02-02 Thread Xinliang David Li
ok for google branches.

David

On Thu, Feb 2, 2012 at 2:30 PM, Rong Xu  wrote:
> Hi,
>
> This patch curbs the counter scaling factor that was causing counter
> overflow in the inline transformation. The negative counter triggers
> an assertion in a later pass.
>
> Tested: internal performance benchmarks.
>
> -Rong
>
> 2012-02-02   Rong Xu  
>
>        * tree-inline.c (copy_cfg_body): Curb the scaling factor to
>        avoid counter overflow.
>
> Index: tree-inline.c
> ===
> --- tree-inline.c       (revision 183768)
> +++ tree-inline.c       (working copy)
> @@ -2198,8 +2198,49 @@
>   gcov_type incoming_count = 0;
>
>   if (ENTRY_BLOCK_PTR_FOR_FUNCTION (src_cfun)->count)
> -    count_scale = (REG_BR_PROB_BASE * (double)count
> -                  / ENTRY_BLOCK_PTR_FOR_FUNCTION (src_cfun)->count);
> +    {
> +      struct cgraph_node *node = cgraph_node (callee_fndecl);
> +      double f_max;
> +      gcov_type max_count_scale;
> +      gcov_type callee_max_bb_cnt;
> +      gcov_type max_value = ((gcov_type) 1 << ((sizeof(gcov_type) * 8) - 1));
> +      max_value = ~max_value;
> +      count_scale = (REG_BR_PROB_BASE * (double)count
> +                  / ENTRY_BLOCK_PTR_FOR_FUNCTION (src_cfun)->count);
> +
> +      /* Reducing the scaling factor when it can cause counter overflow.
> +         This can happen for comdat functions where the counters are split.
> +         It's more likely for recursive inlines.  */
> +      gcc_assert (node);
> +      callee_max_bb_cnt = node->max_bb_count;
> +
> +      if (callee_max_bb_cnt == 0)
> +        {
> +          gcov_type c = ENTRY_BLOCK_PTR_FOR_FUNCTION (src_cfun)->count;
> +
> +          FOR_EACH_BB_FN (bb, src_cfun)
> +            if (bb->count > node->max_bb_count)
> +              callee_max_bb_cnt = node->max_bb_count;
> +
> +          if (c > callee_max_bb_cnt)
> +            callee_max_bb_cnt = c;
> +
> +          node->max_bb_count = callee_max_bb_cnt;
> +        }
> +
> +      f_max = (double) max_value * REG_BR_PROB_BASE / callee_max_bb_cnt - 1;
> +      if (f_max > max_value)
> +        max_count_scale = max_value;
> +      else
> +        max_count_scale = f_max;
> +
> +      if (count_scale > max_count_scale)
> +        {
> +          if (flag_opt_info >= OPT_INFO_MED)
> +            warning (0, "Reducing scaling factor to avoid counter overflow.");
> +          count_scale = max_count_scale;
> +        }
> +    }
>   else
>     count_scale = REG_BR_PROB_BASE;
>
> @@ -2221,7 +2262,7 @@
>            incoming_frequency += EDGE_FREQUENCY (e);
>            incoming_count += e->count;
>          }
> -      incoming_count = incoming_count * count_scale / REG_BR_PROB_BASE;
> +      incoming_count = ((double) incoming_count) * count_scale / REG_BR_PROB_BASE;
>       incoming_frequency
>        = incoming_frequency * frequency_scale / REG_BR_PROB_BASE;
>       ENTRY_BLOCK_PTR->count = incoming_count;
>
> --
> This patch is available for review at http://codereview.appspot.com/5622052


libgo patch committed: Correct ENOSYS functions

2012-02-02 Thread Ian Lance Taylor
I foolishly messed up writing the ENOSYS functions for systems which
don't have them.  I had them return ENOSYS, rather than setting errno
and returning an error indicator.  This patch corrects this oversight.
Bootstrapped and ran Go testsuite on x86_64-unknown-linux-gnu.
Committed to mainline.
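
For reference, the convention the fix restores looks like this in isolation. The names `stub_unshare` and `unshare_is_unsupported` are hypothetical and exist only for this sketch; they are not the functions in go-nosys.c.

```c
#include <errno.h>

/* Hypothetical stub for a system call the platform lacks.  It reports
   failure the way a real libc wrapper would: set errno and return -1.  */
int
stub_unshare (int flags)
{
  (void) flags;
  errno = ENOSYS;
  return -1;
}

/* Caller-side check written against the usual convention.  Returns 1
   when the call failed because the system call is not implemented.  */
int
unshare_is_unsupported (void)
{
  errno = 0;
  return stub_unshare (0) == -1 && errno == ENOSYS;
}
```

A stub that returned ENOSYS directly would hand callers a positive value, which code written against the usual "-1 plus errno" convention would not recognize as a failure.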

Ian

diff -r a79fa0b4c24e libgo/runtime/go-nosys.c
--- a/libgo/runtime/go-nosys.c	Thu Feb 02 14:21:14 2012 -0800
+++ b/libgo/runtime/go-nosys.c	Thu Feb 02 14:57:54 2012 -0800
@@ -31,7 +31,8 @@
 int
 epoll_create1 (int flags __attribute__ ((unused)))
 {
-  return ENOSYS;
+  errno = ENOSYS;
+  return -1;
 }
 #endif
 
@@ -42,7 +43,8 @@
 	   int mode __attribute__ ((unused)),
 	   int flags __attribute__ ((unused)))
 {
-  return ENOSYS;
+  errno = ENOSYS;
+  return -1;
 }
 #endif
 
@@ -53,7 +55,8 @@
 	  mode_t mode __attribute__ ((unused)),
 	  int flags __attribute__ ((unused)))
 {
-  return ENOSYS;
+  errno = ENOSYS;
+  return -1;
 }
 #endif
 
@@ -65,7 +68,8 @@
 	  gid_t group __attribute__ ((unused)),
 	  int flags __attribute__ ((unused)))
 {
-  return ENOSYS;
+  errno = ENOSYS;
+  return -1;
 }
 #endif
 
@@ -75,7 +79,8 @@
 	   const char *pathname __attribute__ ((unused)),
 	   const struct timeval times[2] __attribute__ ((unused)))
 {
-  return ENOSYS;
+  errno = ENOSYS;
+  return -1;
 }
 #endif
 
@@ -85,7 +90,8 @@
 		   const char* pathname __attribute__ ((unused)),
 		   uint32_t mask __attribute__ ((unused)))
 {
-  return ENOSYS;
+  errno = ENOSYS;
+  return -1;
 }
 #endif
 
@@ -93,7 +99,8 @@
 int
 inotify_init (void)
 {
-  return ENOSYS;
+  errno = ENOSYS;
+  return -1;
 }
 #endif
 
@@ -102,7 +109,8 @@
 inotify_rm_watch (int fd __attribute__ ((unused)),
 		  uint32_t wd __attribute__ ((unused)))
 {
-  return ENOSYS;
+  errno = ENOSYS;
+  return -1;
 }
 #endif
 
@@ -112,7 +120,8 @@
 	 const char *pathname __attribute__ ((unused)),
 	 mode_t mode __attribute__ ((unused)))
 {
-  return ENOSYS;
+  errno = ENOSYS;
+  return -1;
 }
 #endif
 
@@ -123,7 +132,8 @@
 	 mode_t mode __attribute__ ((unused)),
 	 dev_t dev __attribute__ ((unused)))
 {
-  return ENOSYS;
+  errno = ENOSYS;
+  return -1;
 }
 #endif
 
@@ -134,7 +144,8 @@
 	int oflag __attribute__ ((unused)),
 	...)
 {
-  return ENOSYS;
+  errno = ENOSYS;
+  return -1;
 }
 #endif
 
@@ -145,7 +156,8 @@
 	  int newdirfd __attribute__ ((unused)),
 	  const char *newpath __attribute__ ((unused)))
 {
-  return ENOSYS;
+  errno = ENOSYS;
+  return -1;
 }
 #endif
 
@@ -158,7 +170,8 @@
 	size_t len __attribute__ ((unused)),
 	unsigned int flags __attribute__ ((unused)))
 {
-  return ENOSYS;
+  errno = ENOSYS;
+  return -1;
 }
 #endif
 
@@ -169,7 +182,8 @@
  size_t len __attribute__ ((unused)),
  unsigned int flags __attribute__ ((unused)))
 {
-  return ENOSYS;
+  errno = ENOSYS;
+  return -1;
 }
 #endif
 
@@ -179,7 +193,8 @@
 	  const char *pathname __attribute__ ((unused)),
 	  int flags __attribute__ ((unused)))
 {
-  return ENOSYS;
+  errno = ENOSYS;
+  return -1;
 }
 #endif
 
@@ -187,6 +202,7 @@
 int
 unshare (int flags __attribute__ ((unused)))
 {
-  return ENOSYS;
+  errno = ENOSYS;
+  return -1;
 }
 #endif


Merge from trunk to gccgo branch

2012-02-02 Thread Ian Lance Taylor
I've merged trunk revision 183852 to the gccgo branch.

Ian


libgo patch committed: Fix type of last field of Cmsghdr

2012-02-02 Thread Ian Lance Taylor
This patch to libgo fixes the type of the last field of Cmsghdr from
[]byte to [0]byte.  Bootstrapped and ran Go testsuite on
x86_64-unknown-linux-gnu.  Committed to mainline.

Ian

diff -r abc9201b3ab0 libgo/mksysinfo.sh
--- a/libgo/mksysinfo.sh	Thu Feb 02 14:58:22 2012 -0800
+++ b/libgo/mksysinfo.sh	Thu Feb 02 17:04:33 2012 -0800
@@ -507,6 +507,7 @@
 -e 's/cmsg_len *[a-zA-Z0-9_]*/Len Cmsghdr_len_t/' \
 -e 's/cmsg_level/Level/' \
 -e 's/cmsg_type/Type/' \
+-e 's/\[\]/[0]/' \
   >> ${OUT}
 
   # The size of the cmsghdr struct.


RE: PING: [PATCH, ARM, iWMMXt][1/5]: ARM code generic change

2012-02-02 Thread Xinyu Qi
PING

http://gcc.gnu.org/ml/gcc-patches/2011-12/msg01787.html

At 2011-12-29 14:20:20,"Xinyu Qi"  wrote:
> > At 2011-12-15 00:47:48,"Richard Earnshaw"  wrote:
> > > On 14/07/11 08:35, Xinyu Qi wrote:
> > > >>> Hi,
> > > >>>
> > > >>> It is the first part of iWMMXt maintenance.
> > > >>>
> > > >>> *config/arm/arm.c (arm_option_override):
> > > >>>   Enable iWMMXt with VFP. iWMMXt and NEON are incompatible.
> > > >> iWMMXt unsupported under Thumb-2 mode.
> > > >>>   (arm_expand_binop_builtin): Accept immediate op (with mode
> > > >>> VOID)
> > > >>> *config/arm/arm.md:
> > > >>>   Resettle include location of iwmmxt.md so that *arm_movdi
> > > >> and *arm_movsi_insn could be used when iWMMXt is enabled.
> > > >>
> > > >> With the current work in trunk to handle enabled attributes and
> > > >> per-alternative predicable attributes (Thanks Bernd) we should be
> > > >> able to get rid of *cond_iwmmxt_movsi_insn in the iwmmxt.md file.
> > > >> It's not a matter for this patch but for a follow-up patch.
> > > >>
> > > >> Actually we should probably do the same for the various insns
> > > >> that are dotted around all over the place with final conditions
> > > >> that prevent matching - at least makes the backend description
> > > >> slightly smaller :).
> > > >>
> > > >>>   Add pipeline description file include.
> > > >>
> > > >> It is enough to say
> > > >>
> > > >>  (): Include.
> > > >>
> > > >> in the changelog entry.
> > > >>
> > > >> The include for the pipeline description file should be with the
> > > >> patch that you add this in i.e. patch #5. Please add this to
> > > >> MD_INCLUDES in t-arm as well.
> > > >>
> > > >> Also as a general note, please provide a correct Changelog entry.
> > > >>
> > > >> This is not the format that we expect Changelog entries to be in.
> > > >> Please look at the coding standards on the website for this or at
> > > >> other patches submitted with respect to Changelog entries. Please
> > > >> fix this for each patch in the patch stack.
> > > >>
> > > >>
> > > >> cheers
> > > >> Ramana
> > > >
> > > > Thanks for reviewing. I have updated the patches and the Changelog.
> > > >
> > > > *config/arm/arm.c (arm_option_override): Enable iWMMXt with VFP.
> > > >  (arm_expand_binop_builtin): Accept VOIDmode op.
> > > > *config/arm/arm.md (*arm_movdi, *arm_movsi_insn): Remove
> > > condition !TARGET_IWMMXT.
> > > >  (iwmmxt.md): Include location.
> > > >
> > > > Thanks,
> > > > Xinyu
> > > >
> > >
> > > + VFP and iWMMXt however can coexist.  */
> > > +  if (TARGET_IWMMXT && TARGET_HARD_FLOAT && !TARGET_VFP)
> > > +sorry ("iWMMXt and non-VFP floating point unit");
> > > +
> > > +  /* iWMMXt and NEON are incompatible.  */
> > > +  if (TARGET_IWMMXT && TARGET_NEON)
> > > +sorry ("iWMMXt and NEON");
> > >
> > > -  /* ??? iWMMXt insn patterns need auditing for Thumb-2.  */
> > > +  /* iWMMXt unsupported under Thumb-2 mode.  */
> > >if (TARGET_THUMB2 && TARGET_IWMMXT)
> > >  sorry ("Thumb-2 iWMMXt");
> > >
> > > Don't use sorry() when a feature is not supported by the hardware;
> > > sorry() is used when GCC is currently unable to support something
> > > that it should.  Use error() in these cases.
> > >
> > > > Secondly, iWMMXt is incompatible with the entire Thumb ISA, not just
> > > > the Thumb-2 extensions to the Thumb ISA.
> >
> > Done.
> >
> > >
> > >
> > > > +;; Load the Intel Wireless Multimedia Extension patterns
> > > > +(include "iwmmxt.md")
> > > +
> > >
> > >
> > > No, the extension patterns need to come at the end of the main
> > > machine description.  The list at the top of the MD file is purely
> > > for pipeline descriptions.  Why do you think this is needed?
> >
> > This modification is needless right now since *iwmmxt_movsi_insn and
> > *iwmmxt_arm_movdi have been corrected in the fourth part of the patch.
> > Revert it.
> > The new modified patch is attached.
> >
> > * config/arm/arm.c (arm_option_override): Enable use of iWMMXt with
> > VFP.
> > Disable use of iWMMXt with NEON. Disable use of iWMMXt under Thumb
> > mode.
> > (arm_expand_binop_builtin): Accept VOIDmode op.
> >
> > Thanks,
> > Xinyu
> >
> > >
> > > Other bits are ok.
> > >
> > > R.
> 
> New changelog
> 
>   * config/arm/arm.c (FL_IWMMXT2): New define.
>   (arm_arch_iwmmxt2): New variable.
>   (arm_option_override): Enable use of iWMMXt with VFP.
>   Disable use of iWMMXt with NEON. Disable use of iWMMXt under Thumb
> mode.
>   Set arm_arch_iwmmxt2.
>   (arm_expand_binop_builtin): Accept VOIDmode op.
>   * config/arm/arm.h (TARGET_CPU_CPP_BUILTINS): Define
> __IWMMXT2__.
>   (TARGET_IWMMXT2): New define.
>   (TARGET_REALLY_IWMMXT2): Likewise.
>   (arm_arch_iwmmxt2): Declare.
>   * config/arm/arm-cores.def (iwmmxt2): Add FL_IWMMXT2.
>   * config/arm/arm-arches.def (iwmmxt2): Likewise.
>   * config/arm/arm.md (arch): Add "iwmmxt2".
>   (arch_enabled): Handle "iwmmxt2".
> 
> Thanks,
> Xinyu
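
The review comments quoted above (report hardware-level incompatibilities with error () rather than sorry (), and treat iWMMXt as excluded by all of Thumb, not just Thumb-2) can be sketched as a stand-alone check. This is only an illustration under stated assumptions: struct target_opts, check_iwmmxt_options and the message strings are hypothetical stand-ins, not the code in arm_option_override.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified view of the target options involved.  */
struct target_opts
{
  bool iwmmxt;      /* iWMMXt requested */
  bool hard_float;  /* hardware floating point requested */
  bool vfp;         /* the FPU is a VFP variant */
  bool neon;        /* NEON requested */
  bool thumb;       /* any Thumb mode (Thumb-1 or Thumb-2) */
};

/* Return a message describing the first unsupported combination, or
   NULL if the combination is acceptable.  In GCC proper such cases
   would be reported with error (), because the hardware cannot do
   them; sorry () is reserved for things GCC has not implemented yet.  */
static const char *
check_iwmmxt_options (const struct target_opts *o)
{
  assert (o != NULL);
  if (!o->iwmmxt)
    return NULL;
  if (o->hard_float && !o->vfp)
    return "iWMMXt and non-VFP floating point unit are incompatible";
  if (o->neon)
    return "iWMMXt and NEON are incompatible";
  if (o->thumb)
    return "iWMMXt unsupported under Thumb mode";
  return NULL;
}
```

A caller would pass the parsed options and emit the returned string as a hard error when it is not NULL.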


RE: PING: [PATCH, ARM, iWMMXt][2/5]: intrinsic head file change

2012-02-02 Thread Xinyu Qi
PING

http://gcc.gnu.org/ml/gcc-patches/2011-12/msg01788.html

At 2011-12-29 14:22:50,"Xinyu Qi"  wrote:
>   * config/arm/mmintrin.h: Use __IWMMXT__ to enable iWMMXt
> intrinsics.
>   Use __IWMMXT2__ to enable iWMMXt2 intrinsics.
>   Use C name-mangling for intrinsics.
>   (__v8qi): Redefine.
>   (_mm_cvtsi32_si64, _mm_andnot_si64, _mm_sad_pu8): Revise.
>   (_mm_sad_pu16, _mm_align_si64, _mm_setwcx, _mm_getwcx):
> Likewise.
>   (_m_from_int): Likewise.
>   (_mm_sada_pu8, _mm_sada_pu16): New intrinsic.
>   (_mm_alignr0_si64, _mm_alignr1_si64, _mm_alignr2_si64): Likewise.
>   (_mm_alignr3_si64, _mm_tandcb, _mm_tandch, _mm_tandcw): Likewise.
>   (_mm_textrcb, _mm_textrch, _mm_textrcw, _mm_torcb): Likewise.
>   (_mm_torch, _mm_torcw, _mm_tbcst_pi8, _mm_tbcst_pi16): Likewise.
>   (_mm_tbcst_pi32): Likewise.
>   (_mm_abs_pi8, _mm_abs_pi16, _mm_abs_pi32): New iWMMXt2
> intrinsic.
>   (_mm_addsubhx_pi16, _mm_absdiff_pu8, _mm_absdiff_pu16): Likewise.
>   (_mm_absdiff_pu32, _mm_addc_pu16, _mm_addc_pu32): Likewise.
>   (_mm_avg4_pu8, _mm_avg4r_pu8, _mm_maddx_pi16,
> _mm_maddx_pu16): Likewise.
>   (_mm_msub_pi16, _mm_msub_pu16, _mm_mulhi_pi32): Likewise.
>   (_mm_mulhi_pu32, _mm_mulhir_pi16, _mm_mulhir_pi32): Likewise.
>   (_mm_mulhir_pu16, _mm_mulhir_pu32, _mm_mullo_pi32): Likewise.
>   (_mm_qmulm_pi16, _mm_qmulm_pi32, _mm_qmulmr_pi16): Likewise.
>   (_mm_qmulmr_pi32, _mm_subaddhx_pi16, _mm_addbhusl_pu8):
> Likewise.
>   (_mm_addbhusm_pu8, _mm_qmiabb_pi32, _mm_qmiabbn_pi32):
> Likewise.
>   (_mm_qmiabt_pi32, _mm_qmiabtn_pi32, _mm_qmiatb_pi32): Likewise.
>   (_mm_qmiatbn_pi32, _mm_qmiatt_pi32, _mm_qmiattn_pi32): Likewise.
>   (_mm_wmiabb_si64, _mm_wmiabbn_si64, _mm_wmiabt_si64): Likewise.
>   (_mm_wmiabtn_si64, _mm_wmiatb_si64, _mm_wmiatbn_si64):
> Likewise.
>   (_mm_wmiatt_si64, _mm_wmiattn_si64, _mm_wmiawbb_si64):
> Likewise.
>   (_mm_wmiawbbn_si64, _mm_wmiawbt_si64, _mm_wmiawbtn_si64):
> Likewise.
>   (_mm_wmiawtb_si64, _mm_wmiawtbn_si64, _mm_wmiawtt_si64):
> Likewise.
>   (_mm_wmiawttn_si64, _mm_merge_si64): Likewise.
>   (_mm_torvscb, _mm_torvsch, _mm_torvscw): Likewise.
>   (_m_to_int): New define.
> 
> Thanks,
> Xinyu
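
The ChangeLog above gates the intrinsics on __IWMMXT__ and __IWMMXT2__. A minimal sketch of how user code might test those macros follows. The function names iwmmxt_level and have_iwmmxt2 are hypothetical, and the sketch assumes that a compiler defining __IWMMXT2__ targets a superset of iWMMXt. On targets where neither macro is defined it simply reports level 0.

```c
#include <assert.h>

/* Report which iWMMXt level the compiler was asked to target:
   2 for iWMMXt2, 1 for original iWMMXt, 0 for neither.  */
static int
iwmmxt_level (void)
{
#if defined (__IWMMXT2__)
  return 2;
#elif defined (__IWMMXT__)
  return 1;
#else
  return 0;
#endif
}

/* Hypothetical helper: nonzero if the iWMMXt2-only intrinsics may be
   used by the caller.  */
static int
have_iwmmxt2 (void)
{
  int level = iwmmxt_level ();
  assert (level >= 0 && level <= 2);
  return level == 2;
}
```

Code that must also run on original iWMMXt cores would keep a fallback path for the case where have_iwmmxt2 () returns 0.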


RE: PING: [PATCH, ARM, iWMMXt][3/5]: built in define and expand

2012-02-02 Thread Xinyu Qi
PING

http://gcc.gnu.org/ml/gcc-patches/2011-12/msg01789.html

At 2011-12-29 14:25:23,"Xinyu Qi"  wrote:
> > At 2011-11-24 09:27:04,"Xinyu Qi"  wrote:
> > > At 2011-11-19 07:08:22,"Ramana Radhakrishnan"
> > >  wrote:
> > > > On 20 October 2011 08:39, Xinyu Qi  wrote:
> > > > > Ping
> > > > >
> > > > > http://gcc.gnu.org/ml/gcc-patches/2011-07/msg01103.html
> > > > >
> > > > >        * config/arm/arm.c (enum arm_builtins): Revise built-in fcode.
> > > > >        (builtin_description bdesc_2arg): Revise built in declaration.
> > > > >        (builtin_description bdesc_1arg): Likewise.
> > > > >        (arm_init_iwmmxt_builtins): Revise built in initialization.
> > > > >        (arm_expand_builtin): Revise built in expansion.
> > > > >
> > > >
> > > > This currently doesn't apply - can you take a look ?
> > >
> > > Hi Ramana,
> > >
> > > I resolve the patch conflict with the newest trunk gcc. The resolved
> > > diff is attached.
> > >
> > > Thanks,
> > > Xinyu
> >
> > Update the built in expand. Remove some redundant code.
> > New diff is attached.
> >
> > Thanks,
> > Xinyu
> 
>   * config/arm/arm.c (enum arm_builtins): Revise built-in fcode.
>   (IWMMXT2_BUILTIN): New define.
>   (IWMMXT2_BUILTIN2): Likewise.
>   (iwmmx2_mbuiltin): Likewise.
>   (builtin_description bdesc_2arg): Revise built in declaration.
>   (builtin_description bdesc_1arg): Likewise.
>   (arm_init_iwmmxt_builtins): Revise built in initialization.
>   (arm_expand_builtin): Revise built in expansion.
> 
> Thanks,
> Xinyu


RE: PING: [PATCH, ARM, iWMMXt][4/5]: WMMX machine description

2012-02-02 Thread Xinyu Qi
PING

http://gcc.gnu.org/ml/gcc-patches/2011-12/msg01786.html

At 2011-12-29 14:12:44,"Xinyu Qi"  wrote:
> At 2011-12-22 17:53:45,"Richard Earnshaw"  wrote:
> > On 22/12/11 06:38, Xinyu Qi wrote:
> > > At 2011-12-15 01:32:13,"Richard Earnshaw"  wrote:
> > >> On 24/11/11 01:33, Xinyu Qi wrote:
> > >>> Hi Ramana,
> > >>>
> > >>> I solve the conflict, please try again. The new diff is attached.
> > >>>
> > >>> Thanks,
> > >>> Xinyu
> > >>>
> > >>> At 2011-11-19 07:36:15,"Ramana Radhakrishnan"
> > >>  wrote:
> > 
> >  Hi Xinyu,
> > 
> >  This doesn't apply cleanly currently on trunk and the reject
> >  appears to come from iwmmxt.md and I've not yet investigated why.
> > 
> >  Can you have a look ?
> > 
> > >>
> > >> This patch is NOT ok.
> > >>
> > >> You're adding features that were new in iWMMXt2 (ie not in the
> > >> original
> > >> implementation) but you've provided no means by which the compiler
> > >> can detect which operations are only available on the new cores.
> > >
> > > Hi Richard,
> > >
> > > All of the WMMX chips support WMMX2 instructions.
> >
> > This may be true for Marvell's current range of processors, but I find
> > it hard to reconcile with the assembler support in GAS, which clearly
> > distinguishes between iWMMXT and iWMMXT2 instruction sets.  Are you
> > telling me that no cores were ever manufactured (even by Intel) that
> > only supported iWMMXT?
> >
> > I'm concerned that this patch will break support for existing users
> > who have older chips (for GCC we have to go through a deprecation
> > cycle if we want to drop support for something we now believe is
> > no-longer worth maintaining).
> >
> > > What I do is to complement the WMMX2 intrinsic support in GCC.
> >
> > I understand that, and I'm not saying the patch can never go in; just
> > that it needs to separate out the support for the different
> > architecture variants.
> >
> > > I don't think it is necessary for users to consider whether one WMMX
> > > insn is a
> > WMMX2 insn or not.
> >
> > Users don't (unless they want their code to run on legacy processors
> > that only support the original instruction set), but the compiler
> > surely must know what it is targeting.  Remember that the instruction
> > patterns are not entirely black boxes, the compiler can do
> > optimizations on intrinsics (it's one of the reasons why they are
> > better than inline assembly).  Unless the compiler knows exactly what
> > instructions are legal, it could end up optimizing something that
> > started as a WMMX insn into something that's a WMMX2 insn (for
> > example, propagating a constant into a vector shift expression).
> >
> > R.
> 
> Hi, Richard,
> 
> You are right. There have been chips that only support WMMX instructions.
> I distinguish iWMMXt and iWMMXt2 in this patch update.
> 
> In current GCC, -march=iwmmxt and -march=iwmmxt2 (or -mcpu=iwmmxt and
> -mcpu=iwmmxt2) make almost no difference at compile time.
> I take advantage of them to do the work, that is, make -march=iwmmxt (or
> -mcpu=iwmmxt) support only the iWMMXt intrinsics, iWMMXt built-ins and WMMX
> instructions, and make -march=iwmmxt2 (or -mcpu=iwmmxt2) support iWMMXt2
> fully.
> 
> Define a new flag FL_IWMMXT2 to indicate that the chip supports the iWMMXt2
> extension; it directly controls the iWMMXt2 built-in initialization and the
> defines that follow.
> Define __IWMMXT2__ in TARGET_CPU_CPP_BUILTINS to control access to the
> iWMMXt2 intrinsics.
> Define TARGET_REALLY_IWMMXT2 to control access to the machine description
> of the WMMX2 instructions.
> In arm.md, define iwmmxt2 in the "arch" attr to control access to the
> alternative in the shift patterns.
> 
> The updated patch 4/5 is attached here. 1/5, 2/5 and 3/5 are updated
> accordingly and attached to the related mails.
> Please take a look and see whether this modification is appropriate.
> 
> Changelog:
> 
>   * config/arm/arm.c (arm_output_iwmmxt_shift_immediate): New
> function.
>   (arm_output_iwmmxt_tinsr): Likewise.
>   * config/arm/arm-protos.h (arm_output_iwmmxt_shift_immediate):
> Declare.
>   (arm_output_iwmmxt_tinsr): Likewise.
>   * config/arm/iwmmxt.md (WCGR0, WCGR1, WCGR2, WCGR3): New
> constant.
>   (iwmmxt_psadbw, iwmmxt_walign, iwmmxt_tmrc, iwmmxt_tmcr):
> Delete.
>   (rorv4hi3, rorv2si3, rordi3): Likewise.
>   (rorv4hi3_di, rorv2si3_di, rordi3_di): Likewise.
>   (ashrv4hi3_di, ashrv2si3_di, ashrdi3_di): Likewise.
>   (lshrv4hi3_di, lshrv2si3_di, lshrdi3_di): Likewise.
>   (ashlv4hi3_di, ashlv2si3_di, ashldi3_di): Likewise.
>   (iwmmxt_tbcstqi, iwmmxt_tbcsthi, iwmmxt_tbcstsi): Likewise
>   (*iwmmxt_clrv8qi, *iwmmxt_clrv4hi, *iwmmxt_clrv2si): Likewise.
>   (tbcstv8qi, tbcstv4hi, tbsctv2si): New pattern.
>   (iwmmxt_clrv8qi, iwmmxt_clrv4hi, iwmmxt_clrv2si): Likewise.
>   (*and3_iwmmxt, *ior3_iwmmxt, *xor3_iwmmxt):
> Likewise.
>   (ror3, ror3_di): Likewise.
>   (ashr3_di, lshr3_di, ashl3_di): Likewi

[google][4.6]Bug fix to function reordering linker plugin (issue5623048)

2012-02-02 Thread Sriraman Tallam
Fix a bug in the function reordering linker plugin where the number of nodes
to be reordered is incremented in the wrong place. This caused a heap buffer
to overflow under certain conditions.  

The linker plugin itself is only available in the google 4_6 branch and I will
port it to other branches and make it available for review for trunk soon.

* callgraph.c (parse_callgraph_section_contents): Remove increment
to num_real_nodes.
(set_node_type): Increment num_real_nodes.

Index: function_reordering_plugin/callgraph.c
===
--- function_reordering_plugin/callgraph.c  (revision 183860)
+++ function_reordering_plugin/callgraph.c  (working copy)
@@ -304,7 +304,6 @@ parse_callgraph_section_contents (unsigned char *s
   caller = caller + HEADER_LEN;
   curr_length = read_length;
   caller_node = get_function_node (caller);
-  num_real_nodes++;
 
   while (curr_length < length)
 {
@@ -422,7 +421,10 @@ static void set_node_type (Node *n)
   char *name = n->name;
   slot = htab_find_with_hash (section_map, name, htab_hash_string (name));
   if (slot != NULL)
-set_as_real_node (n);
+{
+  set_as_real_node (n);
+  num_real_nodes++;
+}
 }
 
 void

--
This patch is available for review at http://codereview.appspot.com/5623048


[Committed] Fix PR 47982 and PR 43967: __udivmod*

2012-02-02 Thread Andrew Pinski
Hi,
  This fixes the documentation about __udivmoddi4 and __udivmodti4.
There was a copy and paste error for both of these functions.
__udivmoddi3 was used instead of __udivmoddi4 and __udivti3 instead of
__udivmodti4.

Committed as obvious after double checking that these are the names of
the functions emitted by optabs.c

Thanks,
Andrew Pinski

ChangeLog:
* doc/libgcc.texi (__udivmoddi4/__udivmodti4): Fix documentation typo.
Index: doc/libgcc.texi
===
--- doc/libgcc.texi (revision 183861)
+++ doc/libgcc.texi (working copy)
@@ -106,8 +106,8 @@ These functions return the quotient of t
 and @var{b}.
 @end deftypefn
 
-@deftypefn {Runtime Function} {unsigned long} __udivmoddi3 (unsigned long @var{a}, unsigned long @var{b}, unsigned long *@var{c})
-@deftypefnx {Runtime Function} {unsigned long long} __udivti3 (unsigned long long @var{a}, unsigned long long @var{b}, unsigned long long *@var{c})
+@deftypefn {Runtime Function} {unsigned long} __udivmoddi4 (unsigned long @var{a}, unsigned long @var{b}, unsigned long *@var{c})
+@deftypefnx {Runtime Function} {unsigned long long} __udivmodti4 (unsigned long long @var{a}, unsigned long long @var{b}, unsigned long long *@var{c})
 These functions calculate both the quotient and remainder of the unsigned
 division of @var{a} and @var{b}.  The return value is the quotient, and
 the remainder is placed in variable pointed to by @var{c}.
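
The contract stated in the documentation text above (quotient returned, remainder stored through the third argument) can be illustrated with a portable model. This is a sketch only: udivmod_model is a hypothetical name, it is not libgcc's implementation, and requiring a non-null remainder pointer is an assumption made here for simplicity.

```c
#include <assert.h>

/* Hypothetical portable model of the documented behaviour: return the
   quotient of A / B and store the remainder through C.  */
static unsigned long
udivmod_model (unsigned long a, unsigned long b, unsigned long *c)
{
  assert (b != 0);   /* division by zero is undefined */
  assert (c != 0);   /* assumption of this sketch: C must be valid */
  *c = a % b;
  return a / b;
}
```

For example, dividing 17 by 5 with this model returns 3 and stores 2 through the pointer.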


[PATCH] fix PR51910, take 2

2012-02-02 Thread Sandra Loosemore
Here is another attempt to fix the bad interaction between 
--with-demangler-in-ld and -frepo processing.


This version of the patch always disables demangling in ld when 
repository files are present for the link.  If demangling is requested 
(either by an explicit -Wl,--demangle option, by using 
COLLECT_NO_DEMANGLE, or just doing nothing and getting the default), 
it'll be performed by having collect2 filter the output streams no 
matter how you configured --with-demangler-in-ld.  The exception is if 
-Wl,-Map is also provided, in which case it falls back on re-running the 
final link with the correct options to get a demangled map file after it 
is done with the repo file processing.
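
A rough sketch of the decision described above, assuming the runtime test works as in the dump_file hunk of the patch (demangle in collect2 only when the user has not disabled demangling and the linker is not doing it for this run). The names fake_demangle and format_symbol are hypothetical; fake_demangle merely tags the symbol and stands in for the real demangler.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical stand-in for the real demangler: it only tags the
   symbol so that the control flow can be observed.  */
static void
fake_demangle (const char *sym, char *out, size_t outlen)
{
  snprintf (out, outlen, "demangled(%s)", sym);
}

/* Format SYM into OUT.  NO_DEMANGLE mirrors the user's request
   (--no-demangle or COLLECT_NO_DEMANGLE); DEMANGLE_IN_LD says whether
   the linker already demangles for this particular run.  Both are
   ordinary run-time values here, not compile-time conditionals.  */
static void
format_symbol (const char *sym, int no_demangle, int demangle_in_ld,
               char *out, size_t outlen)
{
  assert (sym != NULL && out != NULL && outlen > 0);
  if (no_demangle || demangle_in_ld)
    snprintf (out, outlen, "%s", sym);   /* pass through unchanged */
  else
    fake_demangle (sym, out, outlen);
}
```

With this shape, the repository-processing path can pass 0 for demangle_in_ld even in a build configured with --with-demangler-in-ld, which is the situation the patch is trying to handle.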


I could have implemented that last bit by having collect2 intercept the 
map file option and filter the file to demangle it, and make that the 
default behavior when collect2 is otherwise doing demangling itself. 
Given the previous argument that users can't possibly have any reason to 
want a demangled link map, I thought there might be objection to 
polluting collect2 by hacking it up to do just that.  I mean, it's 
horrible enough that ld produces a demangled map file when you request 
demangled output  :-P


Anyway, thoughts on this patch?  It bootstrapped and regression tested 
OK on i686-linux, and I hand-tested it on every permutation of 
mangling/repository/map options I could think of.


-Sandra


2012-02-02  Sandra Loosemore 
Jason Merrill 
Jakub Jelinek 

PR c++/51910
gcc/
* collect2.c (dump_file): Add demangle_in_ld parameter so this
can be tested dynamically instead of only at compile-time.  Update
callers.
(main): Initialize no_demangle even when HAVE_LD_DEMANGLE.
* collect2.h (dump_file): Adjust declaration.
(no_demangle): Declare as extern.
(DEMANGLE_IN_LD): Define.
* tlink.c (do_tlink): Explicitly pass --no-demangle to linker
for repo processing when HAVE_LD_DEMANGLE is defined, and demangle
in collect2 instead of the linker.

gcc/testsuite/
* g++.dg/torture/pr51910.C: New testcase.


Index: gcc/collect2.c
===
--- gcc/collect2.c	(revision 183674)
+++ gcc/collect2.c	(working copy)
@@ -403,13 +403,13 @@ collect_exit (int status)
 
   if (ldout != 0 && ldout[0])
 {
-  dump_file (ldout, stdout);
+  dump_file (ldout, stdout, DEMANGLE_IN_LD);
   maybe_unlink (ldout);
 }
 
   if (lderrout != 0 && lderrout[0])
 {
-  dump_file (lderrout, stderr);
+  dump_file (lderrout, stderr, DEMANGLE_IN_LD);
   maybe_unlink (lderrout);
 }
 
@@ -515,7 +515,7 @@ extract_string (const char **pp)
 }
 
 void
-dump_file (const char *name, FILE *to)
+dump_file (const char *name, FILE *to, int demangle_in_ld)
 {
   FILE *stream = fopen (name, "r");
 
@@ -540,14 +540,10 @@ dump_file (const char *name, FILE *to)
 	  if (!strncmp (p, USER_LABEL_PREFIX, strlen (USER_LABEL_PREFIX)))
 	p += strlen (USER_LABEL_PREFIX);
 
-#ifdef HAVE_LD_DEMANGLE
-	  result = 0;
-#else
-	  if (no_demangle)
+	  if (no_demangle || demangle_in_ld)
 	result = 0;
 	  else
 	result = cplus_demangle (p, DMGL_PARAMS | DMGL_ANSI | DMGL_VERBOSE);
-#endif
 
 	  if (result)
 	{
@@ -1114,9 +1110,8 @@ main (int argc, char **argv)
 
   num_c_args = argc + 9;
 
-#ifndef HAVE_LD_DEMANGLE
   no_demangle = !! getenv ("COLLECT_NO_DEMANGLE");
-
+#ifndef HAVE_LD_DEMANGLE
   /* Suppress demangling by the real linker, which may be broken.  */
   putenv (xstrdup ("COLLECT_NO_DEMANGLE=1"));
 #endif
@@ -1566,15 +1561,14 @@ main (int argc, char **argv)
 	case '-':
 	  if (strcmp (arg, "--no-demangle") == 0)
 		{
-#ifndef HAVE_LD_DEMANGLE
 		  no_demangle = 1;
+#ifndef HAVE_LD_DEMANGLE
 		  ld1--;
 		  ld2--;
 #endif
 		}
 	  else if (strncmp (arg, "--demangle", 10) == 0)
 		{
-#ifndef HAVE_LD_DEMANGLE
 		  no_demangle = 0;
 		  if (arg[10] == '=')
 		{
@@ -1585,6 +1579,7 @@ main (int argc, char **argv)
 		  else
 			current_demangling_style = style;
 		}
+#ifndef HAVE_LD_DEMANGLE
 		  ld1--;
 		  ld2--;
 #endif
Index: gcc/collect2.h
===
--- gcc/collect2.h	(revision 183674)
+++ gcc/collect2.h	(working copy)
@@ -30,7 +30,7 @@ extern void collect_exit (int) ATTRIBUTE
 
 extern int collect_wait (const char *, struct pex_obj *);
 
-extern void dump_file (const char *, FILE *);
+extern void dump_file (const char *, FILE *, int);
 
 extern int file_exists (const char *);
 
@@ -40,6 +40,13 @@ extern const char *c_file_name;
 extern struct obstack temporary_obstack;
 extern char *temporary_firstobj;
 extern bool vflag, debug;
+extern int no_demangle;
+
+#ifdef HAVE_LD_DEMANGLE
+#define DEMANGLE_IN_LD 1
+#else
+#define DEMANGLE_IN_LD 0
+#endif
 
 extern void notice_translated (const char *, ...) ATTRIBUTE_PRINTF_1;
 extern void notice (const char *, ...) AT

Re: [google][4.6]Bug fix to function reordering linker plugin (issue5623048)

2012-02-02 Thread Xinliang David Li
The code before the change seems to over-estimate the number of real
nodes, which should be safe -- can you explain why it causes a problem?

David

On Thu, Feb 2, 2012 at 6:13 PM, Sriraman Tallam  wrote:
> Fix a bug in the function reordering linker plugin where the number of nodes
> to be reordered is incremented in the wrong place. This caused a heap buffer
> to overflow under certain conditions.
>
> The linker plugin itself is only available in the google 4_6 branch and I will
> port it to other branches and make it available for review for trunk soon.
>
>        * callgraph.c (parse_callgraph_section_contents): Remove increment
>        to num_real_nodes.
>        (set_node_type): Increment num_real_nodes.
>
> Index: function_reordering_plugin/callgraph.c
> ===
> --- function_reordering_plugin/callgraph.c      (revision 183860)
> +++ function_reordering_plugin/callgraph.c      (working copy)
> @@ -304,7 +304,6 @@ parse_callgraph_section_contents (unsigned char *s
>   caller = caller + HEADER_LEN;
>   curr_length = read_length;
>   caller_node = get_function_node (caller);
> -  num_real_nodes++;
>
>   while (curr_length < length)
>     {
> @@ -422,7 +421,10 @@ static void set_node_type (Node *n)
>   char *name = n->name;
>   slot = htab_find_with_hash (section_map, name, htab_hash_string (name));
>   if (slot != NULL)
> -    set_as_real_node (n);
> +    {
> +      set_as_real_node (n);
> +      num_real_nodes++;
> +    }
>  }
>
>  void
>
> --
> This patch is available for review at http://codereview.appspot.com/5623048


MAINTAINERS: add myself

2012-02-02 Thread Jayant R. Sonar
Committed.

   * MAINTAINERS (Write After Approval): Add myself.

Index: MAINTAINERS
===
--- MAINTAINERS (revision 183832)
+++ MAINTAINERS (working copy)
@@ -495,6 +495,7 @@
 Franz Sirl franz.sirl-ker...@lauterbach.com
 Jan Sjodin jan.sjo...@amd.com
 Edward Smith-Rowland   3dw...@verizon.net
+Jayant Sonar   jayant.so...@kpitcummins.com
 Michael Sokolov msoko...@ivan.harhan.org
 Richard Stallman   r...@gnu.org
 Basile Starynkevitch   bas...@starynkevitch.net




RE: [PATCH] [MIPS] fix mips_prepend insn.

2012-02-02 Thread Fu, Chao-Ying
Richard Sandiford [rdsandif...@googlemail.com] wrote:

> This pattern maps directly to __builtin_mips_prepend, which is defined
> to take and return an SI type, so :SI is the correct choice here.
>
> I agree it might be nice to have a function that operates on 64-bit
> values for 64-bit targets though.  For compatibility reasons,
> we'd probably have to define both a new function and a new pattern
> (or at least an iterator version of the current pattern).
> There's currently no way of generating PREPENDD or PREPENDW either.
>
> I consider this API to be owned by MTI, so we'd need to coordinate
> with them.  Chao-Ying, what do you think?  Do you already have
> something like this internally?

  As there is no MIPS64 core with dsp/dspr2 in the market, we haven't 
tested/ported these built-in functions to MIPS64.

  I think it's better to keep the current prototype of __builtin_mips_prepend 
to return an SI type.
We may need to add new built-in functions for instructions involving 
accumulators and for new MIPS64 dsp/dspr2 instructions to MIPS64.
For existing built-in functions involving only integer registers/immediates, we 
may not change them.
For Liu, is this solution suitable for you?   Thanks!

Regards,
Chao-ying


Re: [google][4.6]Bug fix to function reordering linker plugin (issue5623048)

2012-02-02 Thread Sriraman Tallam
Hi David,

A .gnu.callgraph.text section for function foo will only contain edges
where foo is the caller. Also, in my code real nodes correspond to
functions whose text section is available and hence reorderable.

Originally, when I was counting the number of real function nodes, I
was only counting those functions with a .gnu.callgraph.text section.
Those are all real, but there can be other real function nodes too, so
this is an under-estimate. For instance, there can be leaf functions which
have no callgraph sections since they don't call anything but can still
be reordered. I was not counting these as real function nodes but I
was marking them later as real.  So, the count and the actual number can
differ. I use the count to malloc a buffer, which gets under-allocated
and overflows.

I have now changed the code to increment the number of real
function nodes at the point where I mark a node as real and there is
only one place where I mark it. Hence, this is safe.
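
A minimal sketch of the idea, assuming a much-reduced node type: the counter is bumped in the same place a node is marked real, and the buffer is sized from that counter afterwards. The names node, mark_real and collect_real_nodes are hypothetical and are not the plugin's actual code; the guard against marking a node twice is also an addition of this sketch.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical, much-reduced node type.  */
typedef struct node
{
  const char *name;
  int is_real;   /* text section available, hence reorderable */
} node;

static unsigned int num_real_nodes;

/* Mark N as real and count it in the same place, so the count cannot
   drift from the number of nodes actually marked.  The is_real guard
   (an assumption of this sketch) keeps a repeated call from counting
   the same node twice.  */
static void
mark_real (node *n)
{
  if (!n->is_real)
    {
      n->is_real = 1;
      num_real_nodes++;
    }
}

/* Collect pointers to all real nodes.  The buffer is sized from
   num_real_nodes, which is safe only because mark_real is the single
   place where a node becomes real.  The caller frees the result.  */
static node **
collect_real_nodes (node *all, size_t n_all)
{
  node **out = malloc ((num_real_nodes ? num_real_nodes : 1) * sizeof *out);
  size_t i, k = 0;

  if (out == NULL)
    return NULL;
  for (i = 0; i < n_all; i++)
    if (all[i].is_real)
      {
        assert (k < num_real_nodes);   /* would fire on an under-count */
        out[k++] = &all[i];
      }
  return out;
}
```

Had the counter been incremented only while parsing callgraph sections, a leaf function marked real later would make the assertion in collect_real_nodes fire, which corresponds to the overflow described above.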

Thanks,
-Sri.



On Thu, Feb 2, 2012 at 8:23 PM, Xinliang David Li  wrote:
> This code before the change seems to over-estimate the number of real
> nodes which should be safe -- can you explain why it causes problem?
>
> David
>
> On Thu, Feb 2, 2012 at 6:13 PM, Sriraman Tallam  wrote:
>> Fix a bug in the function reordering linker plugin where the number of nodes
>> to be reordered is incremented in the wrong place. This caused a heap buffer
>> to overflow under certain conditions.
>>
>> The linker plugin itself is only available in the google 4_6 branch and I 
>> will
>> port it to other branches and make it available for review for trunk soon.
>>
>>        * callgraph.c (parse_callgraph_section_contents): Remove increment
>>        to num_real_nodes.
>>        (set_node_type): Increment num_real_nodes.
>>
>> Index: function_reordering_plugin/callgraph.c
>> ===
>> --- function_reordering_plugin/callgraph.c      (revision 183860)
>> +++ function_reordering_plugin/callgraph.c      (working copy)
>> @@ -304,7 +304,6 @@ parse_callgraph_section_contents (unsigned char *s
>>   caller = caller + HEADER_LEN;
>>   curr_length = read_length;
>>   caller_node = get_function_node (caller);
>> -  num_real_nodes++;
>>
>>   while (curr_length < length)
>>     {
>> @@ -422,7 +421,10 @@ static void set_node_type (Node *n)
>>   char *name = n->name;
>>   slot = htab_find_with_hash (section_map, name, htab_hash_string (name));
>>   if (slot != NULL)
>> -    set_as_real_node (n);
>> +    {
>> +      set_as_real_node (n);
>> +      num_real_nodes++;
>> +    }
>>  }
>>
>>  void
>>
>> --
>> This patch is available for review at http://codereview.appspot.com/5623048


Re: [google][4.6]Bug fix to function reordering linker plugin (issue5623048)

2012-02-02 Thread Xinliang David Li
Right -- I examined how the arrays are used. The fix looks safe.

Ok for google branches.

David

On Thu, Feb 2, 2012 at 9:23 PM, Sriraman Tallam  wrote:
> Hi David,
>
> A .gnu.callgraph.text section for function foo will only contain edges
> where foo is the caller. Also, in my code real nodes correspond to
> functions whose text section is available and hence reorderable.
>
> Originally, when I was counting the number of real function nodes, I
> was only treating those functions with a .gnu.callgraph.text section.
> This is correct but there can be other real function nodes too so this
> is an under-estimate. For instance, there can be leaf functions which
> would have no callgraph sections since they dont call anything but can
> be reordered. I was not counting these as real function nodes but I
> was marking them later as real.  So, the count and the actual can
> differ. I use the count to malloc a buffer which gets underallocated
> and overflows.
>
> Now, I have changed the code now to increment the number of real
> function nodes at the point where I mark a node as real and there is
> only one place where I mark it. Hence, this is safe.
>
> Thanks,
> -Sri.
>
>
>
> On Thu, Feb 2, 2012 at 8:23 PM, Xinliang David Li  wrote:
>> This code before the change seems to over-estimate the number of real
>> nodes which should be safe -- can you explain why it causes problem?
>>
>> David
>>
>> On Thu, Feb 2, 2012 at 6:13 PM, Sriraman Tallam  wrote:
>>> Fix a bug in the function reordering linker plugin where the number of nodes
>>> to be reordered is incremented in the wrong place. This caused a heap buffer
>>> to overflow under certain conditions.
>>>
>>> The linker plugin itself is only available in the google 4_6 branch and I 
>>> will
>>> port it to other branches and make it available for review for trunk soon.
>>>
>>>        * callgraph.c (parse_callgraph_section_contents): Remove increment
>>>        to num_real_nodes.
>>>        (set_node_type): Increment num_real_nodes.
>>>
>>> Index: function_reordering_plugin/callgraph.c
>>> ===
>>> --- function_reordering_plugin/callgraph.c      (revision 183860)
>>> +++ function_reordering_plugin/callgraph.c      (working copy)
>>> @@ -304,7 +304,6 @@ parse_callgraph_section_contents (unsigned char *s
>>>   caller = caller + HEADER_LEN;
>>>   curr_length = read_length;
>>>   caller_node = get_function_node (caller);
>>> -  num_real_nodes++;
>>>
>>>   while (curr_length < length)
>>>     {
>>> @@ -422,7 +421,10 @@ static void set_node_type (Node *n)
>>>   char *name = n->name;
>>>   slot = htab_find_with_hash (section_map, name, htab_hash_string (name));
>>>   if (slot != NULL)
>>> -    set_as_real_node (n);
>>> +    {
>>> +      set_as_real_node (n);
>>> +      num_real_nodes++;
>>> +    }
>>>  }
>>>
>>>  void
>>>
>>> --
>>> This patch is available for review at http://codereview.appspot.com/5623048


Re: [PATCH] [MIPS] fix mips_prepend insn.

2012-02-02 Thread Liu
On Fri, Feb 3, 2012 at 12:40 PM, Fu, Chao-Ying  wrote:
> Richard Sandiford [rdsandif...@googlemail.com] wrote:
>
>> This pattern maps directly to __builtin_mips_prepend, which is defined
>> to take and return an SI type, so :SI is the correct choice here.
>>
>> I agree it might be nice to have a function that operates on 64-bit
>> values for 64-bit targets though.  For compatibility reasons,
>> we'd probably have to define both a new function and a new pattern
>> (or at least an iterator version of the current pattern).
>> There's currently no way of generating PREPENDD or PREPENDW either.
>>
>> I consider this API to be owned by MTI, so we'd need to coordinate
>> with them.  Chao-Ying, what do you think?  Do you already have
>> something like this internally?
>
>  As there is no MIPS64 core with dsp/dspr2 in the market, we haven't 
> tested/ported these built-in functions to MIPS64.
>
>  I think it's better to keep the current prototype of __builtin_mips_prepend 
> to return an SI type.
> We may need to add new built-in functions for instructions involving 
> accumulators and for new MIPS64 dsp/dspr2 instructions to MIPS64.
> For existing built-in functions involving only integer registers/immediates, 
> we may not change them.
> For Liu, is this solution suitable for you?   Thanks!
>

OK, I get it.
But, sorry, my mips64dspr2 patch is already done...
Should I submit it?

> Regards,
> Chao-ying


[PATCH] [MIPS] [GCC4.8] add MIPS64DSPR2 support.

2012-02-02 Thread Liu
Hi all

I've added MIPS64DSPR2 support to gcc.
Please review.

Thanks.


gcc/
2012-02-03  Jia Liu  

       * config/mips/mips-dspr2.md : add MIPS64DSPR2 insns.

   * config/mips/mips-ftypes.def : builtin type for MIPS64DSPR2 insns.

   * config/mips/mips.c: builtins for MIPS64DSPR2 insns.

   * doc/extend.texi : builtins protos for MIPS64DSPR2 insns.

gcc/testsuite/
2012-02-03  Jia Liu  

   * testsuite/gcc.target/mips/mips64-dspr2.c : New test.
From b21e65c85d87694fb8bbc24f0db7b9e56c378fd1 Mon Sep 17 00:00:00 2001
From: Jia Liu 
Date: Fri, 3 Feb 2012 14:50:44 +0800
Subject: [PATCH] add MIPS64DSPR2 support
Content-Type: text/plain; charset="utf-8"

---
 gcc/config/mips/mips-dspr2.md|  348 ++
 gcc/config/mips/mips-ftypes.def  |6 +
 gcc/config/mips/mips.c   |   28 ++
 gcc/doc/extend.texi  |   24 ++
 gcc/testsuite/gcc.target/mips/mips64-dspr2.c |  197 +++
 5 files changed, 603 insertions(+), 0 deletions(-)
 create mode 100644 gcc/testsuite/gcc.target/mips/mips64-dspr2.c

diff --git a/gcc/config/mips/mips-dspr2.md b/gcc/config/mips/mips-dspr2.md
index 108f51b..37982bd 100644
--- a/gcc/config/mips/mips-dspr2.md
+++ b/gcc/config/mips/mips-dspr2.md
@@ -20,17 +20,29 @@
 
 (define_c_enum "unspec" [
   UNSPEC_ABSQ_S_QB
+  UNSPEC_ABSQ_S_OB
   UNSPEC_ADDU_PH
+  UNSPEC_ADDU_QH
   UNSPEC_ADDU_S_PH
+  UNSPEC_ADDU_S_QH
   UNSPEC_ADDUH_QB
+  UNSPEC_ADDUH_OB
   UNSPEC_ADDUH_R_QB
+  UNSPEC_ADDUH_R_OB
   UNSPEC_APPEND
+  UNSPEC_DAPPEND
   UNSPEC_BALIGN
+  UNSPEC_DBALIGN
   UNSPEC_CMPGDU_EQ_QB
+  UNSPEC_CMPGDU_EQ_OB
   UNSPEC_CMPGDU_LT_QB
+  UNSPEC_CMPGDU_LT_OB
   UNSPEC_CMPGDU_LE_QB
+  UNSPEC_CMPGDU_LE_OB
   UNSPEC_DPA_W_PH
+  UNSPEC_DPA_W_QH
   UNSPEC_DPS_W_PH
+  UNSPEC_DPS_W_QH
   UNSPEC_MADD
   UNSPEC_MADDU
   UNSPEC_MSUB
@@ -44,16 +56,28 @@
   UNSPEC_MULT
   UNSPEC_MULTU
   UNSPEC_PRECR_QB_PH
+  UNSPEC_PRECR_OB_QH
   UNSPEC_PRECR_SRA_PH_W
+  UNSPEC_PRECR_SRA_QH_PW
   UNSPEC_PRECR_SRA_R_PH_W
+  UNSPEC_PRECR_SRA_R_QH_PW
   UNSPEC_PREPEND
+  UNSPEC_PREPENDD
+  UNSPEC_PREPENDW
   UNSPEC_SHRA_QB
   UNSPEC_SHRA_R_QB
+  UNSPEC_SHRA_OB
+  UNSPEC_SHRA_R_OB
   UNSPEC_SHRL_PH
+  UNSPEC_SHRL_QH
   UNSPEC_SUBU_PH
   UNSPEC_SUBU_S_PH
+  UNSPEC_SUBU_QH
+  UNSPEC_SUBU_S_QH
   UNSPEC_SUBUH_QB
   UNSPEC_SUBUH_R_QB
+  UNSPEC_SUBUH_OB
+  UNSPEC_SUBUH_R_OB
   UNSPEC_ADDQH_PH
   UNSPEC_ADDQH_R_PH
   UNSPEC_ADDQH_W
@@ -82,6 +106,18 @@
   [(set_attr "type"	"arith")
(set_attr "mode"	"SI")])
 
+(define_insn "mips_absq_s_ob"
+  [(parallel
+[(set (match_operand:V8QI 0 "register_operand" "=d")
+	  (unspec:V8QI [(match_operand:V8QI 1 "reg_or_0_operand" "dYG")]
+		   UNSPEC_ABSQ_S_OB))
+ (set (reg:CCDSP CCDSP_OU_REGNUM)
+	  (unspec:CCDSP [(match_dup 1)] UNSPEC_ABSQ_S_OB))])]
+  "ISA_HAS_DSPR2"
+  "absq_s.ob\t%0,%z1"
+  [(set_attr "type"	"arith")
+   (set_attr "mode"	"DI")])
+
 (define_insn "mips_addu_ph"
   [(parallel
 [(set (match_operand:V2HI 0 "register_operand" "=d")
@@ -94,6 +130,18 @@
   [(set_attr "type"	"arith")
(set_attr "mode"	"SI")])
 
+(define_insn "mips_addu_qh"
+  [(parallel
+[(set (match_operand:V4HI 0 "register_operand" "=d")
+	  (plus:V4HI (match_operand:V4HI 1 "reg_or_0_operand" "dYG")
+		 (match_operand:V4HI 2 "reg_or_0_operand" "dYG")))
+ (set (reg:CCDSP CCDSP_OU_REGNUM)
+	  (unspec:CCDSP [(match_dup 1) (match_dup 2)] UNSPEC_ADDU_QH))])]
+  "ISA_HAS_DSPR2"
+  "addu.qh\t%0,%z1,%z2"
+  [(set_attr "type"	"arith")
+   (set_attr "mode"	"DI")])
+
 (define_insn "mips_addu_s_ph"
   [(parallel
 [(set (match_operand:V2HI 0 "register_operand" "=d")
@@ -107,6 +155,19 @@
   [(set_attr "type"	"arith")
(set_attr "mode"	"SI")])
 
+(define_insn "mips_addu_s_qh"
+  [(parallel
+[(set (match_operand:V4HI 0 "register_operand" "=d")
+	  (unspec:V4HI [(match_operand:V4HI 1 "reg_or_0_operand" "dYG")
+			(match_operand:V4HI 2 "reg_or_0_operand" "dYG")]
+		   UNSPEC_ADDU_S_QH))
+ (set (reg:CCDSP CCDSP_OU_REGNUM)
+	  (unspec:CCDSP [(match_dup 1) (match_dup 2)] UNSPEC_ADDU_S_QH))])]
+  "ISA_HAS_DSPR2"
+  "addu_s.qh\t%0,%z1,%z2"
+  [(set_attr "type"	"arith")
+   (set_attr "mode"	"DI")])
+
 (define_insn "mips_adduh_qb"
   [(set (match_operand:V4QI 0 "register_operand" "=d")
 	(unspec:V4QI [(match_operand:V4QI 1 "reg_or_0_operand" "dYG")
@@ -117,6 +178,16 @@
   [(set_attr "type"	"arith")
(set_attr "mode"	"SI")])
 
+(define_insn "mips_adduh_ob"
+  [(set (match_operand:V8QI 0 "register_operand" "=d")
+	(unspec:V8QI [(match_operand:V8QI 1 "reg_or_0_operand" "dYG")
+		  (match_operand:V8QI 2 "reg_or_0_operand" "dYG")]
+		 UNSPEC_ADDUH_OB))]
+  "ISA_HAS_DSPR2"
+  "adduh.ob\t%0,%z1,%z2"
+  [(set_attr "type"	"arith")
+   (set_attr "mode"	"DI")])
+
 (define_insn "mips_adduh_r_qb"
   [(set (match_operand:V4QI 0 "register_operand" "=d")
 	(unspec:V4QI [(match_operand:V4QI 1 "reg_or_0_operand" "dYG")
@@ -127,6 +198,16 @@
   [(set_attr "type"	"arith")
(set_attr "mode"	"SI")])