[PATCH] Fix cxx_eval_bit_field_ref (PR c++/49136)

2011-05-25 Thread Jakub Jelinek
Hi!

optimize_bit_field_compare during folding can create BIT_FIELD_REFs
that reference more than a single bitfield, then mask the right bits from
it.  The following patch changes cxx_eval_bit_field_ref to be able to read
the multiple fields from the constructor.

Bootstrapped/regtested on x86_64-linux and i686-linux, acked by Jason in
bugzilla, committed to trunk/4.6.
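
The mask, shift and OR steps the patch performs can be sketched outside the compiler. This is only an illustration, not GCC code: `struct field` and `extract_bits` are hypothetical names, plain `uint32_t` values stand in for tree constants, and only little-endian bit numbering is handled (the patch itself adjusts the shift when BYTES_BIG_ENDIAN is set).

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for one constructor element: a field at bit
   offset BIT, SZ bits wide, holding VALUE.  */
struct field
{
  unsigned bit;
  unsigned sz;
  uint32_t value;
};

/* Combine every field lying entirely inside [istart, istart + isize)
   into one word, the way a multi-field bit-field read would see it.
   Little-endian bit numbering only; ISIZE and each SZ are assumed to
   be between 1 and 32.  */
static uint32_t
extract_bits (const struct field *fields, size_t n,
              unsigned istart, unsigned isize)
{
  uint32_t retval = 0;
  for (size_t i = 0; i < n; i++)
    {
      unsigned bit = fields[i].bit;
      unsigned sz = fields[i].sz;
      if (bit >= istart && bit + sz <= istart + isize)
        {
          /* Keep only the low SZ bits of the value.  */
          uint32_t mask = sz == 32 ? UINT32_MAX : (UINT32_C (1) << sz) - 1;
          /* Move the field to its offset within the reference.  */
          retval |= (fields[i].value & mask) << (bit - istart);
        }
    }
  return retval;
}
```

For the `day` struct in the first new test (d in bits 0-4 set to 0, n in bits 5-7 set to 7), reading 8 bits from offset 0 yields 0xE0, from which the folded comparison can mask out the bits of `n`.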

2011-05-25  Jakub Jelinek  ja...@redhat.com

PR c++/49136
* semantics.c (cxx_eval_bit_field_ref): Handle the
case when BIT_FIELD_REF doesn't cover only a single field.

* g++.dg/cpp0x/constexpr-bitfield2.C: New test.
* g++.dg/cpp0x/constexpr-bitfield3.C: New test.

--- gcc/cp/semantics.c.jj   2011-05-20 08:14:06.0 +0200
+++ gcc/cp/semantics.c  2011-05-24 18:57:00.0 +0200
@@ -6442,6 +6442,9 @@ cxx_eval_bit_field_ref (const constexpr_
bool *non_constant_p)
 {
   tree orig_whole = TREE_OPERAND (t, 0);
+  tree retval, fldval, utype, mask;
+  bool fld_seen = false;
+  HOST_WIDE_INT istart, isize;
   tree whole = cxx_eval_constant_expression (call, orig_whole,
 allow_non_constant, addr,
 non_constant_p);
@@ -6462,12 +6465,47 @@ cxx_eval_bit_field_ref (const constexpr_
 return t;
 
   start = TREE_OPERAND (t, 2);
+  istart = tree_low_cst (start, 0);
+  isize = tree_low_cst (TREE_OPERAND (t, 1), 0);
+  utype = TREE_TYPE (t);
+  if (!TYPE_UNSIGNED (utype))
+utype = build_nonstandard_integer_type (TYPE_PRECISION (utype), 1);
+  retval = build_int_cst (utype, 0);
   FOR_EACH_CONSTRUCTOR_ELT (CONSTRUCTOR_ELTS (whole), i, field, value)
 {
-  if (bit_position (field) == start)
+  tree bitpos = bit_position (field);
+  if (bitpos == start && DECL_SIZE (field) == TREE_OPERAND (t, 1))
return value;
+  if (TREE_CODE (TREE_TYPE (field)) == INTEGER_TYPE
+      && TREE_CODE (value) == INTEGER_CST
+      && host_integerp (bitpos, 0)
+      && host_integerp (DECL_SIZE (field), 0))
+   {
+ HOST_WIDE_INT bit = tree_low_cst (bitpos, 0);
+ HOST_WIDE_INT sz = tree_low_cst (DECL_SIZE (field), 0);
+ HOST_WIDE_INT shift;
+ if (bit >= istart && bit + sz <= istart + isize)
+   {
+ fldval = fold_convert (utype, value);
+ mask = build_int_cst_type (utype, -1);
+ mask = fold_build2 (LSHIFT_EXPR, utype, mask,
+ size_int (TYPE_PRECISION (utype) - sz));
+ mask = fold_build2 (RSHIFT_EXPR, utype, mask,
+ size_int (TYPE_PRECISION (utype) - sz));
+ fldval = fold_build2 (BIT_AND_EXPR, utype, fldval, mask);
+ shift = bit - istart;
+ if (BYTES_BIG_ENDIAN)
+   shift = TYPE_PRECISION (utype) - shift - sz;
+ fldval = fold_build2 (LSHIFT_EXPR, utype, fldval,
+   size_int (shift));
+ retval = fold_build2 (BIT_IOR_EXPR, utype, retval, fldval);
+ fld_seen = true;
+   }
+   }
 }
-  gcc_unreachable();
+  if (fld_seen)
+return fold_convert (TREE_TYPE (t), retval);
+  gcc_unreachable ();
   return error_mark_node;
 }
 
--- gcc/testsuite/g++.dg/cpp0x/constexpr-bitfield2.C.jj 2011-05-24 14:37:39.0 +0200
+++ gcc/testsuite/g++.dg/cpp0x/constexpr-bitfield2.C    2011-05-24 14:36:43.0 +0200
@@ -0,0 +1,19 @@
+// PR c++/49136
+// { dg-do compile }
+// { dg-options "-std=c++0x" }
+
+struct day
+{
+  unsigned d : 5;
+  unsigned n : 3;
+  constexpr explicit day (int dd) : d(dd), n(7) {}
+};
+
+struct date {
+  int d;
+  constexpr date (day dd) : d(dd.n != 7 ? 7 : dd.d) {}
+};
+
+constexpr day d(0);
+constexpr date dt(d);
+static_assert (dt.d == 0, "Error");
--- gcc/testsuite/g++.dg/cpp0x/constexpr-bitfield3.C.jj 2011-05-24 14:37:43.0 +0200
+++ gcc/testsuite/g++.dg/cpp0x/constexpr-bitfield3.C    2011-05-24 14:43:40.0 +0200
@@ -0,0 +1,33 @@
+// PR c++/49136
+// { dg-do compile }
+// { dg-options "-std=c++0x" }
+
+struct S
+{
+  unsigned : 1; unsigned s : 27; unsigned : 4;
+  constexpr S (unsigned int x) : s(x) {}
+};
+
+template <typename S>
+struct T
+{
+  unsigned int t;
+  constexpr T (S s) : t(s.s != 7 ? 0 : s.s) {}
+  constexpr T (S s, S s2) : t(s.s != s2.s ? 0 : s.s) {}
+};
+
+constexpr S s (7), s2 (7);
+constexpr T<S> t (s), t2 (s, s2);
+static_assert (t.t == 7, "Error");
+static_assert (t2.t == 7, "Error");
+
+struct U
+{
+  int a : 1; int s : 1;
+  constexpr U (int x, int y) : a (x), s (y) {}
+};
+
+constexpr U u (0, -1), u2 (-1, -1);
+constexpr T<U> t3 (u), t4 (u, u2);
+static_assert (t3.t == 0, "Error");
+static_assert (t4.t == -1, "Error");

Jakub


[PATCH] Fix a typo in i386 host_detect_local_cpu (PR target/49128)

2011-05-25 Thread Jakub Jelinek
Hi!

Committed as obvious.

2011-05-25  Jakub Jelinek  ja...@redhat.com

PR target/49128
* config/i386/driver-i386.c (host_detect_local_cpu): Fix a typo.

--- gcc/config/i386/driver-i386.c   (revision 174170)
+++ gcc/config/i386/driver-i386.c   (revision 174171)
@@ -696,7 +696,7 @@ const char *host_detect_local_cpu (int a
   const char *bmi = has_bmi ? " -mbmi" : " -mno-bmi";
   const char *tbm = has_tbm ? " -mtbm" : " -mno-tbm";
   const char *avx = has_avx ? " -mavx" : " -mno-avx";
-  const char *sse4_2 = has_sse4_2 ? " -msse4.2" : " -mno-msse4.2";
+  const char *sse4_2 = has_sse4_2 ? " -msse4.2" : " -mno-sse4.2";
   const char *sse4_1 = has_sse4_1 ? " -msse4.1" : " -mno-sse4.1";
 
   options = concat (options, cx16, sahf, movbe, ase, pclmul,

Jakub


Fix PR 49014

2011-05-25 Thread Andrey Belevantsev

Hello,

This patch fixes PR 49014, yet another case of an insn with the wrong
reservation.  Approved by Uros in the PR audit trail, bootstrapped and
regtested on x86_64-linux and committed to trunk.


Vlad, Bernd, I wonder if we can avoid having recog_memoized >= 0 insns that
do not have proper DFA reservations (that is, they do not change the DFA
state).  I see that existing practice allows this, as shown by Bernd's patch
for PR 48403, i.e. such insns do not count against issue_rate.  I would be
happy to fix sel-sched in the same way.  However, both sel-sched ICEs shown
by PRs 48143 and 49014 really uncover latent bugs in the backend.
So, is it possible to stop having such insns if scheduling is desired, or
otherwise to distinguish the insns that wrongly lack a proper DFA reservation?


Yours, Andrey

Index: gcc/ChangeLog
===
*** gcc/ChangeLog   (revision 174171)
--- gcc/ChangeLog   (working copy)
***
*** 1,3 
--- 1,8 
+ 2011-05-25  Andrey Belevantsev  a...@ispras.ru
+
+   PR rtl-optimization/49014
+   * config/i386/athlon.md (athlon_ssecomi): Change type to ssecomi.
+
  2011-05-25  Jakub Jelinek  ja...@redhat.com

PR target/49128
Index: gcc/config/i386/athlon.md
===
*** gcc/config/i386/athlon.md   (revision 174171)
--- gcc/config/i386/athlon.md   (working copy)
*** (define_insn_reservation athlon_ssecomi
*** 798,804 
 "athlon-direct,athlon-fploadk8,athlon-fadd")
  (define_insn_reservation "athlon_ssecomi" 4
 (and (eq_attr "cpu" "athlon,k8,generic64")
! (eq_attr "type" "ssecmp"))
 "athlon-vector,athlon-fpsched,athlon-fadd")
  (define_insn_reservation "athlon_ssecomi_amdfam10" 3
 (and (eq_attr "cpu" "amdfam10")
--- 798,804 
 "athlon-direct,athlon-fploadk8,athlon-fadd")
  (define_insn_reservation "athlon_ssecomi" 4
 (and (eq_attr "cpu" "athlon,k8,generic64")
! (eq_attr "type" "ssecomi"))
 "athlon-vector,athlon-fpsched,athlon-fadd")
  (define_insn_reservation "athlon_ssecomi_amdfam10" 3
 (and (eq_attr "cpu" "amdfam10")


Re: [patch][simplify-rtx] Fix 16-bit -> 64-bit multiply and accumulate

2011-05-25 Thread Andrew Stubbs

On 24/05/11 20:35, Joseph S. Myers wrote:

On Tue, 24 May 2011, Andrew Stubbs wrote:


I've created this new, simpler patch that converts

   (extend (mult a b))

into

   (mult (extend a) (extend b))

regardless of what 'a' and 'b' might be. (These are then simplified and
superfluous extends removed, of course.)


Are there some missing conditions here?  The two aren't equivalent in
general - (extend:SI (mult:HI a b)) multiplies the HImode values in HImode
(with modulo arithmetic on overflow) before extending the possibly wrapped
result to SImode.  You'd need a and b themselves to be extended from
narrower modes in such a way that if you interpret the extended values in
the signedness of the outer extension, the result of the multiplication is
exactly representable in the mode of the multiplication.  (For example, if
both values are extended from QImode, and all extensions have the same
signedness, that would be OK.  There are cases that are OK where not all
extensions have the same signedness, e.g. (sign_extend:DI (mult:SI a b))
where a and b are zero-extended from HImode or QImode, at least one from
QImode, though there the outer extension is equivalent to a
zero-extension.)
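
A concrete counterexample may make the objection easier to see. The sketch below models the two RTL forms with ordinary C integers; the function names are hypothetical, and HImode and SImode are assumed to correspond to 16-bit and 32-bit values.

```c
#include <stdint.h>

/* (sign_extend:SI (mult:HI a b)): multiply in 16 bits, wrapping on
   overflow, then widen the possibly wrapped result.  */
static int32_t
extend_of_mult (int16_t a, int16_t b)
{
  /* Do the arithmetic on unsigned values so that the wrap-around is
     well defined, then keep the low 16 bits.  */
  uint16_t wrapped = (uint16_t) ((uint32_t) (uint16_t) a * (uint16_t) b);
  /* Reinterpret the low 16 bits as a signed value.  */
  return wrapped >= 0x8000 ? (int32_t) wrapped - 0x10000 : (int32_t) wrapped;
}

/* (mult:SI (sign_extend:SI a) (sign_extend:SI b)): widen first, then
   multiply in 32 bits, which cannot overflow for 16-bit inputs.  */
static int32_t
mult_of_extends (int16_t a, int16_t b)
{
  return (int32_t) a * (int32_t) b;
}
```

With a = b = 100 both forms give 10000, but with a = b = 300 the first gives 24464 (90000 modulo 65536) while the second gives 90000, so the rewrite is only safe when the operands are known to be narrow enough.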


So, you're saying that promoting a regular multiply to a widening 
multiply isn't a valid transformation anyway? I suppose that does make 
sense. I knew something was too easy.


OK, I'll go try again. :)

Andrew


Re: [testsuite] remove XFAIL for all but ia64 for g++.dg/tree-ssa/pr43411.C

2011-05-25 Thread Rainer Orth
Janis Johnson jani...@codesourcery.com writes:

 Archived test results for 4.7.0 for most processors with C++ results have:

 XPASS: g++.dg/tree-ssa/pr43411.C scan-tree-dump-not optimized OBJ_TYPE_REF

 The only failures I could find were for ia64-linux and ia64-hpux.  This
 patch changes the xfail so it only applies to ia64-*-*.  OK for trunk?

Richard rejected a similar patch:

http://gcc.gnu.org/ml/gcc-patches/2011-05/msg00054.html

Perhaps Jan can suggest the correct approach?

Rainer

-- 
-
Rainer Orth, Center for Biotechnology, Bielefeld University


Re: [PING][PATCH 13/18] move TS_EXP to be a substructure of TS_TYPED

2011-05-25 Thread Richard Guenther
On Tue, May 24, 2011 at 7:34 PM, Nathan Froyd froy...@codesourcery.com wrote:
 On Mon, May 23, 2011 at 04:58:06PM +0200, Richard Guenther wrote:
 On Mon, May 23, 2011 at 4:18 PM, Nathan Froyd froy...@codesourcery.com wrote:
  On 05/17/2011 11:31 AM, Nathan Froyd wrote:
  On 05/10/2011 04:18 PM, Nathan Froyd wrote:
  On 03/10/2011 11:23 PM, Nathan Froyd wrote:
  After all that, we can finally make tree_exp inherit from typed_tree.
  Quite anticlimactic.
 
  Ping.  http://gcc.gnu.org/ml/gcc-patches/2011-03/msg00559.html
 
  Ping^2.
 
  Ping^3 to put it in Richi's INBOX. ;)

 Ok ;)

 Please check for sizeof () uses of the structs you touched so far.
 ISTR a bug about fold-checking.

 That doesn't apply here, because I'm not renaming the struct.  But I did
 find some problems with LTO when I was rebootstrapping prior to
 committing; not sure how I missed these the first time through, maybe I
 was mistakenly compiling without LTO support.  Since we now have things
 being dumped to LTO that don't have TREE_CHAIN, we need to take care to
 not access TREE_CHAIN on such things, which the patch below does.
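
The guard described here follows a general pattern: consult a per-kind table before touching a field that only some node layouts have. A minimal sketch, assuming hypothetical node kinds and struct names rather than GCC's real tree definitions:

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical node kinds; only KIND_LIST carries a chain pointer.  */
enum node_kind { KIND_EXPR, KIND_LIST, KIND_COUNT };

/* Base layout: every node has a kind and a type.  */
struct node_typed
{
  enum node_kind kind;
  struct node_typed *type;
};

/* Extended layout: the base plus a chain pointer.  */
struct node_common
{
  struct node_typed typed;
  struct node_typed *chain;
};

/* Rough analogue of a "contains struct" table: which kinds use the
   extended layout.  */
static const bool has_chain[KIND_COUNT] = {
  [KIND_EXPR] = false,
  [KIND_LIST] = true,
};

/* Return the chain of T, or NULL when T's kind has no chain field.
   Reading the field without this check would access memory past the
   end of a smaller node.  */
static struct node_typed *
node_chain_or_null (struct node_typed *t)
{
  if (!has_chain[t->kind])
    return NULL;
  return ((struct node_common *) t)->chain;
}
```

The `CODE_CONTAINS_STRUCT (code, TS_COMMON)` test in the patch plays the role of the `has_chain` lookup here.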

 Tested on x86_64-unknown-linux-gnu.  OK to commit?

Ok.  Please see if you can adjust the lto-streamer-in/out.c machinery
to consistently handle the new TS_ classes.

Thanks,
Richard.

 -Nathan

 gcc/
        * tree.h (struct tree_exp): Inherit from struct tree_typed.
        * tree.c (initialize_tree_contains_struct): Mark TS_EXP as TS_TYPED
        instead of TS_COMMON.

 gcc/lto/
        * lto.c (lto_ft_typed): New function.
        (lto_ft_common): Call it.
        (lto_ft_constructor): Likewise.
        (lto_ft_expr): Likewise.
        (lto_fixup_prevailing_decls): Check for TS_COMMON before accessing
        TREE_CHAIN.

 diff --git a/gcc/lto/lto.c b/gcc/lto/lto.c
 index d64ba18..1067b51 100644
 --- a/gcc/lto/lto.c
 +++ b/gcc/lto/lto.c
 @@ -254,14 +254,20 @@ remember_with_vars (tree t)

  static void lto_fixup_types (tree);

 -/* Fix up fields of a tree_common T.  */
 +/* Fix up fields of a tree_typed T.  */

  static void
 -lto_ft_common (tree t)
 +lto_ft_typed (tree t)
  {
 -  /* Fixup our type.  */
   LTO_FIXUP_TREE (TREE_TYPE (t));
 +}
 +
 +/* Fix up fields of a tree_common T.  */

 +static void
 +lto_ft_common (tree t)
 +{
 +  lto_ft_typed (t);
   LTO_FIXUP_TREE (TREE_CHAIN (t));
  }

 @@ -398,7 +404,7 @@ lto_ft_constructor (tree t)
   unsigned HOST_WIDE_INT idx;
   constructor_elt *ce;

 -  LTO_FIXUP_TREE (TREE_TYPE (t));
 +  lto_ft_typed (t);

   for (idx = 0;
        VEC_iterate(constructor_elt, CONSTRUCTOR_ELTS (t), idx, ce);
 @@ -415,7 +421,7 @@ static void
  lto_ft_expr (tree t)
  {
   int i;
 -  lto_ft_common (t);
 +  lto_ft_typed (t);
    for (i = TREE_OPERAND_LENGTH (t) - 1; i >= 0; --i)
     LTO_FIXUP_TREE (TREE_OPERAND (t, i));
  }
 @@ -2029,7 +2035,8 @@ lto_fixup_prevailing_decls (tree t)
  {
   enum tree_code code = TREE_CODE (t);
   LTO_NO_PREVAIL (TREE_TYPE (t));
 -  LTO_NO_PREVAIL (TREE_CHAIN (t));
 +  if (CODE_CONTAINS_STRUCT (code, TS_COMMON))
 +    LTO_NO_PREVAIL (TREE_CHAIN (t));
   if (DECL_P (t))
     {
       LTO_NO_PREVAIL (DECL_NAME (t));
 diff --git a/gcc/tree.c b/gcc/tree.c
 index 3357d84..9cc99fe 100644
 --- a/gcc/tree.c
 +++ b/gcc/tree.c
 @@ -380,6 +380,7 @@ initialize_tree_contains_struct (void)
        case TS_COMPLEX:
        case TS_SSA_NAME:
        case TS_CONSTRUCTOR:
 +       case TS_EXP:
          MARK_TS_TYPED (code);
          break;

 @@ -388,7 +389,6 @@ initialize_tree_contains_struct (void)
        case TS_TYPE_COMMON:
        case TS_LIST:
        case TS_VEC:
 -       case TS_EXP:
        case TS_BLOCK:
        case TS_BINFO:
        case TS_STATEMENT_LIST:
 diff --git a/gcc/tree.h b/gcc/tree.h
 index 805fe06..142237f 100644
 --- a/gcc/tree.h
 +++ b/gcc/tree.h
 @@ -1917,7 +1917,7 @@ enum omp_clause_default_kind
    (OMP_CLAUSE_SUBCODE_CHECK (NODE, OMP_CLAUSE_DEFAULT)->omp_clause.subcode.default_kind)

  struct GTY(()) tree_exp {
 -  struct tree_common common;
 +  struct tree_typed typed;
   location_t locus;
   tree block;
    tree GTY ((special ("tree_exp"),



Re: [PATCH] Expand pow(x,n) into multiplies in cse_sincos pass (PR46728, patch 2)

2011-05-25 Thread Richard Guenther
On Tue, May 24, 2011 at 10:35 PM, William J. Schmidt
wschm...@linux.vnet.ibm.com wrote:
 Here's a small patch to expand pow(x,n) for integer n using the
 powi(x,n) logic in the cse_sincos pass.  OK for trunk?

 For the next patch, I'll plan on expanding pow(x,n) for n in
 {0.5, 0.25, 0.75, 1./3., 1./6.}.  This logic will be added to
 gimple_expand_builtin_pow.
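
As a rough illustration of the idea (not the GCC implementation): check that a floating-point exponent is exactly an integer, then replace the pow call with multiplies. The helper names are hypothetical, and plain repeated squaring is used here as a stand-in; GCC's powi logic picks its multiplication sequence differently.

```c
#include <stdbool.h>

/* Return true and set *N if exponent C is exactly an integer of
   modest size.  Loosely mirrors the round trip through an integer
   that the patch uses to test for an integral exponent.  */
static bool
exponent_is_integer (double c, long *n)
{
  /* The negated range test also rejects NaN.  */
  if (!(c >= -1e9 && c <= 1e9))
    return false;
  long candidate = (long) c;
  if ((double) candidate != c)
    return false;
  *n = candidate;
  return true;
}

/* Compute x**n with O(log |n|) multiplies by repeated squaring.  A
   negative exponent is handled by taking the reciprocal at the end.  */
static double
powi_by_squaring (double x, long n)
{
  unsigned long e = n < 0 ? -(unsigned long) n : (unsigned long) n;
  double result = 1.0;
  while (e)
    {
      if (e & 1)
        result *= x;
      x *= x;
      e >>= 1;
    }
  return n < 0 ? 1.0 / result : result;
}
```

For example, an exponent of 3.0 passes the check and expands to multiplies, while 0.5 does not and is left for the follow-up patch mentioned above.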

Ok.

Thanks,
Richard.

 Bill


 2011-05-24  Bill Schmidt  wschm...@linux.vnet.ibm.com
        PR tree-optimization/46728
        * tree-ssa-math-opts.c (gimple_expand_builtin_pow): New.
        (execute_cse_sincos): Add switch case for BUILT_IN_POW.

 Index: gcc/tree-ssa-math-opts.c
 ===
 --- gcc/tree-ssa-math-opts.c    (revision 174129)
 +++ gcc/tree-ssa-math-opts.c    (working copy)
 @@ -1024,6 +1024,39 @@ gimple_expand_builtin_powi (gimple_stmt_iterator *
   return NULL_TREE;
  }

 +/* ARG0 and ARG1 are the two arguments to a pow builtin call in GSI
 +   with location info LOC.  If possible, create an equivalent and
 +   less expensive sequence of statements prior to GSI, and return an
 +   expression holding the result.  */
 +
 +static tree
 +gimple_expand_builtin_pow (gimple_stmt_iterator *gsi, location_t loc,
 +                          tree arg0, tree arg1)
 +{
 +  REAL_VALUE_TYPE c, cint;
 +  HOST_WIDE_INT n;
 +
 +  /* If the exponent isn't a constant, there's nothing of interest
 +     to be done.  */
 +  if (TREE_CODE (arg1) != REAL_CST)
 +    return NULL_TREE;
 +
 +  /* If the exponent is equivalent to an integer, expand it into
 +     multiplies when profitable.  */
 +  c = TREE_REAL_CST (arg1);
 +  n = real_to_integer (&c);
 +  real_from_integer (&cint, VOIDmode, n, n < 0 ? -1 : 0, 0);
 +
 +  if (real_identical (&c, &cint)
 +      && ((n >= -1 && n <= 2)
 +         || (flag_unsafe_math_optimizations
 +             && optimize_insn_for_speed_p ()
 +             && powi_cost (n) <= POWI_MAX_MULTS)))
 +    return gimple_expand_builtin_powi (gsi, loc, arg0, n);
 +
 +  return NULL_TREE;
 +}
 +
  /* Go through all calls to sin, cos and cexpi and call execute_cse_sincos_1
    on the SSA_NAME argument of each of them.  Also expand powi(x,n) into
    an optimal number of multiplies, when n is a constant.  */
 @@ -1065,6 +1098,23 @@ execute_cse_sincos (void)
                    cfg_changed |= execute_cse_sincos_1 (arg);
                  break;

 +               CASE_FLT_FN (BUILT_IN_POW):
 +                 arg0 = gimple_call_arg (stmt, 0);
 +                 arg1 = gimple_call_arg (stmt, 1);
 +
 +                 loc = gimple_location (stmt);
 +                 result = gimple_expand_builtin_pow (gsi, loc, arg0, arg1);
 +
 +                 if (result)
 +                   {
 +                     tree lhs = gimple_get_lhs (stmt);
 +                     gimple new_stmt = gimple_build_assign (lhs, result);
 +                     gimple_set_location (new_stmt, loc);
 +                     unlink_stmt_vdef (stmt);
 +                     gsi_replace (gsi, new_stmt, true);
 +                   }
 +                 break;
 +
                CASE_FLT_FN (BUILT_IN_POWI):
                  arg0 = gimple_call_arg (stmt, 0);
                  arg1 = gimple_call_arg (stmt, 1);





Re: C6X port 4/11: Backtracking scheduler

2011-05-25 Thread Hans-Peter Nilsson
On Tue, 10 May 2011, Bernd Schmidt wrote:
 On C6X, every jump instruction has 5 delay slots which can be filled
 with normally scheduled instructions. With an issue width of 8
 insns/cycle, this means that up to 40 insns can be issued after the jump
 insn before the jump's side-effect takes place. I didn't particularly
 feel like using reorg.c to deal with this,

No kidding... multi-delay-slot bugs just waiting for you...

 hence these scheduler patches.

THANK YOU for these first steps!

brgds, H-P



Re: Patch ping #2

2011-05-25 Thread Richard Guenther
On Wed, May 25, 2011 at 12:00 AM, Eric Botcazou ebotca...@adacore.com wrote:
 Yes, I mean when -fstack-usage is enabled.  Thus, make
 -fstack-usage -Wframe-larger-than behave the same as
 -fstack-usage -Wstack-usage.  Or do you say that -Wstack-usage
 doesn't require -fstack-usage?

 Yes, -Wstack-usage doesn't require -fstack-usage.

 If it doesn't then I hope -Wstack-usage does not have effects on code
 generation?

 Neither -fstack-usage nor -Wstack-usage has any effect on code generation.
 The former generates a .su file and the latter issues a warning.

 And if not then why can't -Wframe-larger-than just be more precise on some
 targets?

 -Wframe-larger-than is documented to work on any targets and to be imprecise.
 So we would need to list the targets for which it is precise (or the targets
 for which it isn't precise) and maintain it.  By contrast, if -Wstack-usage
 returns something, then the answer is always precise.  Moreover I think that
 the common name is a big advantage.

Thanks for explaining.  The patch is ok.

Thanks,
Richard.

 --
 Eric Botcazou



Faster streaming of enums

2011-05-25 Thread Jan Hubicka
Hi,
after fixing the 1-byte I/O function call and most of the hash table overhead,
the functions that handle ulebs and slebs now show up at the top of the
profile.  We use them in many cases where we know the value range of the
operand will fit in 1 byte, in particular to handle enums.
This is also dangerous, since we generally assume enums to be in their value
range.

This patch adds I/O routines for enums and for integers in a known range;
they should inline well and add some sanity checking.
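
A minimal sketch of the range-based encoding, under the assumption that both writer and reader know the same compile-time `[min, max]` bounds. The function names and the byte-buffer interface are hypothetical; the real patch writes to and reads from LTO stream objects.

```c
#include <stddef.h>
#include <stdint.h>

/* Number of bytes used to store an offset in 0..RANGE.  The thresholds
   are chosen for illustration; a writer and reader only need to agree.  */
static size_t
bytes_for_range (uint32_t range)
{
  return 1 + (range >= 0xff) + (range >= 0xffff) + (range >= 0xffffff);
}

/* Store VAL, which must lie in [MIN, MAX], at BUF as a little-endian
   offset from MIN.  Returns the number of bytes written, or 0 if the
   arguments fail the sanity checks.  */
static size_t
encode_in_range (unsigned char *buf, int32_t min, int32_t max, int32_t val)
{
  int64_t wide = (int64_t) max - min;
  if (val < min || val > max || wide <= 0 || wide >= 0x7fffffff)
    return 0;
  uint32_t off = (uint32_t) ((int64_t) val - min);
  size_t n = bytes_for_range ((uint32_t) wide);
  for (size_t i = 0; i < n; i++)
    buf[i] = (off >> (8 * i)) & 0xff;
  return n;
}

/* Read a value written by encode_in_range with the same bounds.
   Returns 1 and sets *VAL on success, 0 if the stored offset falls
   outside the range (the sanity check on input).  */
static int
decode_in_range (const unsigned char *buf, int32_t min, int32_t max,
                 int32_t *val)
{
  int64_t wide = (int64_t) max - min;
  if (wide <= 0 || wide >= 0x7fffffff)
    return 0;
  size_t n = bytes_for_range ((uint32_t) wide);
  uint32_t off = 0;
  for (size_t i = 0; i < n; i++)
    off |= (uint32_t) buf[i] << (8 * i);
  if (off > (uint32_t) wide)
    return 0;
  *val = (int32_t) ((int64_t) min + off);
  return 1;
}
```

Compared with a variable-length uleb128, the byte count here is fixed by the bounds, so an enum with fewer than 255 values always costs exactly one byte and an out-of-range value is caught when read back.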

I converted only the tree streamer tags, but if this is accepted, I will convert more.

Bootstrapped/regtested x86_64-linux, OK?

* lto-streamer-out.c (output_record_start): Use lto_output_enum
(lto_output_tree): Use output_record_start.
* lto-streamer-in.c (input_record_start): Use lto_input_enum
(lto_get_pickled_tree): Use input_record_start.
* lto-section-in.c (lto_section_overrun): Turn into fatal error.
(lto_value_range_error): New function.
* lto-streamer.h (lto_value_range_error): Declare.
(lto_output_int_in_range, lto_input_int_in_range): New functions.
(lto_output_enum, lto_input_enum): New macros.
Index: lto-streamer-out.c
===
*** lto-streamer-out.c  (revision 174175)
--- lto-streamer-out.c  (working copy)
*** output_sleb128 (struct output_block *ob,
*** 270,281 
  
  /* Output the start of a record with TAG to output block OB.  */
  
! static void
  output_record_start (struct output_block *ob, enum LTO_tags tag)
  {
!   /* Make sure TAG fits inside an unsigned int.  */
!   gcc_assert (tag == (enum LTO_tags) (unsigned) tag);
!   output_uleb128 (ob, tag);
  }
  
  
--- 270,279 
  
  /* Output the start of a record with TAG to output block OB.  */
  
! static inline void
  output_record_start (struct output_block *ob, enum LTO_tags tag)
  {
!   lto_output_enum (ob->main_stream, LTO_tags, LTO_NUM_TAGS, tag);
  }
  
  
*** lto_output_tree (struct output_block *ob
*** 1401,1407 
 will instantiate two different nodes for the same object.  */
output_record_start (ob, LTO_tree_pickle_reference);
output_uleb128 (ob, ix);
!   output_uleb128 (ob, lto_tree_code_to_tag (TREE_CODE (expr)));
  }
else if (lto_stream_as_builtin_p (expr))
  {
--- 1399,1405 
 will instantiate two different nodes for the same object.  */
output_record_start (ob, LTO_tree_pickle_reference);
output_uleb128 (ob, ix);
!   output_record_start (ob, lto_tree_code_to_tag (TREE_CODE (expr)));
  }
else if (lto_stream_as_builtin_p (expr))
  {
Index: lto-streamer-in.c
===
*** lto-streamer-in.c   (revision 174175)
--- lto-streamer-in.c   (working copy)
*** lto_input_string (struct data_in *data_i
*** 231,241 
  
  /* Return the next tag in the input block IB.  */
  
! static enum LTO_tags
  input_record_start (struct lto_input_block *ib)
  {
!   enum LTO_tags tag = (enum LTO_tags) lto_input_uleb128 (ib);
!   return tag;
  }
  
  
--- 231,240 
  
  /* Return the next tag in the input block IB.  */
  
! static inline enum LTO_tags
  input_record_start (struct lto_input_block *ib)
  {
!   return lto_input_enum (ib, LTO_tags, LTO_NUM_TAGS);
  }
  
  
*** lto_get_pickled_tree (struct lto_input_b
*** 2558,2564 
enum LTO_tags expected_tag;
  
ix = lto_input_uleb128 (ib);
!   expected_tag = (enum LTO_tags) lto_input_uleb128 (ib);
  
    result = lto_streamer_cache_get (data_in->reader_cache, ix);
gcc_assert (result
--- 2557,2563 
enum LTO_tags expected_tag;
  
ix = lto_input_uleb128 (ib);
!   expected_tag = input_record_start (ib);
  
    result = lto_streamer_cache_get (data_in->reader_cache, ix);
gcc_assert (result
Index: lto-section-in.c
===
*** lto-section-in.c(revision 174175)
--- lto-section-in.c(working copy)
*** lto_get_function_in_decl_state (struct l
*** 483,488 
  void
  lto_section_overrun (struct lto_input_block *ib)
  {
!   internal_error ("bytecode stream: trying to read %d bytes "
!                   "after the end of the input buffer", ib->p - ib->len);
  }
--- 483,498 
  void
  lto_section_overrun (struct lto_input_block *ib)
  {
!   fatal_error ("bytecode stream: trying to read %d bytes "
!                "after the end of the input buffer", ib->p - ib->len);
! }
! 
! /* Report out of range value.  */
! 
! void
! lto_value_range_error (const char *purpose, HOST_WIDE_INT val,
!  HOST_WIDE_INT min, HOST_WIDE_INT max)
! {
!   fatal_error ("%s out of range: Range is %i to %i, value is %i",
!                purpose, (int)min, (int)max, (int)val);
  }
Index: lto-streamer.h
===
*** lto-streamer.h  (revision 174175)
--- lto-streamer.h  (working copy)
*** extern int 

[v3] libstdc++/49141

2011-05-25 Thread Paolo Carlini

Hi,

committed to mainline and 4_6-branch.

Thanks,
Paolo.


2011-05-24  Paolo Carlini  paolo.carl...@oracle.com

PR libstdc++/49141
* testsuite/26_numerics/complex/cons/48760.cc: Use dg-require-c-std.
* testsuite/26_numerics/complex/cons/48760_c++0x.cc: Likewise.
* testsuite/26_numerics/headers/cmath/19322.cc: Likewise.

Index: testsuite/26_numerics/complex/cons/48760.cc
===
--- testsuite/26_numerics/complex/cons/48760.cc (revision 174112)
+++ testsuite/26_numerics/complex/cons/48760.cc (working copy)
@@ -1,3 +1,5 @@
+// { dg-require-c-std "" }
+
 // Copyright (C) 2011 Free Software Foundation, Inc.
 //
 // This file is part of the GNU ISO C++ Library.  This library is free
Index: testsuite/26_numerics/complex/cons/48760_c++0x.cc
===
--- testsuite/26_numerics/complex/cons/48760_c++0x.cc   (revision 174112)
+++ testsuite/26_numerics/complex/cons/48760_c++0x.cc   (working copy)
@@ -1,4 +1,5 @@
 // { dg-options "-std=gnu++0x" }
+// { dg-require-c-std "" }
 
 // Copyright (C) 2011 Free Software Foundation, Inc.
 //
Index: testsuite/26_numerics/headers/cmath/19322.cc
===
--- testsuite/26_numerics/headers/cmath/19322.cc(revision 174112)
+++ testsuite/26_numerics/headers/cmath/19322.cc(working copy)
@@ -1,4 +1,6 @@
-// Copyright (C) 2005, 2009 Free Software Foundation, Inc.
+// { dg-require-c-std "" }
+
+// Copyright (C) 2005, 2009, 2010, 2011 Free Software Foundation, Inc.
 //
 // This file is part of the GNU ISO C++ Library.  This library is free
 // software; you can redistribute it and/or modify it under the
@@ -15,11 +17,9 @@
 // with this library; see the file COPYING3.  If not see
 // <http://www.gnu.org/licenses/>.
 
-
 #include <cmath>
 #include <testsuite_hooks.h>
 
-#if _GLIBCXX_USE_C99_MATH
 // libstdc++/19322
 void test01()
 {
@@ -27,12 +27,9 @@
 
   VERIFY( !std::isnan(3.0) );
 }
-#endif
 
 int main()
 {
-#if _GLIBCXX_USE_C99_MATH
   test01();
-#endif
   return 0;
 }


Re: C6X port 9/11: Allow defining attributes in terms of another

2011-05-25 Thread Bernd Schmidt
On 05/25/2011 08:56 AM, Hans-Peter Nilsson wrote:
 On Tue, 10 May 2011, Bernd Schmidt wrote:
 
 I've found it useful to use a construct such as the following:

 (define_attr "units64"
   "unknown,d,d_addr,l,m,s,dl,ds,dls,ls"
   (const_string "unknown"))

 (define_attr "units64p"
   "unknown,d,d_addr,l,m,s,dl,ds,dls,ls"
   (attr "units64"))

 to define one attribute in terms of another by default,
 
 So it's just the units64p default value taken from the units64
 default value or units64p gets its default value from the final
 units64 value?

units64p has the final value of units64, unless an insn explicitly gives
it a different value. This is because C64X+ is really very similar to
C64X in most respects. We then select which of the various units
definitions to use for a given CPU:

(define_attr "units"
  "unknown,d,d_addr,l,m,s,dl,ds,dls,ls"
  (cond [(eq_attr "cpu" "c62x")
   (attr "units62")
 (eq_attr "cpu" "c67x")
   (attr "units67")
 (eq_attr "cpu" "c67xp")
   (attr "units67p")
 (eq_attr "cpu" "c64x")
   (attr "units64")
 (eq_attr "cpu" "c64xp")
   (attr "units64p")
 (eq_attr "cpu" "c674x")
   (attr "units674")
]
(const_string "unknown")))
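
The defaulting rule described above (units64p takes the final value of units64 unless an insn sets it explicitly) can be modelled in a few lines of C. This is only a sketch of the semantics, not of what genattrtab generates; the enum, the struct and the `UNITS_UNSET` marker are hypothetical.

```c
/* Hypothetical attribute values; UNITS_UNSET means "the insn pattern
   did not give this attribute a value".  */
enum units { UNITS_UNSET, UNITS_UNKNOWN, UNITS_D, UNITS_L, UNITS_S };

/* The explicit settings one insn pattern provides.  */
struct insn_attrs
{
  enum units units64;
  enum units units64p;
};

/* units64 falls back to the constant default "unknown".  */
static enum units
get_units64 (const struct insn_attrs *a)
{
  return a->units64 != UNITS_UNSET ? a->units64 : UNITS_UNKNOWN;
}

/* units64p falls back to the final value of units64, not to the
   default of units64.  */
static enum units
get_units64p (const struct insn_attrs *a)
{
  return a->units64p != UNITS_UNSET ? a->units64p : get_units64 (a);
}
```

An insn that sets only units64 to `d` therefore reports `d` for both attributes, while one that also sets units64p keeps its own override.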

 allowing
 individual insn patterns to override the definition of units64p where
 necessary. This patch adds support for this in genattrtab.
 
 I'm not sure I get it, and I think I would be helped by seeing
 the documentation update. ;)

I'm not sure where you're looking for added documentation for this
patch. It just generalizes the define_attr mechanism a little to allow
one more kind of expression.


Bernd


Re: Faster streaming of enums

2011-05-25 Thread Richard Guenther
On Wed, May 25, 2011 at 11:45 AM, Jan Hubicka hubi...@ucw.cz wrote:
 Hi,
 after fixing 1 byte i/o function call and most of hash table overhead,
 functions to handle ulebs and slebs shows top in profile.  We use them in
 many cases where we know value range of the operand will fit in 1 byte. In
 particular to handle enums.
 This is also dangerous since we generally assume enums to be in their value
 range.

 This patch adds i/o bits for enums and integers in range that should inline
 well and add some sanity checking.

 I converted only tree streamer tags, but if accepted, I will convert more.

 Bootstrapped/regtested x86_64-linux, OK?

        * lto-streamer-out.c (output_record_start): Use lto_output_enum
        (lto_output_tree): Use output_record_start.
        * lto-streamer-in.c (input_record_start): Use lto_input_enum
        (lto_get_pickled_tree): Use input_record_start.
        * lto-section-in.c (lto_section_overrun): Turn into fatal error.
        (lto_value_range_error): New function.
        * lto-streamer.h (lto_value_range_error): Declare.
        (lto_output_int_in_range, lto_input_int_in_range): New functions.
        (lto_output_enum, lto_input_enum): New macros.
 Index: lto-streamer-out.c
 ===
 *** lto-streamer-out.c  (revision 174175)
 --- lto-streamer-out.c  (working copy)
 *** output_sleb128 (struct output_block *ob,
 *** 270,281 

  /* Output the start of a record with TAG to output block OB.  */

 ! static void
  output_record_start (struct output_block *ob, enum LTO_tags tag)
  {
 !   /* Make sure TAG fits inside an unsigned int.  */
 !   gcc_assert (tag == (enum LTO_tags) (unsigned) tag);
 !   output_uleb128 (ob, tag);
  }


 --- 270,279 

  /* Output the start of a record with TAG to output block OB.  */

 ! static inline void
  output_record_start (struct output_block *ob, enum LTO_tags tag)
  {
 !   lto_output_enum (ob->main_stream, LTO_tags, LTO_NUM_TAGS, tag);
  }


 *** lto_output_tree (struct output_block *ob
 *** 1401,1407 
         will instantiate two different nodes for the same object.  */
        output_record_start (ob, LTO_tree_pickle_reference);
        output_uleb128 (ob, ix);
 !       output_uleb128 (ob, lto_tree_code_to_tag (TREE_CODE (expr)));
      }
    else if (lto_stream_as_builtin_p (expr))
      {
 --- 1399,1405 
         will instantiate two different nodes for the same object.  */
        output_record_start (ob, LTO_tree_pickle_reference);
        output_uleb128 (ob, ix);
 !       output_record_start (ob, lto_tree_code_to_tag (TREE_CODE (expr)));

I'd prefer lto_output_enum here as we don't really start a new output
record but just emit something for a sanity check.

      }
    else if (lto_stream_as_builtin_p (expr))
      {
 Index: lto-streamer-in.c
 ===
 *** lto-streamer-in.c   (revision 174175)
 --- lto-streamer-in.c   (working copy)
 *** lto_input_string (struct data_in *data_i
 *** 231,241 

  /* Return the next tag in the input block IB.  */

 ! static enum LTO_tags
  input_record_start (struct lto_input_block *ib)
  {
 !   enum LTO_tags tag = (enum LTO_tags) lto_input_uleb128 (ib);
 !   return tag;
  }


 --- 231,240 

  /* Return the next tag in the input block IB.  */

 ! static inline enum LTO_tags
  input_record_start (struct lto_input_block *ib)
  {
 !   return lto_input_enum (ib, LTO_tags, LTO_NUM_TAGS);
  }


 *** lto_get_pickled_tree (struct lto_input_b
 *** 2558,2564 
    enum LTO_tags expected_tag;

    ix = lto_input_uleb128 (ib);
 !   expected_tag = (enum LTO_tags) lto_input_uleb128 (ib);

    result = lto_streamer_cache_get (data_in->reader_cache, ix);
    gcc_assert (result
 --- 2557,2563 
    enum LTO_tags expected_tag;

    ix = lto_input_uleb128 (ib);
 !   expected_tag = input_record_start (ib);

Likewise use input_enum.

    result = lto_streamer_cache_get (data_in->reader_cache, ix);
    gcc_assert (result
 Index: lto-section-in.c
 ===
 *** lto-section-in.c    (revision 174175)
 --- lto-section-in.c    (working copy)
 *** lto_get_function_in_decl_state (struct l
 *** 483,488 
  void
  lto_section_overrun (struct lto_input_block *ib)
  {
 !   internal_error ("bytecode stream: trying to read %d bytes "
 !                 "after the end of the input buffer", ib->p - ib->len);
  }
 --- 483,498 
  void
  lto_section_overrun (struct lto_input_block *ib)
  {
 !   fatal_error ("bytecode stream: trying to read %d bytes "
 !              "after the end of the input buffer", ib->p - ib->len);
 ! }
 !
 ! /* Report out of range value.  */
 !
 ! void
 ! lto_value_range_error (const char *purpose, HOST_WIDE_INT val,
 !                      HOST_WIDE_INT min, HOST_WIDE_INT max)
 ! {
 !   fatal_error ("%s out of range: Range is %i to %i, value is %i",
 !              

Re: Faster streaming of enums

2011-05-25 Thread Jan Hubicka
  *** lto_output_tree (struct output_block *ob
  *** 1401,1407 
          will instantiate two different nodes for the same object.  */
         output_record_start (ob, LTO_tree_pickle_reference);
         output_uleb128 (ob, ix);
  !       output_uleb128 (ob, lto_tree_code_to_tag (TREE_CODE (expr)));
       }
     else if (lto_stream_as_builtin_p (expr))
       {
  --- 1399,1405 
          will instantiate two different nodes for the same object.  */
         output_record_start (ob, LTO_tree_pickle_reference);
         output_uleb128 (ob, ix);
  !       output_record_start (ob, lto_tree_code_to_tag (TREE_CODE (expr)));
 
 I'd prefer lto_output_enum here as we don't really start a new output
 record but just emit something for a sanity check.

OK, I wondered what is cleaner, will update the patch.
  + /* Output VAL into OBS and verify it is in range MIN...MAX that is 
  supposed
  +    to be compile time constant.
  +    Be host independent, limit range to 31bits.  */
  +
  + static inline void
  + lto_output_int_in_range (struct lto_output_stream *obs,
  +                        HOST_WIDE_INT min,
  +                        HOST_WIDE_INT max,
  +                        HOST_WIDE_INT val)
  + {
  +   HOST_WIDE_INT range = max - min;
  +
  +   gcc_checking_assert (val >= min && val <= max && range > 0
  +                       && range < 0x7fffffff);
  +
  +   val -= min;
  +   lto_output_1_stream (obs, val & 255);
  +   if (range >= 0xff)
  +     lto_output_1_stream (obs, (val >> 8) & 255);
  +   if (range >= 0xffff)
  +     lto_output_1_stream (obs, (val >> 16) & 255);
  +   if (range >= 0xffffff)
  +     lto_output_1_stream (obs, (val >> 24) & 255);
 
 so you didn't want to create a bitpack_pack_int_in_range and use
 bitpacks for enums?  I suppose for some of the remaining cases
 packing them into existing bitpacks would be preferable?

Well, having both is on my TODO list.  Where we don't bitpack enums with
other values (the most common case for enums), this approach produces less
overhead and gives an extra sanity check that the bits unused by the enum
are really 0.

I guess the final API should have both lto_output_enum and lto_bitpack_output_enum.
I don't really care whether the first has the implementation above or just
creates its own bitpack to handle the value.
  + {
  +   HOST_WIDE_INT range = max - min;
  +   HOST_WIDE_INT val = lto_input_1_unsigned (ib);
  +
  +   gcc_checking_assert (range > 0);
 
 The assert doesn't match the one during output.

Hmm, OK, will match.
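
For readers following along, the byte-wise scheme discussed above can be sketched outside of GCC.  This is only an illustration under stated assumptions: a plain byte buffer stands in for the LTO stream, int64_t stands in for HOST_WIDE_INT, and the names encode_int_in_range and decode_int_in_range are made up here, not part of the patch.  The range constants follow the "limit range to 31 bits" comment in the quoted code.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Write VAL, known to lie in [MIN, MAX], to BUF as 1 to 4 little-endian
   bytes of (VAL - MIN).  Returns the number of bytes written.  */
static size_t
encode_int_in_range (unsigned char *buf, int64_t min, int64_t max,
                     int64_t val)
{
  int64_t range = max - min;
  size_t n = 0;

  assert (val >= min && val <= max && range > 0 && range < 0x7fffffff);

  val -= min;
  buf[n++] = val & 255;
  if (range >= 0xff)
    buf[n++] = (val >> 8) & 255;
  if (range >= 0xffff)
    buf[n++] = (val >> 16) & 255;
  if (range >= 0xffffff)
    buf[n++] = (val >> 24) & 255;
  return n;
}

/* Inverse of encode_int_in_range.  The range check mirrors the one on
   the output side; the number of bytes consumed is stored in *USED.  */
static int64_t
decode_int_in_range (const unsigned char *buf, int64_t min, int64_t max,
                     size_t *used)
{
  int64_t range = max - min;
  int64_t val;
  size_t n = 0;

  assert (range > 0 && range < 0x7fffffff);

  val = buf[n++];
  if (range >= 0xff)
    val |= (int64_t) buf[n++] << 8;
  if (range >= 0xffff)
    val |= (int64_t) buf[n++] << 16;
  if (range >= 0xffffff)
    val |= (int64_t) buf[n++] << 24;
  val += min;

  /* Sanity check: a corrupted stream shows up as an out-of-range value.  */
  assert (val >= min && val <= max);
  *used = n;
  return val;
}
```

Because the byte count depends only on the compile-time range, reader and writer stay in step as long as both pass the same MIN and MAX, which is why the two asserts should agree.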

Honza


Re: [patch][simplify-rtx] Fix 16-bit -> 64-bit multiply and accumulate

2011-05-25 Thread Joseph S. Myers
On Wed, 25 May 2011, Andrew Stubbs wrote:

 So, you're saying that promoting a regular multiply to a widening multiply
 isn't a valid transformation anyway? I suppose that does make sense. I knew

In general, yes.  RTL always has modulo semantics (except for division and 
remainder by -1); all optimizations based on undefinedness of overflow (in 
the absence of -fwrapv) happen at tree/GIMPLE level, where signed and 
unsigned types are still distinct.  (So you could promote a regular 
multiply of signed types at GIMPLE level in the absence of 
-fwrapv/-ftrapv, but not at RTL level and not for unsigned types at GIMPLE 
level.)
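
To make the modulo-semantics point concrete, here is a small C sketch (not taken from any patch in this thread) that mimics the two RTL forms with fixed-width unsigned types.  The function names are invented for illustration; uint16_t plays the role of HImode and uint32_t the role of SImode.

```c
#include <stdint.h>

/* Mimics (zero_extend:SI (mult:HI a b)): multiply with 16-bit
   wraparound, then widen the possibly wrapped result.  */
static uint32_t
extend_of_narrow_mult (uint16_t a, uint16_t b)
{
  uint16_t narrow = (uint16_t) ((uint32_t) a * (uint32_t) b);
  return (uint32_t) narrow;
}

/* Mimics (mult:SI (zero_extend:SI a) (zero_extend:SI b)): widen first,
   then multiply in 32 bits.  */
static uint32_t
mult_of_extends (uint16_t a, uint16_t b)
{
  return (uint32_t) a * (uint32_t) b;
}
```

With a = b = 300 the first form gives 24464 (90000 modulo 65536) while the second gives 90000, so the two only agree when the narrow product cannot wrap.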

-- 
Joseph S. Myers
jos...@codesourcery.com


Don't ICE in function_and_variable_visibility on Tru64 UNIX (PR middle-end/49062)

2011-05-25 Thread Rainer Orth
Almost 400 c++ and libstdc++ testcases ICE on Tru64 UNIX since Jan's
patch

2011-05-06  Jan Hubicka  j...@suse.cz

* cgraph.c (cgraph_add_thunk): Create real function node instead
of alias node; finalize it and mark needed/reachable; arrange visibility
to be right and add it into the corresponding same comdat group list.
(dump_cgraph_node): Dump thunks.

as described in the PR.

He provided the following patch in private mail.  I tested it on
alpha-dec-osf5.1b by rebuilding cc1plus and rerunning the g++ and
libstdc++-v3 testsuites: all failures were gone.

Approved in private mail, committed to mainline.

Rainer


2011-05-25  Jan Hubicka  j...@suse.cz

PR middle-end/49062
* ipa.c (function_and_variable_visibility): Only add to same
comdat group list if DECL_ONE_ONLY.

diff --git a/gcc/ipa.c b/gcc/ipa.c
--- a/gcc/ipa.c
+++ b/gcc/ipa.c
@@ -897,7 +897,7 @@ function_and_variable_visibility (bool w
       {
         DECL_COMDAT (node->decl) = 1;
         DECL_COMDAT_GROUP (node->decl) = DECL_COMDAT_GROUP (decl_node->decl);
-        if (!node->same_comdat_group)
+        if (DECL_ONE_ONLY (decl_node->decl) && !node->same_comdat_group)
           {
             node->same_comdat_group = decl_node;
             if (!decl_node->same_comdat_group)


-- 
-
Rainer Orth, Center for Biotechnology, Bielefeld University


Re: PATCH: Add pause intrinsic

2011-05-25 Thread Andrew Haley
On 05/24/2011 07:28 PM, H.J. Lu wrote:

 This patch implements pause intrinsic suggested by Andi.  OK
 for trunk?

What does "full memory barrier" mean here?

+@table @code
+@item void __builtin_ia32_pause (void)
+Generates the @code{pause} machine instruction with full memory barrier.
+@end table

There is a memory clobber, but no barrier instruction AFAICS.  The
doc needs to explain it a bit better.
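
As a rough illustration of the distinction raised here, the following sketch uses the builtin in a spin-wait loop.  It is an assumption-laden example, not code from the patch: cpu_relax and spin_until_set are made-up names, and the ordering guarantee comes from the C11 acquire load, not from the pause instruction, which is only a hint to the CPU.

```c
#include <stdatomic.h>

/* Hint that we are in a spin-wait loop.  On non-x86 GNU targets fall
   back to an empty asm with a memory clobber, which constrains the
   compiler only and emits no instruction.  */
static inline void
cpu_relax (void)
{
#if defined (__GNUC__) && (defined (__i386__) || defined (__x86_64__))
  __builtin_ia32_pause ();
#elif defined (__GNUC__)
  __asm__ __volatile__ ("" ::: "memory");
#endif
}

/* Spin until *FLAG becomes nonzero; returns the number of iterations
   spent waiting.  */
static unsigned long
spin_until_set (atomic_int *flag)
{
  unsigned long spins = 0;

  while (!atomic_load_explicit (flag, memory_order_acquire))
    {
      cpu_relax ();
      spins++;
    }
  return spins;
}
```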

Andrew.


[PATCH] Ignore TYPE_DECLs for canonical type compute in LTO

2011-05-25 Thread Richard Guenther

Just figured that we'd get TYPE_DECLs and FUNCTION_DECLs in aggregate
types.  But we should treat layout-compatible structs as same,
regardless of the above.

LTO profile-bootstrapped and tested on x86_64-unknown-linux-gnu, applied.

Richard.

2011-05-25  Richard Guenther  rguent...@suse.de

* gimple.c (iterative_hash_canonical_type): Skip non-FIELD_DECLs.
(gimple_canonical_types_compatible_p): Likewise.

Index: gcc/gimple.c
===
--- gcc/gimple.c(revision 174118)
+++ gcc/gimple.c(working copy)
@@ -4376,10 +4382,11 @@ iterative_hash_canonical_type (tree type
   tree f;
 
   for (f = TYPE_FIELDS (type), nf = 0; f; f = TREE_CHAIN (f))
-   {
- v = iterative_hash_canonical_type (TREE_TYPE (f), v);
- nf++;
-   }
+   if (TREE_CODE (f) == FIELD_DECL)
+ {
+   v = iterative_hash_canonical_type (TREE_TYPE (f), v);
+   nf++;
+ }
 
   v = iterative_hash_hashval_t (nf, v);
 }
@@ -4688,6 +4695,13 @@ gimple_canonical_types_compatible_p (tre
            f1 && f2;
            f1 = TREE_CHAIN (f1), f2 = TREE_CHAIN (f2))
         {
+          /* Skip non-fields.  */
+          while (f1 && TREE_CODE (f1) != FIELD_DECL)
+            f1 = TREE_CHAIN (f1);
+          while (f2 && TREE_CODE (f2) != FIELD_DECL)
+            f2 = TREE_CHAIN (f2);
+          if (!f1 || !f2)
+            break;
/* The fields must have the same name, offset and type.  */
if (DECL_NONADDRESSABLE_P (f1) != DECL_NONADDRESSABLE_P (f2)
|| !gimple_compare_field_offset (f1, f2)
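
The idea of the second hunk, walking two member chains in lockstep while ignoring anything that is not a field, can be sketched in standalone C.  Everything here is hypothetical: struct decl, its kind tags and fields_compatible_p are stand-ins for GCC's tree chains, and requiring both chains to run out of fields together is an assumption of this sketch rather than something shown in the truncated diff.

```c
#include <stdbool.h>
#include <stddef.h>

enum decl_kind { K_FIELD, K_TYPE, K_FUNCTION };

struct decl
{
  enum decl_kind kind;
  int type_id;          /* stand-in for the member's canonical type */
  struct decl *next;
};

/* Advance past members that do not contribute to the layout.  */
static const struct decl *
next_field (const struct decl *d)
{
  while (d && d->kind != K_FIELD)
    d = d->next;
  return d;
}

/* Return true if the K_FIELD members of A and B match pairwise,
   ignoring any interleaved non-field members.  */
static bool
fields_compatible_p (const struct decl *a, const struct decl *b)
{
  for (a = next_field (a), b = next_field (b);
       a && b;
       a = next_field (a->next), b = next_field (b->next))
    if (a->type_id != b->type_id)
      return false;

  /* Both chains must run out of fields at the same time.  */
  return a == NULL && b == NULL;
}
```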


Re: Faster streaming of enums

2011-05-25 Thread Richard Guenther
On Wed, 25 May 2011, Jan Hubicka wrote:

   *** lto_output_tree (struct output_block *ob
   *** 1401,1407 
           will instantiate two different nodes for the same object.  */
          output_record_start (ob, LTO_tree_pickle_reference);
          output_uleb128 (ob, ix);
   !       output_uleb128 (ob, lto_tree_code_to_tag (TREE_CODE (expr)));
        }
      else if (lto_stream_as_builtin_p (expr))
        {
   --- 1399,1405 
           will instantiate two different nodes for the same object.  */
          output_record_start (ob, LTO_tree_pickle_reference);
          output_uleb128 (ob, ix);
   !       output_record_start (ob, lto_tree_code_to_tag (TREE_CODE (expr)));
  
  I'd prefer lto_output_enum here as we don't really start a new output
  record but just emit something for a sanity check.
 
 OK, I wondered what is cleaner, will update the patch.
   + /* Output VAL into OBS and verify it is in range MIN...MAX that is 
   supposed
   +    to be compile time constant.
   +    Be host independent, limit range to 31bits.  */
   +
   + static inline void
   + lto_output_int_in_range (struct lto_output_stream *obs,
   +                        HOST_WIDE_INT min,
   +                        HOST_WIDE_INT max,
   +                        HOST_WIDE_INT val)
   + {
   +   HOST_WIDE_INT range = max - min;
   +
   +   gcc_checking_assert (val >= min && val <= max && range > 0
   +                       && range < 0x7fffffff);
   +
   +   val -= min;
   +   lto_output_1_stream (obs, val & 255);
   +   if (range >= 0xff)
   +     lto_output_1_stream (obs, (val >> 8) & 255);
   +   if (range >= 0xffff)
   +     lto_output_1_stream (obs, (val >> 16) & 255);
   +   if (range >= 0xffffff)
   +     lto_output_1_stream (obs, (val >> 24) & 255);
  
  so you didn't want to create a bitpack_pack_int_in_range and use
  bitpacks for enums?  I suppose for some of the remaining cases
  packing them into existing bitpacks would be preferable?
 
 Well, in my TODO list is to have both.  Where we don't bitpatck enums with
 other values (that is the most common case of enums) this way we produce less
 overhead and have extra sanity check that the bits unused by enum are really 
 0.
 
 I guess final API should have both lto_output_enum and 
 lto_bitpack_output_enum.
 I don't really care if the first have the implementation above or just 
 creates its
 own bitpack to handle the value.

Ok.

   + {
   +   HOST_WIDE_INT range = max - min;
   +   HOST_WIDE_INT val = lto_input_1_unsigned (ib);
   +
   +   gcc_checking_assert (range > 0);
  
  The assert doesn't match the one during output.
 
 Hmm, OK, will match.

Patch is ok with the changes.

Thanks,
Richard.

Re: PATCH: Add pause intrinsic

2011-05-25 Thread Richard Guenther
On Wed, May 25, 2011 at 12:26 PM, Andrew Haley a...@redhat.com wrote:
 On 05/24/2011 07:28 PM, H.J. Lu wrote:

 This patch implements pause intrinsic suggested by Andi.  OK
 for trunk?

 What does full memory barrier here mean?

 +@table @code
 +@item void __builtin_ia32_pause (void)
 +Generates the @code{pause} machine instruction with full memory barrier.
 +@end table

 There a memory clobber, but no barrier instruction AFAICS.  The
 doc needs to explain it a bit better.

The name also sounds odd to me (reminds me of Fortran PAUSE ...).

Richard.

 Andrew.



4.6: do not divide by 0 on insane profile

2011-05-25 Thread Jan Hubicka
Hi,
cgraph_decide_recursive_inlining may divide by 0 when a profile has been read
but is so small that even a count of 0 is considered possibly hot.  This
happens in particular when the profile was not really read after all.

The problem is fixed differently on mainline.  This patch just plugs the
symptom in the obvious way.

Bootstrapped/regtested x86_64-linux, committed.

Index: ChangeLog
===
--- ChangeLog   (revision 174182)
+++ ChangeLog   (working copy)
@@ -1,3 +1,9 @@
+2011-05-18  Jan Hubicka  j...@suse.cz
+
+   PR tree-optimization/44897 
+   * ipa-inline.c (cgraph_decide_recursive_inlining): Do not divide
+   by zero for insane profiles.
+
 2011-05-24  Eric Botcazou  ebotca...@adacore.com
 
* config/sparc/sparc.c (sparc_option_override): If not set by the user,
Index: ipa-inline.c
===
--- ipa-inline.c(revision 173893)
+++ ipa-inline.c(working copy)
@@ -895,7 +895,7 @@ cgraph_decide_recursive_inlining (struct
  continue;
}
 
-  if (max_count)
+  if (max_count && node->count)
{
   if (!cgraph_maybe_hot_edge_p (curr))
{


Re: [patch ada]: Fix bootstrap for Ada

2011-05-25 Thread Kai Tietz
2011/5/24 Arnaud Charlet char...@adacore.com:
  I'm confused. The above looks wrong to me: it does not return an empty
  string,
  it returns a pointer to an uninitialized string, which cannot be right
  (and
  should generate a warning :-)

 No, static vars are implicitly zero initialized when not explicitly
 initialized.

 Hmm I see. Still, the above code is not easy to read IMO.

 I'd suggest instead the following which is easier to read and understand:

 __gnat_to_canonical_file_list_next (void)
 {
  static char empty[] = "";
  return empty;
 }

 That's actually a change I was about to commit since we've done it recently
 at AdaCore, so OK with the above variant.

 Arno

OK, applied the patch as you suggested at revision 174185.  Not sure
whether this is more readable, but anyway.
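
For reference, the two styles being compared can be put side by side.  This is a sketch assuming the original code looked roughly like the first variant (the original is not quoted in this message); both function names are made up.

```c
/* Variant 1: relies on the rule that objects with static storage
   duration are zero-initialized, so empty[0] is already '\0'.  */
static char *
empty_implicit (void)
{
  static char empty[1];
  return empty;
}

/* Variant 2: spells the empty string out, as suggested in the thread.  */
static char *
empty_explicit (void)
{
  static char empty[] = "";
  return empty;
}
```

Both return a valid, writable, zero-length C string; the difference is only in how obvious that is to a reader.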

Regards,
Kai


Re: [testsuite] remove XFAIL for all but ia64 for g++.dg/tree-ssa/pr43411.C

2011-05-25 Thread Jan Hubicka
On Wed 25 May 2011 11:04:06 AM CEST, Richard Guenther
richard.guent...@gmail.com wrote:



On Wed, May 25, 2011 at 10:38 AM, Rainer Orth
r...@cebitec.uni-bielefeld.de wrote:

Janis Johnson jani...@codesourcery.com writes:


Archived test results for 4.7.0 for most processors with C++ results have:

XPASS: g++.dg/tree-ssa/pr43411.C scan-tree-dump-not optimized   
OBJ_TYPE_REF


The only failures I could find were for ia64-linux and ia64-hpux.  This
patch changes the xfail so it only applies to ia64-*-*.  OK for trunk?


Richard rejected a similar patch:

       http://gcc.gnu.org/ml/gcc-patches/2011-05/msg00054.html

Perhaps Jan can suggest the correct approach?


We should verify that the call to val is inlined in all functions.
Maybe rename it to something larger and scan the optimized
dump so that name doesn't appear.

Indeed, this seems to be the safest approach I can think of.
If the function is supposed to be optimized out completely by early
passes, we should just search the release_ssa dump.  That is not the case
here, and dumping IPA info for inlining all instances would be a bit tricky.


Honza


Re: [pph] Regularize Streaming (issue4528096)

2011-05-25 Thread Diego Novillo
On Tue, May 24, 2011 at 22:42, Lawrence Crowl cr...@google.com wrote:

   For TEMPLATE_DECL, also stream DECL_MEMBER_TEMPLATE_P.

We don't really need to handle this.  DECL_MEMBER_TEMPLATE_P is using
DECL_LANG_FLAG_1.  All the lang_flag fields are already handled in
pph_stream_unpack_value_fields (and its counterpart).  Besides, this
patch is writing DECL_MEMBER_TEMPLATE_P but it is not reading it back;
this will cause stream synchronization problems.

 The code edits do NOT conform with the gcc style.  This is deliberate
 so that diff reports sensible differences.  I will make a separate
 patch to fix the style.

*gasp* I am horrified! ;)

 +    case USING_DECL:
 +    case VAR_DECL:
        {
 +      /* FIXME pph: Should we merge DECL_INITIAL into lang_specific? */

Hm?


 +    case TEMPLATE_DECL:
     {
 +      pph_output_tree_or_ref_1 (stream, DECL_INITIAL (expr), ref_p, 3);
 +      pph_stream_write_lang_specific (stream, expr, ref_p);
       pph_output_tree_or_ref_1 (stream, DECL_TEMPLATE_RESULT (expr), ref_p, 3);
       pph_output_tree_or_ref_1 (stream, DECL_TEMPLATE_PARMS (expr), ref_p, 3);
       pph_output_tree_or_ref_1 (stream, DECL_CONTEXT (expr), ref_p, 3);
 +      pph_output_uchar (stream, DECL_MEMBER_TEMPLATE_P (expr));

There does not seem to be a read operation for DECL_MEMBER_TEMPLATE_P.
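
The synchronization problem is easy to reproduce in miniature.  The sketch below is purely illustrative and does not use the pph API: struct stream and the record functions are invented names, with one byte per field.

```c
#include <stddef.h>

struct stream
{
  unsigned char data[16];
  size_t wpos, rpos;
};

static void
put_uchar (struct stream *s, unsigned char c)
{
  s->data[s->wpos++] = c;
}

static unsigned char
get_uchar (struct stream *s)
{
  return s->data[s->rpos++];
}

/* Writer: emits two bytes per record, a tag followed by a flag.  */
static void
write_record (struct stream *s, unsigned char tag, unsigned char flag)
{
  put_uchar (s, tag);
  put_uchar (s, flag);
}

/* Matching reader: consumes both bytes.  */
static unsigned char
read_record (struct stream *s, unsigned char *flag)
{
  unsigned char tag = get_uchar (s);
  *flag = get_uchar (s);
  return tag;
}

/* Mismatched reader: forgets the flag, so the following record is
   read starting one byte too early.  */
static unsigned char
read_record_without_flag (struct stream *s)
{
  return get_uchar (s);
}
```

After writing records (10, 1) and (20, 0), the mismatched reader returns 10 and then 1: the flag of the first record is taken for the tag of the second.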


Diego.


Re: PATCH: Add pause intrinsic

2011-05-25 Thread H.J. Lu
On Wed, May 25, 2011 at 3:26 AM, Andrew Haley a...@redhat.com wrote:
 On 05/24/2011 07:28 PM, H.J. Lu wrote:

 This patch implements pause intrinsic suggested by Andi.  OK
 for trunk?

 What does full memory barrier here mean?

 +@table @code
 +@item void __builtin_ia32_pause (void)
 +Generates the @code{pause} machine instruction with full memory barrier.
 +@end table

 There a memory clobber, but no barrier instruction AFAICS.  The
 doc needs to explain it a bit better.


There are read/load memory barriers, write/store memory barriers and
full/general memory barriers.  You can find them described at

http://www.kernel.org/doc/Documentation/memory-barriers.txt

Should I include a pointer to it?

-- 
H.J.


Re: [patch][simplify-rtx] Fix 16-bit -> 64-bit multiply and accumulate

2011-05-25 Thread Andrew Stubbs

On 24/05/11 20:35, Joseph S. Myers wrote:

On Tue, 24 May 2011, Andrew Stubbs wrote:


I've created this new, simpler patch that converts

   (extend (mult a b))

into

   (mult (extend a) (extend b))

regardless of what 'a' and 'b' might be. (These are then simplified and
superfluous extends removed, of course.)


Are there some missing conditions here?  The two aren't equivalent in
general - (extend:SI (mult:HI a b)) multiplies the HImode values in HImode
(with modulo arithmetic on overflow) before extending the possibly wrapped
result to SImode.


Ok, I've now modified my patch to prevent it widening regular multiplies.

It now converts

   (extend (mult (extend a) (extend b)))

to

   (mult (newextend a) (newextend b))

But I also have it convert

   (extend (mult (shift a) (extend b)))

to

   (mult (newextend (shift a)) (newextend b))

The latter case is to catch widening multiplies that extract a subreg 
using a shift. I don't understand why it doesn't just use subreg in the 
first place, but apparently it doesn't, and changing it to do that would 
no doubt break many existing machine descriptions.


The latter case also happens to catch cases where an extend is 
represented by (ashiftrt (ashift x)), which is nice.


I know that, potentially, not all shifted operands are going to be 
widening multiplies, but I *think* this should be safe because other 
random shift values are unlikely to match a real widening mult 
instruction (and if they do then the code would already be broken). If 
somebody knows a reason why this isn't safe then I think I'm going to 
need some help figuring out what conditions to use.


OK?

Andrew
2011-05-25  Bernd Schmidt  ber...@codesourcery.com
	Andrew Stubbs  a...@codesourcery.com

	gcc/
	* simplify-rtx.c (simplify_unary_operation_1): Canonicalize widening
	multiplies.
	* doc/md.texi (Canonicalization of Instructions): Document widening
	multiply canonicalization.

	gcc/testsuite/
	* gcc.target/arm/mla-2.c: New test.

--- a/gcc/doc/md.texi
+++ b/gcc/doc/md.texi
@@ -5840,6 +5840,21 @@ Equality comparisons of a group of bits (usually a single bit) with zero
 will be written using @code{zero_extract} rather than the equivalent
 @code{and} or @code{sign_extract} operations.
 
+@cindex @code{mult}, canonicalization of
+@item
+@code{(sign_extend:@var{m1} (mult:@var{m2} (sign_extend:@var{m2} @var{x})
+(sign_extend:@var{m2} @var{y})))} is converted to @code{(mult:@var{m1}
+(sign_extend:@var{m1} @var{x}) (sign_extend:@var{m1} @var{y}))}, and likewise
+for @code{zero_extend}.
+
+@item
+@code{(sign_extend:@var{m1} (mult:@var{m2} (ashiftrt:@var{m2}
+@var{x} @var{s}) (sign_extend:@var{m2} @var{y})))} is converted to
+@code{(mult:@var{m1} (sign_extend:@var{m1} (ashiftrt:@var{m2} @var{x} @var{s}))
+(sign_extend:@var{m1} @var{y}))}, and likewise for patterns using @code{zero_extend}
+and @code{lshiftrt}.  If the second operand of @code{mult} is also a shift,
+then that is extended also.
+
 @end itemize
 
 Further canonicalization rules are defined in the function
--- a/gcc/simplify-rtx.c
+++ b/gcc/simplify-rtx.c
@@ -1000,6 +1000,34 @@ simplify_unary_operation_1 (enum rtx_code code, enum machine_mode mode, rtx op)
 	  && GET_CODE (XEXP (XEXP (op, 0), 1)) == LABEL_REF)
 	return XEXP (op, 0);
 
+  /* Extending a widening multiplication should be canonicalized to
+	 a wider widening multiplication.  */
+  if (GET_CODE (op) == MULT)
+	{
+	  rtx lhs = XEXP (op, 0);
+	  rtx rhs = XEXP (op, 1);
+	  enum rtx_code lcode = GET_CODE (lhs);
+	  enum rtx_code rcode = GET_CODE (rhs);
+
+	  /* Widening multiplies usually extend both operands, but sometimes
+	 they use a shift to extract a portion of a register. We assume
+	 it is safe to widen all such operands because other examples
+	 won't match real instructions.  */
+	  if ((lcode == SIGN_EXTEND || lcode == ASHIFTRT)
+	      && (rcode == SIGN_EXTEND || rcode == ASHIFTRT))
+	{
+	  enum machine_mode lmode = GET_MODE (lhs);
+	  enum machine_mode rmode = GET_MODE (rhs);
+	  return simplify_gen_binary (MULT, mode,
+	  simplify_gen_unary (SIGN_EXTEND,
+			  mode,
+			  lhs, lmode),
+	  simplify_gen_unary (SIGN_EXTEND,
+			  mode,
+			  rhs, rmode));
+	}
+	}
+
   /* Check for a sign extension of a subreg of a promoted
 	 variable, where the promotion is sign-extended, and the
 	 target mode is the same as the variable's promotion.  */
@@ -1071,6 +1099,34 @@ simplify_unary_operation_1 (enum rtx_code code, enum machine_mode mode, rtx op)
 	  && GET_MODE_SIZE (mode) <= GET_MODE_SIZE (GET_MODE (XEXP (op, 0))))
 	return rtl_hooks.gen_lowpart_no_emit (mode, op);
 
+  /* Extending a widening multiplication should be canonicalized to
+	 a wider widening multiplication.  */
+  if (GET_CODE (op) == MULT)
+	{
+	  rtx lhs = XEXP (op, 0);
+	  rtx rhs = XEXP (op, 1);
+	  enum rtx_code lcode = GET_CODE (lhs);
+	  enum rtx_code rcode = GET_CODE (rhs);
+
+	  /* Widening multiplies 

Re: PATCH: Add pause intrinsic

2011-05-25 Thread H.J. Lu
On Wed, May 25, 2011 at 3:31 AM, Richard Guenther
richard.guent...@gmail.com wrote:
 On Wed, May 25, 2011 at 12:26 PM, Andrew Haley a...@redhat.com wrote:
 On 05/24/2011 07:28 PM, H.J. Lu wrote:

 This patch implements pause intrinsic suggested by Andi.  OK
 for trunk?

 What does full memory barrier here mean?

 +@table @code
 +@item void __builtin_ia32_pause (void)
 +Generates the @code{pause} machine instruction with full memory barrier.
 +@end table

 There a memory clobber, but no barrier instruction AFAICS.  The
 doc needs to explain it a bit better.

 The name also sounds odd to me (reminds me of Fortran PAUSE ...).


__builtin_ia32_pause is the C intrinsic for x86 machine instruction.
I don't think people will get confused by its name.


-- 
H.J.


Re: RFA PR 48770

2011-05-25 Thread Bernd Schmidt
On 05/24/2011 03:34 PM, Jeff Law wrote:
 
 This has gone latent on the trunk, but the underlying issue hasn't been
 resolved.
 
 ira.c::update_equiv_regs can create REG_EQUIV notes for equivalences
 which are local to a block rather than the traditional function-wide
 equivalences we typically work with.
 
 This occurs when we have an insn that loads a pseudo from a MEM and the
 pseudo is used within only a single block and the MEM remains unchanged
 through the life of the pseudo.
 
 Starting with the assumption that we're going to create a block local
 pseudo under the rules noted above, consider this RTL:
 
 (set (reg X) (some address))
 (set (reg Y) (mem (reg X)))
 (use Y)
 
 
 We're going to create an equivalence between (reg Y) and its memory
 location in update_equiv_regs.  Assume IRA is able to allocate a hard
 reg for reg X, but not reg Y.
 
 reload's strategy in this situation will be to remove the insn which
 creates the equivalence between reg Y and the memory location.  Uses of
 reg Y will be replaced with the equivalent memory location.
 
 That's all fine and good, except reload uses delete_dead_insn, which
 deletes the equivalencing insn, but also recursively tries to remove the
 prior insn if it becomes dead as a result of removing the equivalencing
 insn.
 
 Anyway, continuing with our example, reg X gets a hard reg, so our RTL
 will look something like
 
 (set (reg 0) (some address))
 (set (reg Y) (mem (reg 0)))
 (use Y)
 
 Then we remove the equivalencing insn resulting in
 
 (set (reg 0) (some address))
 (use Y)
 
 And we recurse from delete_dead_insn and determine that the first insn
 was dead as well, so it gets removed leaving:
 
 (use Y)
 
 We then replace Y with its equivalent memory location
 
 (use (mem (reg 0)))
 
 At which point we lose because hard reg 0 is no longer initialized.
 
 
 The code in question is literally 20 years old and predates running any
 real dead code elimination after reload.  ISTM the right thing to do is
 stop using delete_dead_insn in this code and let the post-reload DCE
 pass do its job.  That allows us to continue to record the block local
 equivalence.

Sounds like the right thing to do. OK. (Can we eliminate the other caller?)

I've looked at code generation; it appears unchanged on i686-linux,
which I think is the expected result. There are minor differences in
assembly output on mips64-linux. If you want to look at it, I'm
attaching a testcase - compile with -O2 -fno-reorder-blocks.


Bernd
  typedef _Bool bool;
  typedef struct {  volatile int counter; }
 atomic_t;
struct list_head {  struct list_head *next, *prev; };
   typedef void (*ctor_fn_t)(void);
struct kref {  atomic_t refcount; };
  struct kobject {  const char *name;  struct list_head entry;  struct kobject 
*parent;  struct kset *kset;  struct kobj_type *ktype;  struct sysfs_dirent 
*sd;  struct kref kref;  unsigned int state_initialized:1;  unsigned int 
state_in_sysfs:1;  unsigned int state_add_uevent_sent:1;  unsigned int 
state_remove_uevent_sent:1;  unsigned int uevent_suppress:1; };
  static inline __attribute__((always_inline)) void  _do_trace_module_get 
 (void (*probe)(char *name, bool wait, unsigned long ip)) { return -38; }
  struct module_kobject {  struct kobject kobj;  struct module *mod;  struct 
kobject *drivers_dir;  struct module_param_attrs *mp; };
enum module_state {  MODULE_STATE_LIVE,  MODULE_STATE_COMING,  
MODULE_STATE_GOING, };
  struct module {  enum module_state state;struct list_head list;char 
name[(64 - sizeof(unsigned long))];struct module_kobject mkobj;  struct 
module_attribute *modinfo_attrs;  const char *version;  const char *srcversion; 
 struct kobject *holders_dir;const struct kernel_symbol *syms;  const 
unsigned long *crcs;  unsigned int num_syms;struct kernel_param *kp;  
unsigned int num_kp;unsigned int num_gpl_syms;  const struct kernel_symbol 
*gpl_syms;  const unsigned long *gpl_crcs; 
 struct list_head modules_which_use_me;struct task_struct *waiter;void 
(*exit)(void);   struct module_ref {   unsigned int incs;   unsigned int decs;  
} *refptr;  ctor_fn_t *ctors;  unsigned int num_ctors;  };
static inline __attribute__((always_inline)) int bscnl_emit(char *buf, int buflen, int rbot, int rtop, int len)
{
  if (len > 0)
    len += scnprintf(buf + len, buflen - len, ",");
  if (rbot == rtop)
    len += scnprintf(buf + len, buflen - len, "%d", rbot);
  else
    len += scnprintf(buf + len, buflen - len, "%d-%d", rbot, rtop);
  return len;
}
int bitmap_scnlistprintf(char *buf, unsigned int buflen, const unsigned long *maskp, int nmaskbits)
{
  int len = 0;
  int cur, rbot, rtop;
  if (buflen == 0)
    return 0;
  buf[0] = 0;
  rbot = cur = find_next_bit((maskp), (nmaskbits), 0);
  while (cur < nmaskbits) {
    rtop = cur;
    cur = find_next_bit(maskp, nmaskbits, cur+1);
    if (cur >= nmaskbits || cur > rtop + 1) {
      len = bscnl_emit(buf, buflen, rbot, rtop, len);
      rbot = cur;
    }
  }
  return len;
}


Re: [PATCH] Fix a typo in i386 host_detect_local_cpu (PR target/49128)

2011-05-25 Thread H.J. Lu
On Wed, May 25, 2011 at 12:15 AM, Jakub Jelinek ja...@redhat.com wrote:
 Hi!

 Committed as obvious.

 2011-05-25  Jakub Jelinek  ja...@redhat.com

        PR target/49128
        * config/i386/driver-i386.c (host_detect_local_cpu): Fix a typo.

 --- gcc/config/i386/driver-i386.c       (revision 174170)
 +++ gcc/config/i386/driver-i386.c       (revision 174171)
 @@ -696,7 +696,7 @@ const char *host_detect_local_cpu (int a
       const char *bmi = has_bmi ? " -mbmi" : " -mno-bmi";
       const char *tbm = has_tbm ? " -mtbm" : " -mno-tbm";
       const char *avx = has_avx ? " -mavx" : " -mno-avx";
 -      const char *sse4_2 = has_sse4_2 ? " -msse4.2" : " -mno-msse4.2";
 +      const char *sse4_2 = has_sse4_2 ? " -msse4.2" : " -mno-sse4.2";
       const char *sse4_1 = has_sse4_1 ? " -msse4.1" : " -mno-sse4.1";

       options = concat (options, cx16, sahf, movbe, ase, pclmul,


Thanks.


-- 
H.J.


Re: [patch][simplify-rtx] Fix 16-bit -> 64-bit multiply and accumulate

2011-05-25 Thread Joseph S. Myers
On Wed, 25 May 2011, Andrew Stubbs wrote:

 I know that, potentially, not all shifted operands are going to be widening
 multiplies, but I *think* this should be safe because other random shift
 values are unlikely to match a real widening mult instruction (and if they do
 then the code would already be broken). If somebody knows a reason why this
 isn't safe then I think I'm going to need some help figuring out what
 conditions to use.

Random supposition like that is not a sensible basis for modifying GCC.

I haven't managed to produce an example of code demonstrating the problem, 
but that's probably because I'm not sufficiently familiar with all the RTL 
optimizers.  Where is the guarantee that the inputs to these functions 
must represent real instructions, or that the outputs will only be used if 
they represent real instructions?  Where are the assertions to ensure that 
wrong code is not quietly generated if this is not the case?  Where is the 
documentation of what instruction patterns it is not permitted to put in 
.md files because they would violate the assumptions about what 
instructions you are permitted to represent in RTL?  How have you checked 
there are no existing problematic instruction patterns?

RTL has defined abstract semantics and RTL transformations should be ones 
that are valid in accordance with those semantics, with proper assertions 
if there are additional constraints on the input passed to a function.  
This means actually counting the numbers of variable bits in the operands 
to determine whether the multiplication could overflow.
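
As a hedged illustration of what "counting bits" could look like for the unsigned case, consider the sketch below.  It is not GCC code: inside the compiler this kind of reasoning would go through helpers such as nonzero_bits and num_sign_bit_copies, whereas here plain integers and the made-up functions significant_bits and unsigned_mult_cannot_wrap_p are used.

```c
#include <stdbool.h>
#include <stdint.h>

/* Number of bits needed to represent the unsigned value X
   (0 for X == 0).  */
static unsigned
significant_bits (uint32_t x)
{
  unsigned n = 0;

  while (x)
    {
      n++;
      x >>= 1;
    }
  return n;
}

/* If A < 2^BITS_A and B < 2^BITS_B then A * B < 2^(BITS_A + BITS_B),
   so the unsigned product fits in MODE_BITS bits whenever the sum does
   not exceed MODE_BITS.  The check is conservative: it may reject
   products that happen to fit.  */
static bool
unsigned_mult_cannot_wrap_p (unsigned bits_a, unsigned bits_b,
                             unsigned mode_bits)
{
  return bits_a + bits_b <= mode_bits;
}
```

For a 16-bit mode, two operands with at most 8 significant bits each pass the check, while two 9-bit operands do not, even though some 9-bit by 9-bit products would still fit.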

-- 
Joseph S. Myers
jos...@codesourcery.com


Re: [ C++ 4.6 Patch] allow uninitialized const or reference members with -fpermissive

2011-05-25 Thread Jason Merrill

OK, thanks.

Jason


Re: Prefixes for libgcc symbols (C6X 9.5/11)

2011-05-25 Thread H.J. Lu
On Fri, May 13, 2011 at 9:10 AM, Bernd Schmidt ber...@codesourcery.com wrote:
 On 05/13/2011 04:26 PM, Joseph S. Myers wrote:
 On Fri, 13 May 2011, Bernd Schmidt wrote:

 The following patch adds a target hook and a corresponding LIBGCC2_
 macro which control the generation of library function names. It also
 makes libgcc-std.ver a generated file, built from libgcc-std.ver.in by
 replacing some placeholders with the correct prefixes. While I was
 there, I also added functionality to generate a version of this file
 with an extra underscore for the Blackfin port.

 But the linker was changed to use C symbol names in linker scripts and I
 was told that this script in GCC would be removed in consequence.

 http://sourceware.org/ml/binutils/2010-12/msg00375.html

 Oh well. Dropped.

 Any new target macro for use only in target libraries should, in my view,
 be poisoned in the host system.h from the start to ensure that no-one
 accidentally adds definitions to the host tm.h.  This would be alongside
 the existing

 /* Target macros only used for code built for the target, that have
    moved to libgcc-tm.h.  */
  #pragma GCC poison DECLARE_LIBRARY_RENAMES

 Done. New patch below, now testing.



I think it may have caused:

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=49160


-- 
H.J.


Re: [patch][simplify-rtx] Fix 16-bit -> 64-bit multiply and accumulate

2011-05-25 Thread Andrew Stubbs

On 25/05/11 14:19, Joseph S. Myers wrote:

RTL has defined abstract semantics and RTL transformations should be ones
that are valid in accordance with those semantics, with proper assertions
if there are additional constraints on the input passed to a function.
This means actually counting the numbers of variable bits in the operands
to determine whether the multiplication could overflow.


Ok, fair enough, so how can I identify a valid subreg extraction that is 
defined in terms of shifts?


The case that I care about is simple enough:

   (mult:SI (ashiftrt:SI (reg:SI rM)
 (const_int 16))
(sign_extend:SI (subreg:HI (reg:SI rN) 0)))

I guess that's just equivalent to this:

   (mult:SI (sign_extend:SI (subreg:HI (reg:SI rM) 4)))
(sign_extend:SI (subreg:HI (reg:SI rN) 0)))

but it chooses not to represent it that way, which is less than helpful 
in this case.


So I could just scan for that exact pattern, or perhaps look for shift 
sizes that are half the size of the register, or some such thing, but is 
that general enough? Or is it too general again?


Is there anything else I've missed?

Andrew


Re: PATCH: Add pause intrinsic

2011-05-25 Thread Uros Bizjak
On Tue, May 24, 2011 at 8:28 PM, H.J. Lu hjl.to...@gmail.com wrote:

 This patch implements pause intrinsic suggested by Andi.  OK
 for trunk?

 gcc/

 2011-05-24  H.J. Lu  hongjiu...@intel.com

        * config/i386/i386.c (ix86_builtins): Add IX86_BUILTIN_PAUSE.
        (bdesc_special_args): Add pause intrinsic.

        * config/i386/i386.md (UNSPEC_PAUSE): New.
        (pause): Likewise.
        (*pause): Likewise.
        * config/i386/ia32intrin.h (__pause): Likewise.

        * doc/extend.texi (X86 Built-in Functions): Add documentation for
        pause intrinsic.

 gcc/testsuite/

 2011-05-24  H.J. Lu  hongjiu...@intel.com

         * gcc.target/i386/pause-1.c: New.

OK.

Thanks,
Uros.


Re: [ARM] fix C++ EH interoperability

2011-05-25 Thread Nathan Sidwell

On 05/23/11 16:54, Andrew Haley wrote:

On 05/23/2011 04:52 PM, Nathan Sidwell wrote:

This patch fixes an interoperability issue with code generated by ARM's EABI
compiler.



This patch has been tested for arm-linux, and independently tested by
ARM with mixed RVCT-generated code confirming the defect has been fixed.

ok?


What did the Java test results look like?


They are unchanged.

nathan

--
Nathan Sidwell



Re: Prefixes for libgcc symbols (C6X 9.5/11)

2011-05-25 Thread Bernd Schmidt
On 05/25/2011 01:37 PM, H.J. Lu wrote:

 I think it may have caused:
 
 http://gcc.gnu.org/bugzilla/show_bug.cgi?id=49160

Looks like it. Not quite sure how to fix it yet. Do you know what files
such as i386/64/_divtc3.c are trying to achieve?


Bernd



Re: [PATCH PR45098, 4/10] Iv init cost.

2011-05-25 Thread Richard Sandiford
Sorry for being so late.  I was just curious...

Tom de Vries vr...@codesourcery.com writes:
 The init cost of an iv will in general not be zero. It will be
 exceptional that the iv register happens to be initialized with the
 proper value at no cost. In general, there will at the very least be a
 regcopy or a const set.

 2011-05-05  Tom de Vries  t...@codesourcery.com

   PR target/45098
   * tree-ssa-loop-ivopts.c (determine_iv_cost): Prevent
   cost_base.cost == 0.
 Index: gcc/tree-ssa-loop-ivopts.c
 ===
 --- gcc/tree-ssa-loop-ivopts.c(revision 173380)
 +++ gcc/tree-ssa-loop-ivopts.c(working copy)
 @@ -4688,6 +4688,8 @@ determine_iv_cost (struct ivopts_data *d
  
base = cand->iv->base;
cost_base = force_var_cost (data, base, NULL);
 +  if (cost_base.cost == 0)
 +  cost_base.cost = COSTS_N_INSNS (1);
cost_step = add_cost (TYPE_MODE (TREE_TYPE (base)), data->speed);
  
cost = cost_step + adjust_setup_cost (data, cost_base.cost);

...why does this reasoning apply only to this call to force_var_cost?

Richard
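
To make the question concrete: if the same floor were wanted at every call site, it could live in one helper instead of being open-coded after a single force_var_cost call. The following is only a sketch under stated assumptions, not what the patch does. The struct and the helper name clamp_init_cost are hypothetical, and the COSTS_N_INSNS definition is assumed to match GCC's usual scaling of four cost units per instruction.

```c
#include <assert.h>

/* Assumption: same scaling as GCC's COSTS_N_INSNS (N insns -> N * 4 units).  */
#define COSTS_N_INSNS(N) ((N) * 4)

/* Hypothetical stand-in for the cost record used by ivopts.  */
struct comp_cost_sketch
{
  int cost;        /* runtime cost in COSTS_N_INSNS units */
  int complexity;  /* tie-breaker, left untouched here */
};

/* Never report a free initialization: a zero cost is raised to the
   cost of one instruction (a register copy or a constant set).  */
static struct comp_cost_sketch
clamp_init_cost (struct comp_cost_sketch c)
{
  assert (c.cost >= 0);
  if (c.cost == 0)
    c.cost = COSTS_N_INSNS (1);
  return c;
}
```

Wrapping each force_var_cost result in such a helper would apply the reasoning uniformly; whether that is desirable is exactly the open question.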


Re: C6X port 9/11: Allow defining attributes in terms of another

2011-05-25 Thread Hans-Peter Nilsson
On Wed, 25 May 2011, Bernd Schmidt wrote:

 I'm not sure where you're looking for added documentation for this
 patch.

I guess no surprise that'd be md.texi node "Defining Attributes",
or an updated example in node "Attr Example" since the
documentation for "default" basically just refers to it.  Or
perhaps better node "Expressions" where (attr x) is documented,
since it says it's mostly useful for numeric attributes and not
so for non-numeric attributes.  Perhaps add after that sentence
"It can also be used to yield the value of another attribute,
useful to e.g. set the value of the current attribute if they
share a domain".  You can probably find a better wording. :)

 It just generalizes the define_attr mechanism a little to
 allow one more kind of expression.

Yes, the documentation is a bit terse, isn't it.  But the idea
that you can redirect to another attribute instead of referring
to it in a conditional like in eq_attr seems to me new enough to
warrant a line.

brgds, H-P


Re: [patch][simplify-rtx] Fix 16-bit -> 64-bit multiply and accumulate

2011-05-25 Thread Joseph S. Myers
On Wed, 25 May 2011, Andrew Stubbs wrote:

 On 25/05/11 14:19, Joseph S. Myers wrote:
  RTL has defined abstract semantics and RTL transformations should be ones
  that are valid in accordance with those semantics, with proper assertions
  if there are additional constraints on the input passed to a function.
  This means actually counting the numbers of variable bits in the operands
  to determine whether the multiplication could overflow.
 
 Ok, fair enough, so how can I identify a valid subreg extraction that is
 defined in terms of shifts?

The shift must be by a positive constant amount, strictly less than the 
precision (GET_MODE_PRECISION) of the mode (of the value being shifted).  
If that applies, the relevant number of bits is the precision of the mode 
minus the number of bits of the shift.  For an extension, just take the 
number of bits in the inner mode.  Add the two numbers of bits; if the 
result does not exceed the number of bits in the mode (of the operands and 
the multiplication) then the multiplication won't overflow.

As in your patch, either all the operands must be sign-extensions / 
arithmetic shifts (and then the result is equivalent to a widening signed 
multiply), or all must be zero-extensions / logical shifts (and the result 
is a widening unsigned multiply).
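
The rule above can be sketched in plain C, outside of GCC. This is an illustration only: the operand description (struct operand, enum operand_kind) and the function names are hypothetical and are not GCC's RTL API; precisions are passed in as plain integers where real code would query GET_MODE_PRECISION.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical description of one operand of the multiplication.  */
enum operand_kind { OP_RIGHT_SHIFT, OP_EXTENSION };

struct operand
{
  enum operand_kind kind;
  bool is_signed;       /* arithmetic shift or sign_extend vs. logical shift or zero_extend */
  int outer_precision;  /* precision of the mode of the value being shifted */
  int shift_count;      /* constant shift amount (OP_RIGHT_SHIFT only) */
  int inner_precision;  /* precision of the inner mode (OP_EXTENSION only) */
};

/* Number of variable bits in OP, or -1 if OP does not satisfy the
   constraints (shift count not strictly between 0 and the precision).  */
static int
significant_bits (const struct operand *op)
{
  if (op->kind == OP_RIGHT_SHIFT)
    {
      if (op->shift_count <= 0 || op->shift_count >= op->outer_precision)
        return -1;
      return op->outer_precision - op->shift_count;
    }
  return op->inner_precision > 0 ? op->inner_precision : -1;
}

/* True if A * B, computed in a mode of MULT_PRECISION bits, cannot
   overflow and both operands agree on signedness.  */
bool
mult_cannot_overflow (const struct operand *a, const struct operand *b,
                      int mult_precision)
{
  int bits_a = significant_bits (a);
  int bits_b = significant_bits (b);

  assert (mult_precision > 0);
  if (bits_a < 0 || bits_b < 0)
    return false;
  if (a->is_signed != b->is_signed)
    return false;
  return bits_a + bits_b <= mult_precision;
}
```

For the case in the original question (an SImode arithmetic shift right by 16 multiplied by a sign-extended HImode value), the two counts are 16 and 16, which sum to 32 and fit in SImode.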

-- 
Joseph S. Myers
jos...@codesourcery.com


Re: Prefixes for libgcc symbols (C6X 9.5/11)

2011-05-25 Thread Bernd Schmidt
On 05/25/2011 01:45 PM, H.J. Lu wrote:
 On Wed, May 25, 2011 at 6:42 AM, Bernd Schmidt ber...@codesourcery.com 
 wrote:
 On 05/25/2011 01:37 PM, H.J. Lu wrote:

 I think it may have caused:

 http://gcc.gnu.org/bugzilla/show_bug.cgi?id=49160

 Looks like it. Not quite sure how to fix it yet. Do you know what files
 such as i386/64/_divtc3.c are trying to achieve?

 
 It provides backward compatibility with symbol versioning:
 
 [hjl@gnu-4 64]$ readelf -s /lib64/libgcc_s.so.1| grep __powitf2
 52: 003e8a80d170   167 FUNC    GLOBAL DEFAULT   12 __powitf2@@GCC_4.3.0
 54: 003e8a80d170   167 FUNC    GLOBAL DEFAULT   12 __powitf2@GCC_4.0.0
 [hjl@gnu-4 64]$

That leaves me as clueless as before. Why does i386/64 need this but not
other targets (such as i386/32), and why only those three functions
(from the ones in libgcc)?

Anyhow, below is one possible way of fixing it.


Bernd

PR bootstrap/49160
* libgcc2.h (__powisf2, __powidf2, __powitf2, __powixf2,
__mulsc3, __muldc3, __mulxc3, __multc3, __divsc3, __divdc3,
__divxc3, __divtc3): Wrap definitions in #ifndef.

Index: gcc/libgcc2.h
===
--- gcc/libgcc2.h   (revision 174187)
+++ gcc/libgcc2.h   (working copy)
@@ -324,23 +324,48 @@ typedef int shift_count_type __attribute
 #define __parityDI2    __NDW(parity,2)
 
 #define __clz_tab  __N(clz_tab)
+#define __bswapsi2 __N(bswapsi2)
+#define __bswapdi2 __N(bswapdi2)
+#define __udiv_w_sdiv  __N(udiv_w_sdiv)
+#define __clear_cache  __N(clear_cache)
+#define __enable_execute_stack __N(enable_execute_stack)
+
+#ifndef __powisf2
 #define __powisf2  __N(powisf2)
+#endif
+#ifndef __powidf2
 #define __powidf2  __N(powidf2)
+#endif
+#ifndef __powitf2
 #define __powitf2  __N(powitf2)
+#endif
+#ifndef __powixf2
 #define __powixf2  __N(powixf2)
-#define __bswapsi2 __N(bswapsi2)
-#define __bswapdi2 __N(bswapdi2)
+#endif
+#ifndef __mulsc3
 #define __mulsc3   __N(mulsc3)
+#endif
+#ifndef __muldc3
 #define __muldc3   __N(muldc3)
+#endif
+#ifndef __mulxc3
 #define __mulxc3   __N(mulxc3)
+#endif
+#ifndef __multc3
 #define __multc3   __N(multc3)
+#endif
+#ifndef __divsc3
 #define __divsc3   __N(divsc3)
+#endif
+#ifndef __divdc3
 #define __divdc3   __N(divdc3)
+#endif
+#ifndef __divxc3
 #define __divxc3   __N(divxc3)
+#endif
+#ifndef __divtc3
 #define __divtc3   __N(divtc3)
-#define __udiv_w_sdiv  __N(udiv_w_sdiv)
-#define __clear_cache  __N(clear_cache)
-#define __enable_execute_stack __N(enable_execute_stack)
+#endif
 
 extern DWtype __muldi3 (DWtype, DWtype);
 extern DWtype __divdi3 (DWtype, DWtype);
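
As a reading aid, here is a minimal standalone sketch of why the #ifndef wrappers help: a file that has already defined one of the names to something else (as the compatibility files apparently do) keeps its own definition, while every other name still gets the generic prefixed one. All identifiers below (MY_N, my_powi, my_mul and their expansions) are hypothetical stand-ins, not the real libgcc macros.

```c
#include <assert.h>

/* Hypothetical target-specific file: it picks its own symbol name
   before the generic header is seen.  */
#define my_powi my_powi_compat

/* Hypothetical generic header: only fills in names nobody has claimed.  */
#define MY_N(a) prefixed_ ## a
#ifndef my_powi
#define my_powi MY_N(powi)
#endif
#ifndef my_mul
#define my_mul MY_N(mul)
#endif

/* This definition is emitted under the name my_powi_compat,
   because the override above wins.  */
int
my_powi (int base, int exp)
{
  int result = 1;
  assert (exp >= 0);
  for (int i = 0; i < exp; i++)
    result *= base;
  return result;
}

/* This definition is emitted under the name prefixed_mul.  */
int
my_mul (int a, int b)
{
  return a * b;
}
```

Without the guard, the generic header would redefine my_powi and the compiler would diagnose a conflicting macro definition.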


Re: PATCH: PR target/49142: Invalid 8bit register operand

2011-05-25 Thread Uros Bizjak
On Tue, May 24, 2011 at 5:54 PM, H.J. Lu hongjiu...@intel.com wrote:
 Hi,

 We are working on a new optimization, which turns off TARGET_MOVX.
 GCC generates:

 movb %ah, %dil

 But %ah can only be used with %[abcd][hl].  This patch adds QIreg_operand
 and uses it in *movqi_extv_1_rex64/*movqi_extzv_2_rex64.  OK for trunk
 if there is no regression?

If this is the case, then please change q_regs_operand predicate to
accept just QI_REG_P registers.

Uros.


[PATCH PING] unreviewed tree-slimming patches

2011-05-25 Thread Nathan Froyd
These patches:

  (C, C++, middle-end)
  [PATCH 14/18] move TS_STATEMENT_LIST to be a substructure of TS_TYPED
  http://gcc.gnu.org/ml/gcc-patches/2011-03/msg00560.html

  (C, Java, middle-end)
  [PATCH 18/18] make TS_BLOCK a substructure of TS_BASE
  http://gcc.gnu.org/ml/gcc-patches/2011-03/msg00564.html

are still pending review.  Jason commented on the TS_STATEMENT_LIST patch, but
the discussion didn't come to a resolution.  I forgot to CC the TS_BLOCK patch
to the Java folks the first time around.

Thanks,
-Nathan


C++ PATCH to cp_common_init_ts to fix crash in print_node

2011-05-25 Thread Jason Merrill
Trying to print a TYPE_ARGUMENT_PACK in the debugger with debug_tree 
crashes because print_node assumes that all types have TS_COMMON.  Fixed 
thus.


Tested x86_64-pc-linux-gnu, applying to trunk as obvious.
commit 7e5c923a908bffb2d8f8404f6cc7fd81a85bf932
Author: Jason Merrill ja...@redhat.com
Date:   Tue May 24 23:16:23 2011 -0400

	* cp-objcp-common.c (cp_common_init_ts): TYPE_ARGUMENT_PACK has
	TS_COMMON.

diff --git a/gcc/cp/cp-objcp-common.c b/gcc/cp/cp-objcp-common.c
index ed85491..df6b1dd 100644
--- a/gcc/cp/cp-objcp-common.c
+++ b/gcc/cp/cp-objcp-common.c
@@ -241,6 +241,7 @@ cp_common_init_ts (void)
   MARK_TS_COMMON (UNDERLYING_TYPE);
   MARK_TS_COMMON (BASELINK);
   MARK_TS_COMMON (TYPE_PACK_EXPANSION);
+  MARK_TS_COMMON (TYPE_ARGUMENT_PACK);
   MARK_TS_COMMON (DECLTYPE_TYPE);
   MARK_TS_COMMON (BOUND_TEMPLATE_TEMPLATE_PARM);
   MARK_TS_COMMON (UNBOUND_CLASS_TEMPLATE);


C++ PATCH for c++/48292 (variadics and member templates)

2011-05-25 Thread Jason Merrill
Several parts of the variadic template code have had trouble dealing 
with partial instantiation; this is another one.


Tested x86_64-pc-linux-gnu, applying to trunk.
commit 0bbe297555a3e6585f1668266d965745df352ba4
Author: Jason Merrill ja...@redhat.com
Date:   Tue May 24 23:20:29 2011 -0400

	PR c++/48292
	* pt.c (tsubst_decl) [PARM_DECL]: Handle partial instantiation of
	function parameter pack.
	(tsubst_pack_expansion): Likewise.

diff --git a/gcc/cp/pt.c b/gcc/cp/pt.c
index bd9aeba..fc84314 100644
--- a/gcc/cp/pt.c
+++ b/gcc/cp/pt.c
@@ -8711,7 +8711,12 @@ tsubst_pack_expansion (tree t, tree args, tsubst_flags_t complain,
 		 have the wrong value for a recursive call.  Just make a
 		 dummy decl, since it's only used for its type.  */
 	  arg_pack = tsubst_decl (parm_pack, args, complain);
-	  arg_pack = make_fnparm_pack (arg_pack);
+	  if (arg_pack && FUNCTION_PARAMETER_PACK_P (arg_pack))
+		/* Partial instantiation of the parm_pack, we can't build
+		   up an argument pack yet.  */
+		arg_pack = NULL_TREE;
+	  else
+		arg_pack = make_fnparm_pack (arg_pack);
 	}
 	}
   else
@@ -9801,14 +9806,14 @@ tsubst_decl (tree t, tree args, tsubst_flags_t complain)
 if (DECL_TEMPLATE_PARM_P (t))
   SET_DECL_TEMPLATE_PARM_P (r);
 
-	/* An argument of a function parameter pack is not a parameter
-	   pack.  */
-	FUNCTION_PARAMETER_PACK_P (r) = false;
-
 if (expanded_types)
   /* We're on the Ith parameter of the function parameter
  pack.  */
   {
+		/* An argument of a function parameter pack is not a parameter
+		   pack.  */
+		FUNCTION_PARAMETER_PACK_P (r) = false;
+
 /* Get the Ith type.  */
 type = TREE_VEC_ELT (expanded_types, i);
 
diff --git a/gcc/testsuite/g++.dg/cpp0x/variadic109.C b/gcc/testsuite/g++.dg/cpp0x/variadic109.C
new file mode 100644
index 000..0ec69af
--- /dev/null
+++ b/gcc/testsuite/g++.dg/cpp0x/variadic109.C
@@ -0,0 +1,17 @@
+// PR c++/48292
+// { dg-options "-std=c++0x" }
+
+template <typename... Args> int g(Args...);
+
+template <int N = 0>
+struct A
+{
+template <typename... Args>
+static auto f(Args... args) -> decltype(g(args...));
+};
+
+int main()
+{
+A<>::f();
+return 0;
+}


Re: Fix for libobjc/48177. Can I apply it to 4.6 as well ?

2011-05-25 Thread Nicola Pero
 This patch fixes libobjc/48177.  I applied it to trunk.
 
 I'd like to apply this patch to the 4.6 branch too.  Do I need permission 
 from 
 a Release Manager ?

 They are always welcome to chime in, though, in this case the libobjc 
 maintainer can approve it.

Thanks Mike

I browsed the archives of gcc-patches for a while and as far as I can see,
you are right and other maintainers do approve patches for the 4.6 branch
(in their own areas) without waiting for a Release Manager to double-approve
each patch (and it makes sense).

So I applied it to the 4.6 branch too. :-)

Thanks



C++ PATCH for c++/45080 (lambda conversion in templates)

2011-05-25 Thread Jason Merrill
The lambda conversion operator isn't added to CLASSTYPE_DECL_LIST, so it 
got lost on instantiation.  But since we cut some corners building it up 
to reduce runtime overhead, it's easier to just add it again at 
instantiation time.


Tested x86_64-pc-linux-gnu, applying to trunk.
commit 3b93aba17af31a772141a871c3299250dbbda714
Author: Jason Merrill ja...@redhat.com
Date:   Wed May 25 01:21:49 2011 -0400

	PR c++/45080
	* pt.c (instantiate_class_template_1): Call maybe_add_lambda_conv_op.
	* semantics.c (lambda_function): Check COMPLETE_OR_OPEN_TYPE_P.

diff --git a/gcc/cp/pt.c b/gcc/cp/pt.c
index fc84314..bb4515b 100644
--- a/gcc/cp/pt.c
+++ b/gcc/cp/pt.c
@@ -8566,6 +8566,9 @@ instantiate_class_template_1 (tree type)
 	}
 }
 
+  if (CLASSTYPE_LAMBDA_EXPR (type))
+maybe_add_lambda_conv_op (type);
+
   /* Set the file and line number information to whatever is given for
  the class itself.  This puts error messages involving generated
  implicit functions at a predictable point, and the same point
diff --git a/gcc/cp/semantics.c b/gcc/cp/semantics.c
index 50f25f0..55ad117 100644
--- a/gcc/cp/semantics.c
+++ b/gcc/cp/semantics.c
@@ -8145,7 +8145,8 @@ lambda_function (tree lambda)
 type = lambda;
   gcc_assert (LAMBDA_TYPE_P (type));
   /* Don't let debug_tree cause instantiation.  */
-  if (CLASSTYPE_TEMPLATE_INSTANTIATION (type) && !COMPLETE_TYPE_P (type))
+  if (CLASSTYPE_TEMPLATE_INSTANTIATION (type)
+      && !COMPLETE_OR_OPEN_TYPE_P (type))
 return NULL_TREE;
   lambda = lookup_member (type, ansi_opname (CALL_EXPR),
 			  /*protect=*/0, /*want_type=*/false);
diff --git a/gcc/testsuite/g++.dg/cpp0x/lambda/lambda-conv5.C b/gcc/testsuite/g++.dg/cpp0x/lambda/lambda-conv5.C
new file mode 100644
index 000..53d8e99
--- /dev/null
+++ b/gcc/testsuite/g++.dg/cpp0x/lambda/lambda-conv5.C
@@ -0,0 +1,15 @@
+// PR c++/45080
+// { dg-options "-std=c++0x" }
+
+typedef void(*pfn)();
+
+template<typename=int>
+void f()
+{
+  pfn fn = []{};
+}
+
+void test()
+{
+  f();
+}


C++ PATCH for c++/45418 (list-initialization of member array)

2011-05-25 Thread Jason Merrill
The code in perform_member_init for handling arrays of non-trivial 
classes needed a tweak to handle list-initialization.


Tested x86_64-pc-linux-gnu, applying to trunk.
commit ca84b75b33c26be3e9cf2894f4c8b08e3a5cac73
Author: Jason Merrill ja...@redhat.com
Date:   Wed May 25 00:45:38 2011 -0400

	PR c++/45418
	* init.c (perform_member_init): Handle list-initialization
	of array of non-trivial class type.

diff --git a/gcc/cp/init.c b/gcc/cp/init.c
index 5f30275..6336dd7 100644
--- a/gcc/cp/init.c
+++ b/gcc/cp/init.c
@@ -549,6 +549,8 @@ perform_member_init (tree member, tree init)
 	{
 	  gcc_assert (TREE_CHAIN (init) == NULL_TREE);
 	  init = TREE_VALUE (init);
+	  if (BRACE_ENCLOSED_INITIALIZER_P (init))
+		init = digest_init (type, init, tf_warning_or_error);
 	}
 	  if (init == NULL_TREE
 	  || same_type_ignoring_top_level_qualifiers_p (type,
diff --git a/gcc/testsuite/g++.dg/cpp0x/initlist50.C b/gcc/testsuite/g++.dg/cpp0x/initlist50.C
new file mode 100644
index 000..ef4e72c
--- /dev/null
+++ b/gcc/testsuite/g++.dg/cpp0x/initlist50.C
@@ -0,0 +1,21 @@
+// PR c++/45418
+// { dg-options "-std=c++0x" }
+
+struct A1 { };
+struct A2 {
+  A2();
+};
+
+template <class T> struct B {
+  T ar[1];
+  B(T t):ar({t}) {}
+};
+
+int main(){
+  B<int> bi{1};
+  A1 a1;
+  B<A1> ba1{a1};
+  A2 a2;
+  A2 a2r[1]{{a2}};
+  B<A2> ba2{a2};
+}


C++ PATCH for c++/48935 (ICE with invalid enum scope)

2011-05-25 Thread Jason Merrill
Checking constructor_name_p doesn't work for an enum, and there's no 
reason to check it for non-classes anyway.  The change to 
cp_parser_invalid_type_name is to avoid saying that a scoped enum is a 
class; now it will print the actual tag used in defining the type.


Tested x86_64-pc-linux-gnu, applying to trunk.
commit bef993e717fdccbde6acd7bde7aed2770cc1a95f
Author: Jason Merrill ja...@redhat.com
Date:   Wed May 25 01:44:53 2011 -0400

	PR c++/48935
	* parser.c (cp_parser_constructor_declarator_p): Don't check
	constructor_name_p for enums.
	(cp_parser_diagnose_invalid_type_name): Correct error message.

diff --git a/gcc/cp/parser.c b/gcc/cp/parser.c
index 3493e44..db2cb96 100644
--- a/gcc/cp/parser.c
+++ b/gcc/cp/parser.c
@@ -2534,7 +2534,7 @@ cp_parser_diagnose_invalid_type_name (cp_parser *parser,
 		  "%qT is a dependent scope",
 		  parser->scope, id, parser->scope);
   else if (TYPE_P (parser->scope))
-	error_at (location, "%qE in class %qT does not name a type",
+	error_at (location, "%qE in %q#T does not name a type",
 		  id, parser->scope);
   else
 	gcc_unreachable ();
@@ -19589,7 +19589,7 @@ cp_parser_constructor_declarator_p (cp_parser *parser, bool friend_p)
   /* If we have a class scope, this is easy; DR 147 says that S::S always
  names the constructor, and no other qualified name could.  */
   if (constructor_p && nested_name_specifier
-      && TYPE_P (nested_name_specifier))
+      && CLASS_TYPE_P (nested_name_specifier))
 {
   tree id = cp_parser_unqualified_id (parser,
 	  /*template_keyword_p=*/false,
diff --git a/gcc/testsuite/g++.dg/cpp0x/enum16.C b/gcc/testsuite/g++.dg/cpp0x/enum16.C
new file mode 100644
index 000..ebb4868
--- /dev/null
+++ b/gcc/testsuite/g++.dg/cpp0x/enum16.C
@@ -0,0 +1,6 @@
+// PR c++/48935
+// { dg-options "-std=c++0x" }
+
+enum class ENUM { a };
+
+ENUM::Type func() { return ENUM::a; } // { dg-error "does not name a type" }
diff --git a/gcc/testsuite/g++.dg/parse/error15.C b/gcc/testsuite/g++.dg/parse/error15.C
index 2352193..607a1db 100644
--- a/gcc/testsuite/g++.dg/parse/error15.C
+++ b/gcc/testsuite/g++.dg/parse/error15.C
@@ -12,7 +12,7 @@ namespace N
 
 N::A f2;  // { dg-error "1:invalid use of template-name 'N::A' without an argument list" }
 N::INVALID f3; // { dg-error "1:'INVALID' in namespace 'N' does not name a type" }
-N::C::INVALID f4; // { dg-error "1:'INVALID' in class 'N::C' does not name a type" }
+N::C::INVALID f4; // { dg-error "1:'INVALID' in 'struct N::C' does not name a type" }
 N::K f6;  // { dg-error "1:'K' in namespace 'N' does not name a type" }
 typename N::A f7;
 // { dg-error "13:invalid use of template-name 'N::A' without an argument list" "13" { target *-*-* } 17 }
@@ -22,7 +22,7 @@ struct B
 {
   N::A f2; // { dg-error "3:invalid use of template-name 'N::A' without an argument list" }
   N::INVALID f3;  // { dg-error "3:'INVALID' in namespace 'N' does not name a type" }
-  N::C::INVALID f4;   // { dg-error "3:'INVALID' in class 'N::C' does not name a type" }
+  N::C::INVALID f4;   // { dg-error "3:'INVALID' in 'struct N::C' does not name a type" }
   N::K f6; // { dg-error "3:'K' in namespace 'N' does not name a type" }
   typename N::A f7;
 // { dg-error "15:invalid use of template-name 'N::A' without an argument list" "15" { target *-*-* } 27 }
@@ -33,7 +33,7 @@ struct C
 {
   N::A f2; // { dg-error "3:invalid use of template-name 'N::A' without an argument list" }
   N::INVALID f3;  // { dg-error "3:'INVALID' in namespace 'N' does not name a type" }
-  N::C::INVALID f4;   // { dg-error "3:'INVALID' in class 'N::C' does not name a type" }
+  N::C::INVALID f4;   // { dg-error "3:'INVALID' in 'struct N::C' does not name a type" }
   N::K f6; // { dg-error "3:'K' in namespace 'N' does not name a type" }
   typename N::A f7;   // { dg-error "15:invalid use of template-name 'N::A' without an argument list" }
 };


[v3] Use noexcept in thread and mutex

2011-05-25 Thread Paolo Carlini

Hi,

tested x86_64-linux, committed to mainline.

Thanks,
Paolo.


2011-05-25  Paolo Carlini  paolo.carl...@oracle.com

* include/std/thread: Use noexcept throughout per the FDIS.
* include/std/mutex: Likewise.
Index: include/std/thread
===
--- include/std/thread  (revision 174185)
+++ include/std/thread  (working copy)
@@ -72,7 +72,7 @@
   native_handle_type   _M_thread;
 
 public:
-  id() : _M_thread() { }
+  id() noexcept : _M_thread() { }
 
   explicit
   id(native_handle_type __id) : _M_thread(__id) { }
@@ -82,11 +82,11 @@
   friend class hash<thread::id>;
 
   friend bool
-  operator==(thread::id __x, thread::id __y)
+  operator==(thread::id __x, thread::id __y) noexcept
   { return __gthread_equal(__x._M_thread, __y._M_thread); }
 
   friend bool
-  operator<(thread::id __x, thread::id __y)
+  operator<(thread::id __x, thread::id __y) noexcept
   { return __x._M_thread < __y._M_thread; }
 
   template<class _CharT, class _Traits>
@@ -121,11 +121,11 @@
 id _M_id;
 
   public:
-thread() = default;
+thread() noexcept = default;
 thread(thread&) = delete;
 thread(const thread&) = delete;
 
-thread(thread&& __t)
+thread(thread&& __t) noexcept
 { swap(__t); }
 
 template<typename _Callable, typename... _Args>
@@ -145,7 +145,7 @@
 
 thread& operator=(const thread&) = delete;
 
-thread& operator=(thread&& __t)
+thread& operator=(thread&& __t) noexcept
 {
   if (joinable())
std::terminate();
@@ -154,11 +154,11 @@
 }
 
 void
-swap(thread& __t)
+swap(thread& __t) noexcept
 { std::swap(_M_id, __t._M_id); }
 
 bool
-joinable() const
+joinable() const noexcept
 { return !(_M_id == id()); }
 
 void
@@ -168,7 +168,7 @@
 detach();
 
 thread::id
-get_id() const
+get_id() const noexcept
 { return _M_id; }
 
 /** @pre thread is joinable
@@ -179,7 +179,7 @@
 
 // Returns a value that hints at the number of hardware thread contexts.
 static unsigned int
-hardware_concurrency()
+hardware_concurrency() noexcept
 { return 0; }
 
   private:
@@ -198,23 +198,23 @@
   inline thread::_Impl_base::~_Impl_base() = default;
 
   inline void
-  swap(thread& __x, thread& __y)
+  swap(thread& __x, thread& __y) noexcept
   { __x.swap(__y); }
 
   inline bool
-  operator!=(thread::id __x, thread::id __y)
+  operator!=(thread::id __x, thread::id __y) noexcept
   { return !(__x == __y); }
 
   inline bool
-  operator<=(thread::id __x, thread::id __y)
+  operator<=(thread::id __x, thread::id __y) noexcept
   { return !(__y < __x); }
 
   inline bool
-  operator>(thread::id __x, thread::id __y)
+  operator>(thread::id __x, thread::id __y) noexcept
   { return __y < __x; }
 
   inline bool
-  operator>=(thread::id __x, thread::id __y)
+  operator>=(thread::id __x, thread::id __y) noexcept
   { return !(__x < __y); }
 
   // DR 889.
@@ -250,12 +250,12 @@
 
 /// get_id
 inline thread::id
-get_id() { return thread::id(__gthread_self()); }
+get_id() noexcept { return thread::id(__gthread_self()); }
 
 #ifdef _GLIBCXX_USE_SCHED_YIELD
 /// yield
 inline void
-yield()
+yield() noexcept
 { __gthread_yield(); }
 #endif
 
Index: include/std/mutex
===
--- include/std/mutex   (revision 174185)
+++ include/std/mutex   (working copy)
@@ -1,6 +1,6 @@
 // mutex -*- C++ -*-
 
-// Copyright (C) 2003, 2004, 2005, 2006, 2007, 2008, 2009, 2010
+// Copyright (C) 2003, 2004, 2005, 2006, 2007, 2008, 2009, 2010, 2011
 // Free Software Foundation, Inc.
 //
 // This file is part of the GNU ISO C++ Library.  This library is free
@@ -70,9 +70,9 @@
 typedef __native_type* native_handle_type;
 
 #ifdef __GTHREAD_MUTEX_INIT
-constexpr mutex() : _M_mutex(__GTHREAD_MUTEX_INIT) { }
+constexpr mutex() noexcept : _M_mutex(__GTHREAD_MUTEX_INIT) { }
 #else
-mutex()
+mutex() noexcept
 {
   // XXX EAGAIN, ENOMEM, EPERM, EBUSY(may), EINVAL(may)
   __GTHREAD_MUTEX_INIT_FUNCTION(&_M_mutex);
@@ -95,7 +95,7 @@
 }
 
 bool
-try_lock()
+try_lock() noexcept
 {
   // XXX EINVAL, EAGAIN, EBUSY
   return !__gthread_mutex_trylock(&_M_mutex);
@@ -188,7 +188,7 @@
 }
 
 bool
-try_lock()
+try_lock() noexcept
 {
   // XXX EINVAL, EAGAIN, EBUSY
   return !__gthread_recursive_mutex_trylock(&_M_mutex);
@@ -247,7 +247,7 @@
 }
 
 bool
-try_lock()
+try_lock() noexcept
 {
   // XXX EINVAL, EAGAIN, EBUSY
   return !__gthread_mutex_trylock(&_M_mutex);
@@ -354,7 +354,7 @@
 }
 
 bool
-try_lock()
+try_lock() noexcept
 {
   // XXX EINVAL, EAGAIN, EBUSY
   return !__gthread_recursive_mutex_trylock(&_M_mutex);
@@ -464,7 +464,7 @@
 public:
   typedef 

Re: Cgraph thunk reorg

2011-05-25 Thread David Edelsohn
Honza,

After we debugged this offline, I assume that you applied a version of
the patch to trunk?

Thanks, David

On Fri, May 13, 2011 at 3:14 PM, David Edelsohn dje@gmail.com wrote:
 Honza,

 Testing is not complete, but testcases that failed with DECL_ONE_ONLY
 error now are passing with the later version of the patch you sent.

 - David

 On Fri, May 13, 2011 at 10:49 AM, Jan Hubicka hubi...@ucw.cz wrote:
 Hi,
 please also try this patch

 Index: ipa.c
 ===
 --- ipa.c       (revision 173723)
 +++ ipa.c       (working copy)
 @@ -886,6 +886,9 @@ function_and_variable_visibility (bool w
           while (decl_node->thunk.thunk_p)
             decl_node = decl_node->callees->callee;

 +          DECL_COMDAT_GROUP (node->decl) = DECL_COMDAT_GROUP (decl_node->decl);
 +         DECL_COMDAT (node->decl) = DECL_COMDAT (decl_node->decl);
 +
           /* Thunks have the same visibility as function they are attached to.
              For some reason C++ frontend don't seem to care. I.e. in
              g++.dg/torture/pr41257-2.C the thunk is not comdat while function
 @@ -893,10 +896,8 @@ function_and_variable_visibility (bool w

              We also need to arrange the thunk into the same comdat group as
              the function it reffers to.  */
 -         if (DECL_COMDAT (decl_node->decl))
 +         if (DECL_ONE_ONLY (decl_node->decl))
             {
 -             DECL_COMDAT (node->decl) = 1;
 -             DECL_COMDAT_GROUP (node->decl) = DECL_COMDAT_GROUP (decl_node->decl);
               if (!node->same_comdat_group)
                 {
                   node->same_comdat_group = decl_node;




Re: Fix PR 49014

2011-05-25 Thread Bernd Schmidt
On 05/25/2011 08:21 AM, Andrey Belevantsev wrote:
 Vlad, Bernd, I wonder if we can avoid having recog_memoized >= 0 insns
 that do not have proper DFA reservations (that is, they do not change
 the DFA state).  I see that existing practice allows this as shown by
 Bernd's patch to 48403, i.e. such insns do not count against
 issue_rate.  I would be happy to fix sel-sched in the same way. 
 However, both sel-sched ICEs as shown by PRs 48143 and 49014 really
 uncover the latent bugs in the backend.  So, is it possible to stop
 having such insns if scheduling is desired, or otherwise distinguish the
 insns that wrongly miss the proper DFA reservation?

Add a bool target podhook, targetm.sched.all_insns_have_reservations,
and add an assert in the scheduler if it is true.

I'm not sure what a good default value would be. Defining it to true
would almost certainly break a few ports initially (even assuming we
override it in sh where it's known not to be true), but I guess such
an assertion failure would be useful information for the target maintainers.

Or, if we want to enable extra checking on ports where not all insns
have a reservation, a new insn attribute (has_reservation) could be
defined, defined to evaluate to true by default in genattrtab, and
(set_attr "has_reservation" "0") added in the machine descriptions where
necessary.
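
A rough standalone sketch of the first idea, to show the intended shape of the check. Nothing here is existing GCC code: the structures, field names and the function are hypothetical, and a real implementation would query the DFA rather than a flag stored in the insn.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical view of an insn as the scheduler sees it.  */
struct insn_sketch
{
  int code;              /* >= 0 if the insn was recognized */
  bool has_reservation;  /* true if the insn changes the DFA state */
};

/* Hypothetical target hook table with the proposed boolean.  */
struct sched_hooks_sketch
{
  bool all_insns_have_reservations;
};

/* Decide whether INSN counts against issue_rate.  On targets that
   promise complete reservations, a recognized insn without one is
   treated as a backend bug and trips the assertion.  */
bool
insn_counts_against_issue_rate (const struct sched_hooks_sketch *hooks,
                                const struct insn_sketch *insn)
{
  if (insn->code < 0)
    return false;
  if (hooks->all_insns_have_reservations)
    assert (insn->has_reservation);
  return insn->has_reservation;
}
```

With the flag off, an insn without a reservation is simply not counted, which matches the existing practice described above.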


Bernd


Re: PATCH: PR target/49142: Invalid 8bit register operand

2011-05-25 Thread H.J. Lu
On Wed, May 25, 2011 at 7:00 AM, Uros Bizjak ubiz...@gmail.com wrote:
 On Tue, May 24, 2011 at 5:54 PM, H.J. Lu hongjiu...@intel.com wrote:
 Hi,

 We are working on a new optimization, which turns off TARGET_MOVX.
 GCC generates:

 movb %ah, %dil

 But %ah can only be used with %[abcd][hl].  This patch adds QIreg_operand
 and uses it in *movqi_extv_1_rex64/*movqi_extzv_2_rex64.  OK for trunk
 if there is no regression?

 If this is the case, then please change q_regs_operand predicate to
 accept just QI_REG_P registers.


I thought about it.  It is a problem only with %[abcd]h.  I am not sure if
changing q_regs_operand to  accept just QI_REG_P registers will negatively
impact

(define_peephole2
  [(set (reg FLAGS_REG) (match_operand 0 "" ""))
   (set (match_operand:QI 1 "register_operand" "")
        (match_operator:QI 2 "ix86_comparison_operator"
          [(reg FLAGS_REG) (const_int 0)]))
   (set (match_operand 3 "q_regs_operand" "")
        (zero_extend (match_dup 1)))]
  "(peep2_reg_dead_p (3, operands[1])
    || operands_match_p (operands[1], operands[3]))
   && ! reg_overlap_mentioned_p (operands[3], operands[0])"
  [(set (match_dup 4) (match_dup 0))
   (set (strict_low_part (match_dup 5))
        (match_dup 2))]

(define_peephole2
  [(set (reg FLAGS_REG) (match_operand 0 "" ""))
   (set (match_operand:QI 1 "register_operand" "")
        (match_operator:QI 2 "ix86_comparison_operator"
          [(reg FLAGS_REG) (const_int 0)]))
   (parallel [(set (match_operand 3 "q_regs_operand" "")
                   (zero_extend (match_dup 1)))
              (clobber (reg:CC FLAGS_REG))])]
  "(peep2_reg_dead_p (3, operands[1])
    || operands_match_p (operands[1], operands[3]))
   && ! reg_overlap_mentioned_p (operands[3], operands[0])"
  [(set (match_dup 4) (match_dup 0))
   (set (strict_low_part (match_dup 5))
        (match_dup 2))]


-- 
H.J.


Re: [RFA] [PR44618] [PowerPC] Wrong code for -frename-registers

2011-05-25 Thread David Edelsohn
On Mon, May 23, 2011 at 5:53 PM, edmar ed...@freescale.com wrote:
 I completed re-testing everything.
 It turns out I cannot reproduce the original error on gcc-4.4 (rev 173968)
 So, I am submitting only the patch that I tested for gcc-4.5/4.6/4.7

 Regression tested for e500mc target on:
 4.5: Revision: 173928
 4.6: Revision: 173936
 trunk: Revision: 173966

 The patch gcc.fix_rnreg4 applies directly to 4.6, 4.7 (1 line offset),
 and 4.5 (-632 lines offset)

Are you re-asking for approval?

The patch is okay.

Thanks, David

P.S. Please include the ChangeLog entry inline in the email message
and attach the patch to the email if it is large.  No tar files.


Re: PATCH: Add pause intrinsic

2011-05-25 Thread H.J. Lu
On Wed, May 25, 2011 at 7:36 AM, Andrew Haley a...@redhat.com wrote:
 On 05/25/2011 01:34 PM, H.J. Lu wrote:
 On Wed, May 25, 2011 at 3:26 AM, Andrew Haley a...@redhat.com wrote:
 On 05/24/2011 07:28 PM, H.J. Lu wrote:

 This patch implements pause intrinsic suggested by Andi.  OK
 for trunk?

 What does full memory barrier here mean?

 +@table @code
 +@item void __builtin_ia32_pause (void)
 +Generates the @code{pause} machine instruction with full memory barrier.
 +@end table

 There's a memory clobber, but no barrier instruction AFAICS.  The
 doc needs to explain it a bit better.

 There are read/load memory barrier, write/store memory barrier and 
 full/general
 memory barrier.  You can find them at

 http://www.kernel.org/doc/Documentation/memory-barriers.txt

 Should I include a pointer to it?

 No.  I know perfectly well what memory barriers are.  I'm not asking
 what full memory barrier means.

 What barrier instruction(s) does __builtin_ia32_pause() generate?
 All I see in the patch is rep; nop.  Is that really a full memory
 barrier?


It is a full memory barrier in the sense that the compiler won't move
loads/stores across it.  It is intended for the kernel.
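
To illustrate the distinction being discussed, here is a minimal sketch assuming a GCC-compatible compiler. The names cpu_relax and spin_wait are made up for the example and are not part of the patch. The "memory" clobber only stops the compiler from caching or reordering memory accesses across the statement; no fence instruction is emitted, so nothing here orders accesses at the hardware level.

```c
#include <assert.h>

/* Spin-wait hint plus compiler-level barrier.  On x86 this emits
   "rep; nop" (the pause encoding); elsewhere it emits nothing and
   acts purely as a compiler barrier.  */
static inline void
cpu_relax (void)
{
#if defined(__i386__) || defined(__x86_64__)
  __asm__ __volatile__ ("rep; nop" : : : "memory");
#else
  __asm__ __volatile__ ("" : : : "memory");
#endif
}

/* Poll *FLAG until it becomes nonzero or MAX_SPINS iterations have
   passed; return the number of iterations spent waiting.  */
unsigned
spin_wait (const int *flag, unsigned max_spins)
{
  unsigned spins = 0;

  assert (flag != 0);
  while (*flag == 0 && spins < max_spins)
    {
      cpu_relax ();  /* the clobber forces *flag to be reloaded */
      ++spins;
    }
  return spins;
}
```

Without the clobber, the compiler would be free to load *flag once and hoist the test out of the loop.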

-- 
H.J.


[Patch ARM] Actually generate vorn and vbic instructions.

2011-05-25 Thread Ramana Radhakrishnan

Hi,

A co-worker pointed out that we weren't generating vorn and vbic 
instructions for Neon and I had a look.


Tests are still running and will commit to trunk if there are no 
regressions.


cheers
Ramana

2011-05-25  Ramana Radhakrishnan  ramana.radhakrish...@linaro.org

* config/arm/neon.md (orn<mode>3_neon): Canonicalize not.
(orndi3_neon): Likewise.
(bic<mode>3_neon): Likewise.

2011-05-25  Ramana Radhakrishnan  ramana.radhakrish...@linaro.org

* gcc.target/arm/neon-vorn-vbic.c: New file.
Index: gcc/config/arm/neon.md
===
--- gcc/config/arm/neon.md  (revision 174174)
+++ gcc/config/arm/neon.md  (working copy)
@@ -794,8 +794,8 @@
 
 (define_insn "orn<mode>3_neon"
   [(set (match_operand:VDQ 0 "s_register_operand" "=w")
-   (ior:VDQ (match_operand:VDQ 1 "s_register_operand" "w")
-            (not:VDQ (match_operand:VDQ 2 "s_register_operand" "w"))))]
+   (ior:VDQ (not:VDQ (match_operand:VDQ 2 "s_register_operand" "w"))
+            (match_operand:VDQ 1 "s_register_operand" "w")))]
   "TARGET_NEON"
   "vorn\t%<V_reg>0, %<V_reg>1, %<V_reg>2"
   [(set_attr "neon_type" "neon_int_1")]
@@ -803,8 +803,8 @@
 
 (define_insn "orndi3_neon"
   [(set (match_operand:DI 0 "s_register_operand" "=w,?=r,?r")
-   (ior:DI (match_operand:DI 1 "s_register_operand" "w,r,0")
-           (not:DI (match_operand:DI 2 "s_register_operand" "w,0,r"))))]
+   (ior:DI (not:DI (match_operand:DI 2 "s_register_operand" "w,0,r"))
+           (match_operand:DI 1 "s_register_operand" "w,r,0")))]
   "TARGET_NEON"
   "@
    vorn\t%P0, %P1, %P2
@@ -816,8 +816,8 @@
 
 (define_insn "bic<mode>3_neon"
   [(set (match_operand:VDQ 0 "s_register_operand" "=w")
-   (and:VDQ (match_operand:VDQ 1 "s_register_operand" "w")
-            (not:VDQ (match_operand:VDQ 2 "s_register_operand" "w"))))]
+   (and:VDQ (not:VDQ (match_operand:VDQ 2 "s_register_operand" "w"))
+            (match_operand:VDQ 1 "s_register_operand" "w")))]
   "TARGET_NEON"
   "vbic\t%<V_reg>0, %<V_reg>1, %<V_reg>2"
   [(set_attr "neon_type" "neon_int_1")]
--- /dev/null   2011-05-18 14:49:12.916256701 +0100
+++ ./gcc/testsuite/gcc.target/arm/neon-vorn-vbic.c 2011-05-25 11:17:09.966726432 +0100
@@ -0,0 +1,20 @@
+/* { dg-do compile } */
+/* { dg-require-effective-target arm_neon_ok } */
+/* { dg-options "-O2 -ftree-vectorize" } */
+/* { dg-add-options arm_neon } */
+
+void bor (int *__restrict__ c, int *__restrict__ a, int *__restrict__ b)
+{
+  int i;
+  for (i=0;i<9;i++)
+    c[i] = b[i] | (~a[i]);
+}
+void bic (int *__restrict__ c, int *__restrict__ a, int *__restrict__ b)
+{
+  int i;
+  for (i=0;i<9;i++)
+    c[i] = b[i] & (~a[i]);
+}
+
+/* { dg-final { scan-assembler "vorn\\t" } } */
+/* { dg-final { scan-assembler "vbic\\t" } } */
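
For reference, here is a minimal scalar sketch of what each vector lane is expected to compute in the two loops of the new test, assuming the usual reading of vorn as OR-NOT and vbic as bit clear.  The helper names orn32 and bic32 are made up for illustration and are not part of the patch.

```c
#include <stdint.h>

/* OR-NOT: the per-lane result expected from vorn, b | ~a.  */
static uint32_t orn32 (uint32_t b, uint32_t a)
{
  return b | ~a;
}

/* Bit clear: the per-lane result expected from vbic, b & ~a.  */
static uint32_t bic32 (uint32_t b, uint32_t a)
{
  return b & ~a;
}
```

In both cases the complemented operand is a, which corresponds to operand 2 in the neon.md patterns.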


Re: PATCH: Add pause intrinsic

2011-05-25 Thread Richard Guenther
On Wed, May 25, 2011 at 4:47 PM, H.J. Lu hjl.to...@gmail.com wrote:
 On Wed, May 25, 2011 at 7:36 AM, Andrew Haley a...@redhat.com wrote:
 On 05/25/2011 01:34 PM, H.J. Lu wrote:
 On Wed, May 25, 2011 at 3:26 AM, Andrew Haley a...@redhat.com wrote:
 On 05/24/2011 07:28 PM, H.J. Lu wrote:

 This patch implements pause intrinsic suggested by Andi.  OK
 for trunk?

 What does full memory barrier here mean?

 +@table @code
 +@item void __builtin_ia32_pause (void)
 +Generates the @code{pause} machine instruction with full memory barrier.
 +@end table

 There a memory clobber, but no barrier instruction AFAICS.  The
 doc needs to explain it a bit better.

 There are read/load memory barrier, write/store memory barrier and 
 full/general
 memory barrier.  You can find them at

 http://www.kernel.org/doc/Documentation/memory-barriers.txt

 Should I include a pointer to it?

 No.  I know perfectly well what memory barriers are.  I'm not asking
 what full memory barrier means.

 What barrier instruction(s) does __builtin_ia32_pause() generate?
 All I see in the patch is rep; nop.  Is that really a full memory
 barrier?


 It is a full memory barrier in the sense that compiler won't move
 load/store across it.  It is intended for kernel.

There is no such thing if you include accesses to automatic variables.

Richard.

 --
 H.J.



Re: PATCH: Add pause intrinsic

2011-05-25 Thread Andrew Haley
On 05/25/2011 03:47 PM, H.J. Lu wrote:
 On Wed, May 25, 2011 at 7:36 AM, Andrew Haley a...@redhat.com wrote:
 On 05/25/2011 01:34 PM, H.J. Lu wrote:
 On Wed, May 25, 2011 at 3:26 AM, Andrew Haley a...@redhat.com wrote:
 On 05/24/2011 07:28 PM, H.J. Lu wrote:

 This patch implements pause intrinsic suggested by Andi.  OK
 for trunk?

 What does full memory barrier here mean?

 +@table @code
 +@item void __builtin_ia32_pause (void)
 +Generates the @code{pause} machine instruction with full memory barrier.
 +@end table

 There a memory clobber, but no barrier instruction AFAICS.  The
 doc needs to explain it a bit better.

 There are read/load memory barrier, write/store memory barrier and 
 full/general
 memory barrier.  You can find them at

 http://www.kernel.org/doc/Documentation/memory-barriers.txt

 Should I include a pointer to it?

 No.  I know perfectly well what memory barriers are.  I'm not asking
 what full memory barrier means.

 What barrier instruction(s) does __builtin_ia32_pause() generate?
 All I see in the patch is rep; nop.  Is that really a full memory
 barrier?
 
 It is a full memory barrier in the sense that compiler won't move
 load/store across it.  It is intended for kernel.

Right, so it is, in fact, not a full memory barrier.  I thought not.
It's no more a full memory barrier than a simple asm volatile().
The doc needs to explain that a bit better.

Andrew.


Re: PATCH: Add pause intrinsic

2011-05-25 Thread Richard Guenther
On Wed, May 25, 2011 at 4:54 PM, Andrew Haley a...@redhat.com wrote:
 On 05/25/2011 03:47 PM, H.J. Lu wrote:
 On Wed, May 25, 2011 at 7:36 AM, Andrew Haley a...@redhat.com wrote:
 On 05/25/2011 01:34 PM, H.J. Lu wrote:
 On Wed, May 25, 2011 at 3:26 AM, Andrew Haley a...@redhat.com wrote:
 On 05/24/2011 07:28 PM, H.J. Lu wrote:

 This patch implements pause intrinsic suggested by Andi.  OK
 for trunk?

 What does full memory barrier here mean?

 +@table @code
 +@item void __builtin_ia32_pause (void)
 +Generates the @code{pause} machine instruction with full memory barrier.
 +@end table

 There a memory clobber, but no barrier instruction AFAICS.  The
 doc needs to explain it a bit better.

 There are read/load memory barrier, write/store memory barrier and 
 full/general
 memory barrier.  You can find them at

 http://www.kernel.org/doc/Documentation/memory-barriers.txt

 Should I include a pointer to it?

 No.  I know perfectly well what memory barriers are.  I'm not asking
 what full memory barrier means.

 What barrier instruction(s) does __builtin_ia32_pause() generate?
 All I see in the patch is rep; nop.  Is that really a full memory
 barrier?

 It is a full memory barrier in the sense that compiler won't move
 load/store across it.  It is intended for kernel.

 Right, so it is, in fact, not a full memory barrier.  I thought not.
 It's no more a full memory barrier than a simple asm volatile().
 The doc needs to explain that a bit better.

asm volatile ("" : : : "memory") in fact will work as a full memory barrier
because we are very very lazy in disambiguating against asms
(but that should change, at least a tiny bit).  Function calls otoh are
pretty well optimized.

Richard.

 Andrew.



Re: PATCH: Add pause intrinsic

2011-05-25 Thread Andrew Haley
On 05/25/2011 03:57 PM, Richard Guenther wrote:
 
 asm volatile ("" : : : "memory") in fact will work as a full memory barrier

How?  You surely need MFENCE or somesuch, unless all you
care about is a compiler barrier.  That's what I think needs
to be clarified.

Andrew.


Re: PATCH: Add pause intrinsic

2011-05-25 Thread Richard Guenther
On Wed, May 25, 2011 at 5:09 PM, Andrew Haley a...@redhat.com wrote:
 On 05/25/2011 03:57 PM, Richard Guenther wrote:

 asm volatile ("" : : : "memory") in fact will work as a full memory barrier

 How?  You surely need MFENCE or somesuch, unless all you
 care about is a compiler barrier.  That's what I think needs
 to be clarified.

Well, yes, I'm talking about the compiler memory barrier.

Richard.

 Andrew.



Re: [testsuite] ignore irrelevant warning in two ARM tests

2011-05-25 Thread Janis Johnson
On 05/24/2011 05:49 PM, Mike Stump wrote:
 On May 24, 2011, at 3:42 PM, Janis Johnson wrote:
 Is this one OK for trunk and 4.6?  The failure occurs for arm-none-eabi
 and for arm-none-linux-gnueabi.

 You should repeat all the original options from the main dg-options line, 
 with -Wno-abi added, in the ARM EABI dg-options line, since only one 
 dg-options line will be in effect.

 Oops, yet again.  I'll do that.
 
 Ok with that change.  Also, if there are many of these exceptions, it might 
 be better to add the flags to shut it up to the base set of flags, and then 
 to add it explicitly to any testcase that really does want to test the 
 warning.

These are the only tests I've found that get this message.

Janis



Re: PATCH: Add pause intrinsic

2011-05-25 Thread Michael Matz
Hi,

On Wed, 25 May 2011, Richard Guenther wrote:

  asm volatile ("" : : : "memory") in fact will work as a full memory
  barrier
 
  How?  You surely need MFENCE or somesuch, unless all you care about is 
  a compiler barrier.  That's what I think needs to be clarified.
 
 Well, yes, I'm talking about the compiler memory barrier.

Something that we conventionally call optimization barrier :)  memory 
barrier has a fixed meaning which we shouldn't use in this case, it's 
confusing.


Ciao,
Michael.
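
To make the distinction discussed in this thread concrete, here is a small GNU C sketch.  It assumes GCC (or a compatible compiler); the helper names compiler_barrier, hardware_barrier and publish are hypothetical and only serve the illustration.  The first helper only stops the compiler from moving loads/stores across it and emits no instruction; the second uses the __sync_synchronize builtin, which also asks for a hardware fence.

```c
/* Optimization (compiler) barrier: no machine instruction is emitted,
   but the compiler may not move memory accesses across this point.  */
static inline void compiler_barrier (void)
{
  __asm__ __volatile__ ("" : : : "memory");
}

/* Full barrier in the hardware sense as well, via the GCC builtin
   (on x86 this typically ends up as MFENCE or a locked instruction).  */
static inline void hardware_barrier (void)
{
  __sync_synchronize ();
}

/* Hypothetical use: store the payload, then set the flag.  */
static int publish (int *data, int *flag, int value)
{
  *data = value;
  compiler_barrier ();   /* keeps the two stores in program order
                            as far as the compiler is concerned */
  *flag = 1;
  hardware_barrier ();   /* additionally orders them for other CPUs */
  return *data + *flag;
}
```

Whether the compiler-only variant is enough depends on the target's memory model and on what the other side of the communication does; the sketch only shows which of the two guarantees each construct provides.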

Re: PATCH: Add pause intrinsic

2011-05-25 Thread Richard Guenther
On Wed, May 25, 2011 at 5:20 PM, Michael Matz m...@suse.de wrote:
 Hi,

 On Wed, 25 May 2011, Richard Guenther wrote:

  asm volatile ("" : : : "memory") in fact will work as a full memory
  barrier
 
  How?  You surely need MFENCE or somesuch, unless all you care about is
  a compiler barrier.  That's what I think needs to be clarified.

 Well, yes, I'm talking about the compiler memory barrier.

 Something that we conventionally call optimization barrier :)  memory
 barrier has a fixed meaning which we shouldn't use in this case, it's
 confusing.

Sure ;)

And to keep the info in a suitable thread: what I'd like to improve here
is to make us disambiguate memory loads/stores against asms that
have no memory outputs/inputs.

Richard.


Re: PATCH: PR target/49142: Invalid 8bit register operand

2011-05-25 Thread Uros Bizjak
On Wed, May 25, 2011 at 4:42 PM, H.J. Lu hjl.to...@gmail.com wrote:
 On Wed, May 25, 2011 at 7:00 AM, Uros Bizjak ubiz...@gmail.com wrote:
 On Tue, May 24, 2011 at 5:54 PM, H.J. Lu hongjiu...@intel.com wrote:
 Hi,

 We are working on a new optimization, which turns off TARGET_MOVX.
 GCC generates:

 movb %ah, %dil

 But %ah can only be used with %[abcd][hl].  This patch adds QIreg_operand
 and uses it in *movqi_extv_1_rex64/*movqi_extzv_2_rex64.  OK for trunk
 if there is no regression? and Replace
   q_regs_operand with QIreg_operand.
   (

 If this is the case, then please change q_regs_operand predicate to
 accept just QI_REG_P registers.


 I thought about it.  It is a problem only with %[abcd]h.  I am not sure if
 changing q_regs_operand to  accept just QI_REG_P registers will negatively
 impact

I see. The patch is OK then, but for consistency, please change the
predicate of *movqi_extv_1 and *movqi_extzv_2 as well. Oh, and the
register_operand check in type calculation can be removed.

Thanks,
Uros.


Re: [SPARC] Disable -fira-share-save-slots by default

2011-05-25 Thread Jeff Law
-BEGIN PGP SIGNED MESSAGE-
Hash: SHA1

On 05/24/11 16:30, Eric Botcazou wrote:
 The new save slot sharing algorithm has a documented limitation:
 
Future work:
 
  In the fallback case we should iterate backwards across all possible
  modes for the save, choosing the largest available one instead of
  falling back to the smallest mode immediately.  (eg TF -> DF -> SF).
That's not new -- I wrote that circa 1992/1993.  The whole point behind
the changes from ~1992 was to try and use DFmode insns when
caller-saving FP regs on the sparc.  However, I didn't think it was
worth the effort to deal with that case (TF -> DF -> SF).  The code
tries to build a save/restore insn of MOVE_MAX_WORDS size and if that
fails, then it drops back to WORD_SIZE IIRC.






 that is annoying for the SPARC when it comes to floating-point code because
 the floating-point registers are single (SF) but there is a fully-fledged
 support for double (DF) arithmetics in the architecture.  So saving registers
 on an individual basis really pessimizes here.  For example, the size of the
 object generated for the Ada unit a-nlcefu.ads at -O2 decreases from 96080 to
 95088 bytes when you pass -fno-ira-share-save-slots.
It's the new slot sharing code that doesn't have support for saving
larger hunks.  Having written the original code to handle larger saves
specifically to help sparc, I can certainly understand why the new code
is causing you grief :-)



 
 Experiments have shown that the impact on integer code is null in terms of
 code size and negligible in terms of stack usage (-fstack-usage reports 8/16
 bytes increase for most functions).  Therefore this patch disables the option
 by default for the SPARC.  Bootstrapped/regtested on SPARC/Solaris, applied on
 the mainline and 4.6 branch.
 
 
 Jeff, I'd like to apply it to the 4.5 branch as well, but I need your patch:
 
 2011-01-21  Jeff Law  l...@redhat.com
 
   PR rtl-optimization/41619
   * caller-save.c (setup_save_areas): Break out code to determine
   which hard regs are live across calls by examining the reload chains
   so that it is always used.
   Eliminate code which checked REG_N_CALLS_CROSSED.
 
 Do you have any objections to me backporting it to the branch?
No objections at all.  I don't believe there were any follow-up patches
and all the change did was make more of the paths through caller-save
consistent in how they determined what needed to be saved.

jeff
-BEGIN PGP SIGNATURE-
Version: GnuPG v1.4.11 (GNU/Linux)
Comment: Using GnuPG with Fedora - http://enigmail.mozdev.org/

iQEcBAEBAgAGBQJN3SihAAoJEBRtltQi2kC7bnsH/10X0fDwfHtt8b16js5nZHZ8
n+f6TPlAuAu1vJ5h4YI7afybMMfBfHAbvTLwtD+f37boreTQU1wizVH4JLC4GgMS
KP9vB48mK/wHli0Hze37QAxVcQt8CPCr3d1fJtpVp6CNUp1gzLWkqT2GjUmxTxfX
M3qQ7wRot0cfvVDx8upOj3Yr9tih/c/vIm5ez49s8fzha2acSpEB0vFFj3gcx3EO
C3Mgu6z1ZVskIP5KOUIV/2EhtXHMoC4dxsodurfvtGafK5gmbaqVSipzZlKj4BSg
Oc4XPKAy07/cSxQhx94pYFB8+Jr7TC99Yubgq2v2gJitf+99AW9MCWmnEvLfX2A=
=w6DV
-END PGP SIGNATURE-


Re: RFA PR 48770

2011-05-25 Thread Jeff Law
-BEGIN PGP SIGNED MESSAGE-
Hash: SHA1

On 05/25/11 06:40, Bernd Schmidt wrote:


 The code in question is literally 20 years old and predates running any
 real dead code elimination after reload.  ISTM the right thing to do is
 stop using delete_dead_insn in this code and let the post-reload DCE
 pass do its job.  That allows us to continue to record the block local
 equivalence.
 
 Sounds like the right thing to do. OK. (Can we eliminate the other caller?)
I didn't look too hard at the other call; looking at it now, I think we
can probably safely remove it and just delete the single insn which sets
the eliminable register.

I can either add that to the existing patch or submit it as a follow-up.
 I've got no strong preference on this issue.

 
 I've looked at code generation; it appears unchanged on i686-linux,
 which I think is the expected result. There are minor differences in
 assembly output on mips64-linux. If you want to look at it, I'm
 attaching a testcase - compile with -O2 -fno-reorder-blocks.
I'm a little surprised to hear there is a codegen difference, though I
can envision a variety of ways that could happen.  The undeleted insns
might interfere with the post-reload optimizers which run before dce for
example.  I'll take a quick look.

jeff
-BEGIN PGP SIGNATURE-
Version: GnuPG v1.4.11 (GNU/Linux)
Comment: Using GnuPG with Fedora - http://enigmail.mozdev.org/

iQEcBAEBAgAGBQJN3StDAAoJEBRtltQi2kC7kGAH/R7DzEkkQdaNj6xQjTXtqKs5
hv9mngz5lEovhaZvpdmRw8pc4mBcis1P4s9jgD3boj1aX3R8PQu+WsL6br5DzduA
b+TtRDyVPazOSrc1mMLiCZr81rbSQfEzCWBWK1ZHLPA2oQNw8v211HtPoTxg1qsq
kXyArAnd/bQBip9AJHEh1J3yOyFkV5eNDODZPIl8hvGhIyRlJz+R72v3eRwT+oCA
65mU1Zfqykul+BKtJG1uj13gtTsroxHjZYI/iCmVMYriDFWIyj7qLgNtNOxx9yTQ
sFQbJqJX9cdXIgcAJoijzpT+bLubSeGUaWgjZgqG/AwU5vEXkOp8etBGeUZNg2Q=
=bSzC
-END PGP SIGNATURE-


Re: New options to disable/enable any pass for any functions (issue4550056)

2011-05-25 Thread Xinliang David Li
Ping. The link to the message:

http://gcc.gnu.org/ml/gcc-patches/2011-05/msg01303.html

Thanks,

David

On Sun, May 22, 2011 at 4:17 PM, Xinliang David Li davi...@google.com wrote:
 Ping.

 David


 On Fri, May 20, 2011 at 9:06 AM, Xinliang David Li davi...@google.com wrote:
 Ok to check in this one?

 Thanks,

 David

 On Wed, May 18, 2011 at 12:30 PM, Joseph S. Myers
 jos...@codesourcery.com wrote:
 On Wed, 18 May 2011, David Li wrote:

 +      error (Unrecognized option %s, is_enable ? -fenable : 
 -fdisable);

 +      error (Unknown pass %s specified in %s,
 +          phase_name,
 +          is_enable ? -fenable : -fdisable);

 Follow GNU Coding Standards for diagnostics (start with lowercase letter).

 +      inform (UNKNOWN_LOCATION, %s pass %s for functions in the range of 
 [%u, %u]\n,
 +              is_enable? Enable:Disable, phase_name, 
 new_range-start, new_range-last);

 Use separate calls to inform for the enable and disable cases, so that
 full sentences can be extracted for translation.

 +           error (Invalid range %s in option %s,
 +                  one_range,
 +                  is_enable ? -fenable : -fdisable);

 GNU Coding Standards.

 +               error (Invalid range %s in option %s,

 Likewise.

 +          inform (UNKNOWN_LOCATION, %s pass %s for functions in the 
 range of [%u, %u]\n,
 +                  is_enable? Enable:Disable, phase_name, 
 new_range-start, new_range-last);

 Again needs GCS and i18n fixes.

 --
 Joseph S. Myers
 jos...@codesourcery.com





Re: PATCH: PR target/49142: Invalid 8bit register operand

2011-05-25 Thread H.J. Lu
On Wed, May 25, 2011 at 8:30 AM, Uros Bizjak ubiz...@gmail.com wrote:
 On Wed, May 25, 2011 at 4:42 PM, H.J. Lu hjl.to...@gmail.com wrote:
 On Wed, May 25, 2011 at 7:00 AM, Uros Bizjak ubiz...@gmail.com wrote:
 On Tue, May 24, 2011 at 5:54 PM, H.J. Lu hongjiu...@intel.com wrote:
 Hi,

 We are working on a new optimization, which turns off TARGET_MOVX.
 GCC generates:

 movb %ah, %dil

 But %ah can only be used with %[abcd][hl].  This patch adds QIreg_operand
 and uses it in *movqi_extv_1_rex64/*movqi_extzv_2_rex64.  OK for trunk
 if there is no regression? and Replace
       q_regs_operand with QIreg_operand.
       (

 If this is the case, then please change q_regs_operand predicate to
 accept just QI_REG_P registers.


 I thought about it.  It is a problem only with %[abcd]h.  I am not sure if
 changing q_regs_operand to  accept just QI_REG_P registers will negatively
 impact

 I see. The patch is OK then, but for consistency, please change the
 predicate of *movqi_extv_1*movqi_extzv_2 as well. Oh, and the
 register_operand check in type calculation can be removed.

 Thanks,
 Uros.


This is what I checked in.

Thanks.

-- 
H.J.
---
2011-05-25  H.J. Lu  hongjiu...@intel.com

PR target/49142
* config/i386/i386.md (*movqi_extv_1_rex64): Remove
register_operand check and replace q_regs_operand with
QIreg_operand in type calculation.
(*movqi_extv_1): Likewise.
(*movqi_extzv_2_rex64): Likewise.
(*movqi_extzv_2): Likewise.

* config/i386/predicates.md (QIreg_operand): New.

diff --git a/gcc/config/i386/i386.md b/gcc/config/i386/i386.md
index 49f1ee7..3b59024 100644
--- a/gcc/config/i386/i386.md
+++ b/gcc/config/i386/i386.md
@@ -2487,10 +2487,9 @@
 }
 }
   [(set (attr type)
- (if_then_else (and (match_operand:QI 0 register_operand )
-   (ior (not (match_operand:QI 0 q_regs_operand ))
-(ne (symbol_ref TARGET_MOVX)
-(const_int 0
+ (if_then_else (ior (not (match_operand:QI 0 QIreg_operand ))
+   (ne (symbol_ref TARGET_MOVX)
+   (const_int 0)))
(const_string imovx)
(const_string imov)))
(set (attr mode)
@@ -2514,10 +2513,9 @@
 }
 }
   [(set (attr type)
- (if_then_else (and (match_operand:QI 0 register_operand )
-   (ior (not (match_operand:QI 0 q_regs_operand ))
-(ne (symbol_ref TARGET_MOVX)
-(const_int 0
+ (if_then_else (ior (not (match_operand:QI 0 QIreg_operand ))
+   (ne (symbol_ref TARGET_MOVX)
+   (const_int 0)))
(const_string imovx)
(const_string imov)))
(set (attr mode)
@@ -2552,7 +2550,7 @@
 }
 }
   [(set (attr type)
- (if_then_else (ior (not (match_operand:QI 0 q_regs_operand ))
+ (if_then_else (ior (not (match_operand:QI 0 QIreg_operand ))
(ne (symbol_ref TARGET_MOVX)
(const_int 0)))
(const_string imovx)
@@ -2579,10 +2577,9 @@
 }
 }
   [(set (attr type)
- (if_then_else (and (match_operand:QI 0 register_operand )
-   (ior (not (match_operand:QI 0 q_regs_operand ))
-(ne (symbol_ref TARGET_MOVX)
-(const_int 0
+ (if_then_else (ior (not (match_operand:QI 0 QIreg_operand ))
+   (ne (symbol_ref TARGET_MOVX)
+   (const_int 0)))
(const_string imovx)
(const_string imov)))
(set (attr mode)
diff --git a/gcc/config/i386/predicates.md b/gcc/config/i386/predicates.md
index 8a89f70..1471f5a 100644
--- a/gcc/config/i386/predicates.md
+++ b/gcc/config/i386/predicates.md
@@ -82,6 +82,10 @@
   (and (match_code reg)
(match_test REGNO (op) == FLAGS_REG)))

+;; Return true if op is one of QImode registers: %[abcd][hl].
+(define_predicate QIreg_operand
+  (match_test QI_REG_P (op)))
+
 ;; Return true if op is a QImode register operand other than
 ;; %[abcd][hl].
 (define_predicate ext_QIreg_operand


Re: RFA PR 48770

2011-05-25 Thread Jeff Law
-BEGIN PGP SIGNED MESSAGE-
Hash: SHA1

On 05/25/11 06:40, Bernd Schmidt wrote:

 I've looked at code generation; it appears unchanged on i686-linux,
 which I think is the expected result. There are minor differences in
 assembly output on mips64-linux. If you want to look at it, I'm
 attaching a testcase - compile with -O2 -fno-reorder-blocks.
I get the same code with and without the patch using a cross compiler.
Can you send me the differing .s files and dumps?

Thanks,
jeff
-BEGIN PGP SIGNATURE-
Version: GnuPG v1.4.11 (GNU/Linux)
Comment: Using GnuPG with Fedora - http://enigmail.mozdev.org/

iQEcBAEBAgAGBQJN3S7+AAoJEBRtltQi2kC7ZXAH/2utYd98C+K0DcySDvk5wR3s
6yLaYiD2rkFlEKlXTi0ojNIGi87xLwjo8PUDs+lsQy4UONoCDPAbYA7/fX412pCY
IgOZqE/lcSsxNj7Mo6ggtobmnsDSShhn3SjnjA5NPbOGL77nfdGAzjtPJ3R9QmuD
nPjBKGzxkiM2W7bCDwPYQvuZpJ8M3YxDrrmAferYbrgu9/+QjS+qsg50ckahFgMe
l5VdWs+rm1bLym5R2DCqkG5b0ebVzvh7mg8dIDVD/FMonqjLOlzqSODbuLi+Qe/j
AQMUawoQMVlowQKtXaVAviP2VPp4V5oV7e8cdGBXO4XiShawqHKZn9/Zf+9YEt0=
=ZRqD
-END PGP SIGNATURE-


Re: Prefixes for libgcc symbols (C6X 9.5/11)

2011-05-25 Thread H.J. Lu
On Wed, May 25, 2011 at 6:52 AM, Bernd Schmidt ber...@codesourcery.com wrote:
 On 05/25/2011 01:45 PM, H.J. Lu wrote:
 On Wed, May 25, 2011 at 6:42 AM, Bernd Schmidt ber...@codesourcery.com 
 wrote:
 On 05/25/2011 01:37 PM, H.J. Lu wrote:

 I think it may have caused:

 http://gcc.gnu.org/bugzilla/show_bug.cgi?id=49160

 Looks like it. Not quite sure how to fix it yet. Do you know what files
 such as i386/64/_divtc3.c are trying to achieve?


 It provides backward compatibility with symbol versioning:

 [hjl@gnu-4 64]$ readelf -s /lib64/libgcc_s.so.1| grep __powitf2
     52: 003e8a80d170   167 FUNC    GLOBAL DEFAULT   12 __powitf2@@GCC_4.3.0
     54: 003e8a80d170   167 FUNC    GLOBAL DEFAULT   12 __powitf2@GCC_4.0.0
 [hjl@gnu-4 64]$

 That leaves me as clueless as before. Why does i386/64 need this but not
 other targets (such as i386/32), and why only those three functions
 (from the ones in libgcc)?

 Anyhow, below is one possible way of fixing it.

It fixed the libgcc failure.  Can you check it in?

Thanks.

-- 
H.J.


Re: Prefixes for libgcc symbols (C6X 9.5/11)

2011-05-25 Thread Bernd Schmidt
On 05/25/2011 04:38 PM, H.J. Lu wrote:
 On Wed, May 25, 2011 at 6:52 AM, Bernd Schmidt ber...@codesourcery.com 
 wrote:
 Anyhow, below is one possible way of fixing it.
 
 It fixed the libgcc failure.  Can you check it in?

I suppose it is reasonably obvious. Done.


Bernd


Re: [PATCH PING] unreviewed tree-slimming patches

2011-05-25 Thread Joseph S. Myers
On Wed, 25 May 2011, Nathan Froyd wrote:

 These patches:
 
   (C, C++, middle-end)
   [PATCH 14/18] move TS_STATEMENT_LIST to be a substructure of TS_TYPED
   http://gcc.gnu.org/ml/gcc-patches/2011-03/msg00560.html
 
   (C, Java, middle-end)
   [PATCH 18/18] make TS_BLOCK a substructure of TS_BASE
   http://gcc.gnu.org/ml/gcc-patches/2011-03/msg00564.html
 
 are still pending review.  Jason commented on the TS_STATEMENT_LIST patch, but

The C changes are OK.

-- 
Joseph S. Myers
jos...@codesourcery.com


Re: New options to disable/enable any pass for any functions (issue4550056)

2011-05-25 Thread Joseph S. Myers
On Wed, 25 May 2011, Xinliang David Li wrote:

 Ping. The link to the message:
 
 http://gcc.gnu.org/ml/gcc-patches/2011-05/msg01303.html

I don't consider this an option handling patch.  Patches adding whole new 
features involving new options should be reviewed by maintainers for the 
part of the compiler relevant to those features (since there isn't a pass 
manager maintainer, I guess that means middle-end).

-- 
Joseph S. Myers
jos...@codesourcery.com


Re: PATCH: Add pause intrinsic

2011-05-25 Thread H.J. Lu
On Wed, May 25, 2011 at 9:43 AM, Andrew Haley a...@redhat.com wrote:
 On 05/25/2011 04:32 PM, H.J. Lu wrote:
 On Wed, May 25, 2011 at 8:27 AM, Richard Guenther
 richard.guent...@gmail.com wrote:
 On Wed, May 25, 2011 at 5:20 PM, Michael Matz m...@suse.de wrote:
 Hi,

 On Wed, 25 May 2011, Richard Guenther wrote:

 asm volatile ("" : : : "memory") in fact will work as a full memory
 barrier

 How?  You surely need MFENCE or somesuch, unless all you care about is
 a compiler barrier.  That's what I think needs to be clarified.

 Well, yes, I'm talking about the compiler memory barrier.

 Something that we conventionally call optimization barrier :)  memory
 barrier has a fixed meaning which we shouldn't use in this case, it's
 confusing.

 Sure ;)

 And to keep the info in a suitable thread what I'd like to improve here
 is to make us disambiguate memory loads/stores against asms that
 have no memory outputs/inputs.


 Please let me know how I should improve the document,

 Compiler memory barrier seems to be well-understood.  I suggest

 +Generates the @code{pause} machine instruction with a compiler memory 
 barrier.

 It's clear enough.

 Andrew.


I checked in this.

Thanks.


-- 
H.J.
---
Index: doc/extend.texi
===
--- doc/extend.texi (revision 174216)
+++ doc/extend.texi (working copy)
@@ -8699,7 +8699,8 @@ The following built-in function is alway

 @table @code
 @item void __builtin_ia32_pause (void)
-Generates the @code{pause} machine instruction with full memory barrier.
+Generates the @code{pause} machine instruction with a compiler memory
+barrier.
 @end table

 The following floating point built-in functions are made available in the
Index: ChangeLog
===
--- ChangeLog   (revision 174216)
+++ ChangeLog   (working copy)
@@ -1,3 +1,8 @@
+2011-05-25  H.J. Lu  hongjiu...@intel.com
+
+   * doc/extend.texi (X86 Built-in Functions): Update pause
+   intrinsic.
+
 2011-05-25  Bernd Schmidt  ber...@codesourcery.com

PR bootstrap/49160


[PATCH] Fix VRP switch handling (PR tree-optimization/49161)

2011-05-25 Thread Jakub Jelinek
Hi!

The following testcase is miscompiled, because there are multiple
CASE_LABELs for the same target bb in a switch:
<bb 2>:
  switch (x_1(D)) <default: <L13>, case 3: <l4>, case 4: <l1>, case 6: <l3>>

l3:
  bar (-1);

l2:
l1:
l4:
  bar (0);

find_switch_asserts sorts by uids of CASE_LABELs and adds x_1(D) == 4
as well as x_1(D) == 3 assertions on the same edge, instead of
properly adding x_1(D) >= 3 and x_1(D) <= 4 assertions.

Fixed thusly, bootstrapped/regtested on x86_64-linux and i686-linux,
ok for trunk/4.6?
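
A minimal C sketch of the shape of code involved, for readers who want to see it at source level.  This is not the committed pr49161.c test; the function name classify and the return encoding are made up.  The point is that case 3 and case 4 jump to different labels that nevertheless belong to the same block, so on that edge the only valid assertion is 3 <= x <= 4, not x == 3 together with x == 4.

```c
/* Hypothetical example: several distinct labels, one destination block.  */
static int classify (int x)
{
  int r = 0;

  switch (x)
    {
    case 3: goto l4;
    case 4: goto l1;
    case 6: goto l3;
    default: return -1;
    }

l3:
  r += 100;
l1:
l4:
  /* Reached with x == 3, x == 4 or x == 6 (via fallthrough from l3).
     Both tests below must survive value range propagation.  */
  if (x != 4)
    r += 1;
  if (x != 3)
    r += 10;
  return r;
}
```

If the range for x in the shared block were wrongly derived as both == 3 and == 4, either of the two comparisons could be folded away incorrectly.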

2011-05-25  Jakub Jelinek  ja...@redhat.com

PR tree-optimization/49161
* tree-vrp.c (struct case_info): New type.
(compare_case_labels): Sort case_info structs instead of
trees, and not primarily by CASE_LABEL uids but by
label_for_block indexes.
(find_switch_asserts): Put case labels into struct case_info
array instead of TREE_VEC, adjust sorting, compare label_for_block
values instead of CASE_LABELs.

* gcc.c-torture/execute/pr49161.c: New test.

--- gcc/tree-vrp.c.jj   2011-05-20 08:14:08.0 +0200
+++ gcc/tree-vrp.c  2011-05-25 16:03:18.0 +0200
@@ -4673,28 +4673,35 @@ find_conditional_asserts (basic_block bb
   return need_assert;
 }
 
-/* Compare two case labels sorting first by the destination label uid
+struct case_info
+{
+  tree expr;
+  basic_block bb;
+};
+
+/* Compare two case labels sorting first by the destination bb index
and then by the case value.  */
 
 static int
 compare_case_labels (const void *p1, const void *p2)
 {
-  const_tree const case1 = *(const_tree const*)p1;
-  const_tree const case2 = *(const_tree const*)p2;
-  unsigned int uid1 = DECL_UID (CASE_LABEL (case1));
-  unsigned int uid2 = DECL_UID (CASE_LABEL (case2));
+  const struct case_info *ci1 = (const struct case_info *) p1;
+  const struct case_info *ci2 = (const struct case_info *) p2;
+  int idx1 = ci1->bb->index;
+  int idx2 = ci2->bb->index;
 
-  if (uid1 < uid2)
+  if (idx1 < idx2)
 return -1;
-  else if (uid1 == uid2)
+  else if (idx1 == idx2)
 {
   /* Make sure the default label is first in a group.  */
-  if (!CASE_LOW (case1))
+  if (!CASE_LOW (ci1->expr))
return -1;
-  else if (!CASE_LOW (case2))
+  else if (!CASE_LOW (ci2->expr))
return 1;
   else
-return tree_int_cst_compare (CASE_LOW (case1), CASE_LOW (case2));
+   return tree_int_cst_compare (CASE_LOW (ci1->expr),
+CASE_LOW (ci2->expr));
 }
   else
 return 1;
@@ -4715,8 +4722,8 @@ find_switch_asserts (basic_block bb, gim
   gimple_stmt_iterator bsi;
   tree op;
   edge e;
-  tree vec2;
-  size_t n = gimple_switch_num_labels(last);
+  struct case_info *ci;
+  size_t n = gimple_switch_num_labels (last);
 #if GCC_VERSION >= 4000
   unsigned int idx;
 #else
@@ -4731,36 +4738,38 @@ find_switch_asserts (basic_block bb, gim
 return false;
 
   /* Build a vector of case labels sorted by destination label.  */
-  vec2 = make_tree_vec (n);
+  ci = XNEWVEC (struct case_info, n);
   for (idx = 0; idx < n; ++idx)
-TREE_VEC_ELT (vec2, idx) = gimple_switch_label (last, idx);
-  qsort (TREE_VEC_ELT (vec2, 0), n, sizeof (tree), compare_case_labels);
+{
+  ci[idx].expr = gimple_switch_label (last, idx);
+  ci[idx].bb = label_to_block (CASE_LABEL (ci[idx].expr));
+}
+  qsort (ci, n, sizeof (struct case_info), compare_case_labels);
 
   for (idx = 0; idx < n; ++idx)
 {
   tree min, max;
-  tree cl = TREE_VEC_ELT (vec2, idx);
+  tree cl = ci[idx].expr;
+  basic_block cbb = ci[idx].bb;
 
   min = CASE_LOW (cl);
   max = CASE_HIGH (cl);
 
   /* If there are multiple case labels with the same destination
 we need to combine them to a single value range for the edge.  */
-  if (idx + 1 < n
-  && CASE_LABEL (cl) == CASE_LABEL (TREE_VEC_ELT (vec2, idx + 1)))
+  if (idx + 1 < n && cbb == ci[idx + 1].bb)
{
  /* Skip labels until the last of the group.  */
  do {
++idx;
- } while (idx < n
-   && CASE_LABEL (cl) == CASE_LABEL (TREE_VEC_ELT (vec2, idx)));
+ } while (idx < n && cbb == ci[idx].bb);
  --idx;
 
  /* Pick up the maximum of the case label range.  */
- if (CASE_HIGH (TREE_VEC_ELT (vec2, idx)))
-   max = CASE_HIGH (TREE_VEC_ELT (vec2, idx));
+ if (CASE_HIGH (ci[idx].expr))
+   max = CASE_HIGH (ci[idx].expr);
  else
-   max = CASE_LOW (TREE_VEC_ELT (vec2, idx));
+   max = CASE_LOW (ci[idx].expr);
}
 
   /* Nothing to do if the range includes the default label until we
@@ -4769,7 +4778,7 @@ find_switch_asserts (basic_block bb, gim
continue;
 
   /* Find the edge to register the assert expr on.  */
-  e = find_edge (bb, label_to_block (CASE_LABEL (cl)));
+  e = find_edge (bb, cbb);
 
   /* Register the necessary assertions for the operand in the
 

Re: PATCH: Add pause intrinsic

2011-05-25 Thread Andrew Pinski
On Wed, May 25, 2011 at 10:19 AM, H.J. Lu hjl.to...@gmail.com wrote:
 --
 H.J.
 ---
 Index: doc/extend.texi
 ===
 --- doc/extend.texi     (revision 174216)
 +++ doc/extend.texi     (working copy)
 @@ -8699,7 +8699,8 @@ The following built-in function is alway

  @table @code
  @item void __builtin_ia32_pause (void)
 -Generates the @code{pause} machine instruction with full memory barrier.
 +Generates the @code{pause} machine instruction with a compiler memory
 +barrier.

What does the pause machine instruction do?  How is it different from a
normal nop?

Also, "pause" to me means it waits for input or an interrupt.

Thanks,
Andrew Pinski


Re: PATCH: Add pause intrinsic

2011-05-25 Thread Andrew Haley
On 05/25/2011 06:26 PM, Andrew Pinski wrote:
 On Wed, May 25, 2011 at 10:19 AM, H.J. Lu hjl.to...@gmail.com wrote:
 --
 H.J.
 ---
 Index: doc/extend.texi
 ===
 --- doc/extend.texi (revision 174216)
 +++ doc/extend.texi (working copy)
 @@ -8699,7 +8699,8 @@ The following built-in function is alway

  @table @code
  @item void __builtin_ia32_pause (void)
 -Generates the @code{pause} machine instruction with full memory barrier.
 +Generates the @code{pause} machine instruction with a compiler memory
 +barrier.
 
 What does the pause machine instruction do?

That's documented by Intel in the architecture manual.  Surely
we don't have to explain it all.

Andrew.


PAUSE—Spin Loop Hint

Improves the performance of spin-wait loops. When executing a “spin-wait
loop,” a Pentium 4 or Intel Xeon processor suffers a severe performance
penalty when exiting the loop because it detects a possible memory order
violation. The PAUSE instruction provides a hint to the processor that the
code sequence is a spin-wait loop. The processor uses this hint to avoid the
memory order violation in most situations, which greatly improves processor
performance. For this reason, it is recommended that a PAUSE instruction be
placed in all spin-wait loops.

An additional function of the PAUSE instruction is to reduce the power
consumed by a Pentium 4 processor while executing a spin loop. The Pentium 4
processor can execute a spin-wait loop extremely quickly, causing the
processor to consume a lot of power while it waits for the resource it is
spinning on to become available. Inserting a pause instruction in a spin-wait
loop greatly reduces the processor’s power consumption.

This instruction was introduced in the Pentium 4 processors, but is backward
compatible with all IA-32 processors. In earlier IA-32 processors, the PAUSE
instruction operates like a NOP instruction. The Pentium 4 and Intel Xeon
processors implement the PAUSE instruction as a pre-defined delay. The delay
is finite and can be zero for some processors. This instruction does not
change the architectural state of the processor (that is, it performs
essentially a delaying no-op operation).

This instruction’s operation is the same in non-64-bit modes and 64-bit mode.
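
For illustration only, here is a minimal sketch of the kind of spin-wait
loop the manual text describes, written against the proposed builtin.  The
names cpu_relax, spin_lock_t, spin_lock and spin_unlock are hypothetical and
not part of the patch; the sketch assumes a compiler that provides
__builtin_ia32_pause and C11 atomics, and falls back to a plain compiler
barrier on non-x86 hosts.

```c
#include <stdatomic.h>

/* Hypothetical helper: emit PAUSE on x86, otherwise only keep the
   compiler from caching memory values across the call.  */
static inline void
cpu_relax (void)
{
#if defined (__i386__) || defined (__x86_64__)
  __builtin_ia32_pause ();
#else
  __asm__ __volatile__ ("" ::: "memory");
#endif
}

/* Hypothetical test-and-set spin lock built on a C11 atomic_flag.  */
typedef struct { atomic_flag held; } spin_lock_t;

static inline void
spin_lock (spin_lock_t *l)
{
  /* Spin until the flag was previously clear; hint the CPU on
     every failed attempt.  */
  while (atomic_flag_test_and_set_explicit (&l->held, memory_order_acquire))
    cpu_relax ();
}

static inline void
spin_unlock (spin_lock_t *l)
{
  atomic_flag_clear_explicit (&l->held, memory_order_release);
}
```

A lock would be declared as `spin_lock_t l = { ATOMIC_FLAG_INIT };`.  The
ordering guarantees here come from the atomic operations, not from PAUSE
itself.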


Re: More fixes from static analysis checkers

2011-05-25 Thread Jakub Jelinek
On Thu, Mar 24, 2011 at 03:52:57PM -0600, Jeff Law wrote:
 We had a variety of functions which would fail to call va_end prior to
 returning.  I'm not aware of a host where this could cause a problem, but
 it's easy enough to fix and keeps the checkers quiet.

In case of def_fn_type, this added a second va_end if the function
doesn't fail.
This patch removes the first va_end, bootstrapped/regtested on x86_64-linux
and i686-linux, committed as obvious.
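
As a standalone illustration of the pattern (not the GCC code itself), the
sketch below funnels every return through one label so that va_end runs
exactly once per va_start, on both the success and the failure path.  The
function sum_nonnegative is made up for this example; only the egress label
name is borrowed from def_fn_type.

```c
#include <stdarg.h>

/* Hypothetical example: sum COUNT int arguments, returning -1 as soon
   as a negative one is seen.  */
static int
sum_nonnegative (int count, ...)
{
  va_list list;
  int i, total = 0;

  va_start (list, count);
  for (i = 0; i < count; i++)
    {
      int v = va_arg (list, int);
      if (v < 0)
        {
          total = -1;
          goto egress;   /* error path still reaches va_end below */
        }
      total += v;
    }
  /* No va_end here: the one at egress covers the success path too.  */

 egress:
  va_end (list);
  return total;
}
```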

2011-05-25  Jakub Jelinek  ja...@redhat.com

* c-common.c (def_fn_type): Remove extra va_end.

* gcc-interface/utils.c (def_fn_type): Remove extra va_end.

--- gcc/c-family/c-common.c.jj  2011-05-24 23:34:16.0 +0200
+++ gcc/c-family/c-common.c 2011-05-25 16:50:57.0 +0200
@@ -4451,7 +4451,6 @@ def_fn_type (builtin_type def, builtin_t
goto egress;
   args[i] = t;
 }
-  va_end (list);
 
   t = builtin_types[ret];
   if (t == error_mark_node)
--- gcc/ada/gcc-interface/utils.c.jj 2011-05-11 19:38:55.0 +0200
+++ gcc/ada/gcc-interface/utils.c   2011-05-25 16:52:00.0 +0200
@@ -4965,7 +4965,6 @@ def_fn_type (builtin_type def, builtin_t
goto egress;
   args[i] = t;
 }
-  va_end (list);
 
   t = builtin_types[ret];
   if (t == error_mark_node)


Jakub


Re: [patch, ARM] Fix PR42017, LR not used in leaf functions

2011-05-25 Thread Chung-Lin Tang
On 2011/5/20 07:46 PM, Chung-Lin Tang wrote:
 On 2011/5/20 07:41 PM, Ramana Radhakrishnan wrote:
 On 17/05/11 14:10, Chung-Lin Tang wrote:
 On 2011/5/13 04:26 PM, Richard Sandiford wrote:
 Richard Sandiford richard.sandif...@linaro.org  writes:
 Chung-Lin Tang clt...@codesourcery.com  writes:
 My fix here simply adds 'reload_completed' as an additional condition
 for EPILOGUE_USES to return true for LR_REGNUM. I think this should be
 valid, as correct LR save/restoring is handled by the
 epilogue/prologue
 code; it should be safe for IRA to treat it as a normal call-used
 register.

 FWIW, epilogue_completed might be a more accurate choice.

 I still stand by this, although I realise no other target does it.

 Did a re-test of the patch just to be sure, as expected the test results
 were also clear. Attached is the updated patch.

 Can you specify what you tested with this patch ?
 
 Native bootstrap success, plus C/C++ and libstdc++ tests. IIRC I also
 saw one or two FAIL->PASS in the results too (forgot specific testcases)
 

 So, it's interesting to note that the use of this was changed in 2007 by
 zadeck as a part of the df merge.

 I can't find the patch trail beyond this on the lists.

 http://gcc.gnu.org/viewvc/branches/dataflow-branch/gcc/config/arm/arm.h?r1=120281&r2=121501


 It might be better to understand why this was done in the first place
 for the ARM port as part of the Dataflow bring up and why folks wanted
 to make this unconditional.

Digging through the repository, this is my explanation, FWIW:

1) The gen_prologue_uses() of LR were added back in Dec.2000 (r38467),
when ce2 was still the if-convert-after-reload pass, placed after
prologue-epilogue construction. (hence the arm_expand_prologue() comment
about preventing ce2 using LR)

2) if-conversion after combine was added in Oct.2002 (r58547), which
became the new ce2 (pre-reload); ifcvt after reload became ce3. The
comments in arm_expand_prologue() were not updated.

3) dataflow-branch work was circa 2007. RTL-ifcvt seemed to be updated
during this time, hence removal of the LR-uses in arm_expand_prologue()
seems reasonable. My guess here: ce2 was mistaken to be
ifcvt-after-combine (rather than the originally intended
ifcvt-after-reload, now ce3) by the comments; considering the
arm_expand_prologue() bits were updated, the comments may have been read
seriously.

4) Since ce2 was a pre-reload pass by then, the unconditionalizing of
EPILOGUE_USES was probably intended to be a supplemental change, to
support removing those gen_prologue_use()s.

I hope this is a reasonable explanation, but do note a lot of this is
guessing :)

I tried taking the last version of the dataflow-branch (circa 4.3) and
did cross-test run compares of EPILOGUE_USES with and without the
reload_completed conditionalization. The C testsuite results were clean.

The LR-not-used symptoms seem triggered by this EPILOGUE_USES change
since then. As the PR42017 submitter lists the affected GCC versions,
this regression has been present since post-4.2.

Given the above explanation, and considering that the tests on current
trunk are okay, plus we're in stage1 right now, is this
re-conditionalizing EPILOGUE_USES change okay to commit?

Thanks,
Chung-Lin


Re: Go patch committed: Update to current Go library

2011-05-25 Thread Rainer Orth
Ian Lance Taylor i...@google.com writes:

 I just committed a patch to godump.c which I think should fix this
 issue.  Let me know if it doesn't.

There are several issues now:

* While I get

// var ___iob [59+1]___FILE

  now, there's still

var __lastbuf *_FILE

  left, with commented

// type _FILE struct { _cnt int32; _ptr *uint8; _base *uint8; _flag uint8; _file uint8; __orientation INVALID-bit-field; __ionolock INVALID-bit-field; __seekable INVALID-bit-field; __extendedfd INVALID-bit-field; __xf_nocheck INVALID-bit-field; __filler INVALID-bit-field; }

   as before.

* The amd64 sysinfo.go contains several undefined types:

sysinfo.go:2886:53: error: use of undefined type '_fpchip_state'
sysinfo.go:2886:40: error: struct field type is incomplete
sysinfo.go:2886:53: error: use of undefined type '_fpchip_state'
sysinfo.go:2887:47: error: struct field type is incomplete
sysinfo.go:2892:32: error: use of undefined type '_fxsave_state'
sysinfo.go:2892:24: error: struct field type is incomplete

type _fpu struct { fp_reg_set struct { fpchip_state _fpchip_state; }; }
type _fpregset_t struct { fp_reg_set struct { fpchip_state _fpchip_state; }; }
type __kfpu_u struct { kfpu_fx _fxsave_state; }
type _kfpu_t struct { kfpu_u __kfpu_u; kfpu_status uint32; kfpu_xstatus uint32; }

  Both types are commented, due to the use of commented _upad128_t:

// type _upad128_t struct { _q INVALID-float-80; }

// type _fpchip_state struct { cw uint16; sw uint16; fctw uint8; __fx_rsvd uint8; fop uint16; rip uint64; rdp uint64; mxcsr uint32; mxcsr_mask uint32; st [7+1]struct { fpr_16 [4+1]uint16; }; xmm [15+1]_upad128_t; __fx_ign2 [5+1]_upad128_t; status uint32; xstatus uint32; }
// type _fxsave_state struct { fx_fcw uint16; fx_fsw uint16; fx_fctw uint16; fx_fop uint16; fx_rip uint64; fx_rdp uint64; fx_mxcsr uint32; fx_mxcsr_mask uint32; fx_st [7+1]struct { fpr_16 [4+1]uint16; }; fx_xmm [15+1]_upad128_t; __fx_ign2 [5+1]_upad128_t; }

  Unfortunately, this has a ripple effect and I need to omit several
  type declarations to make the problem go away:

+  grep -v '^type _fpu' | \
+  grep -v '^type _fpregset_t' | \
+  grep -v '^type _mcontext_t' | \
+  grep -v '^type _ucontext' | \
+  grep -v '^type __kfpu_u' | \
+  grep -v '^type _kfpu_t' | \

  sys/types.h has

typedef union {
long double _q;
uint32_t _l[4];
} upad128_t;

  I already have to provide a _upad128_t replacement for other uses, but
  it would really help to support this directly.

With those types and __lastbuf omitted from sysinfo.go, I can
successfully bootstrap on i386-pc-solaris2.1[01].  On Solaris 11/x86,
the libgo results are clean, on Solaris 10/x86 there are still 37
failures for the amd64 multilib which I still need to debug.

Thanks.
Rainer

-- 
-
Rainer Orth, Center for Biotechnology, Bielefeld University


[PATCH] More pow(x,c) expansions in cse_sincos pass (PR46728, patch 3)

2011-05-25 Thread William J. Schmidt
This patch adds logic to gimple_expand_builtin_pow () to optimize
pow(x,y), where y is one of 0.5, 0.25, 0.75, 1./3., or 1./6.  I noticed
that there were two missing calls to gimple_set_location () in my
previous patch, so I've corrected those here as well.

There's one TODO comment in this patch.  I don't believe the test for
TREE_SIDE_EFFECTS (arg0) should be necessary; but I'm not convinced it
was necessary in the code whence I copied it, either, so I left it in
for comment in case I'm misunderstanding something.


2011-05-25  Bill Schmidt  wschm...@linux.vnet.ibm.com

PR tree-optimization/46728
* tree-ssa-math-opts.c (powi_as_mults_1): Add gimple_set_location.
(powi_as_mults): Add gimple_set_location.
(build_and_insert_call): New.
(gimple_expand_builtin_pow): Add handling for pow(x,y) when y is
0.5, 0.25, 0.75, 1./3., or 1./6.

Index: gcc/tree-ssa-math-opts.c
===
--- gcc/tree-ssa-math-opts.c(revision 174199)
+++ gcc/tree-ssa-math-opts.c(working copy)
@@ -965,6 +965,7 @@ powi_as_mults_1 (gimple_stmt_iterator *gsi, locati
 }
 
   mult_stmt = gimple_build_assign_with_ops (MULT_EXPR, ssa_target, op0, op1);
+  gimple_set_location (mult_stmt, loc);
   gsi_insert_before (gsi, mult_stmt, GSI_SAME_STMT);
 
   return ssa_target;
@@ -999,6 +1000,7 @@ powi_as_mults (gimple_stmt_iterator *gsi, location
   div_stmt = gimple_build_assign_with_ops (RDIV_EXPR, target, 
   build_real (type, dconst1),
   result);
+  gimple_set_location (div_stmt, loc);
   gsi_insert_before (gsi, div_stmt, GSI_SAME_STMT);
 
   return target;
@@ -1024,6 +1026,34 @@ gimple_expand_builtin_powi (gimple_stmt_iterator *
   return NULL_TREE;
 }
 
+/* Build a gimple call statement that calls FN with argument ARG.
+   Set the lhs of the call statement to a fresh SSA name for
+   variable VAR.  If VAR is NULL, first allocate it.  Insert the
+   statement prior to GSI's current position, and return the fresh
+   SSA name.  */
+
+static tree
+build_and_insert_call (gimple_stmt_iterator *gsi, tree fn, tree arg,
+  tree *var, location_t loc)
+{
+  gimple call_stmt;
+  tree ssa_target;
+
+  if (!*var)
+{
+  *var = create_tmp_var (TREE_TYPE (arg), "powroot");
+  add_referenced_var (*var);
+}
+
+  call_stmt = gimple_build_call (fn, 1, arg);
+  ssa_target = make_ssa_name (*var, NULL);
+  gimple_set_lhs (call_stmt, ssa_target);
+  gimple_set_location (call_stmt, loc);
+  gsi_insert_before (gsi, call_stmt, GSI_SAME_STMT);
+
+  return ssa_target;
+}
+
 /* ARG0 and ARG1 are the two arguments to a pow builtin call in GSI
with location info LOC.  If possible, create an equivalent and
less expensive sequence of statements prior to GSI, and return an
@@ -1035,6 +1065,8 @@ gimple_expand_builtin_pow (gimple_stmt_iterator *g
 {
   REAL_VALUE_TYPE c, cint;
   HOST_WIDE_INT n;
+  tree type, sqrtfn, target = NULL_TREE;
+  enum machine_mode mode;
 
   /* If the exponent isn't a constant, there's nothing of interest
  to be done.  */
@@ -1054,6 +1086,108 @@ gimple_expand_builtin_pow (gimple_stmt_iterator *g
   && powi_cost (n) <= POWI_MAX_MULTS)))
 return gimple_expand_builtin_powi (gsi, loc, arg0, n);
 
+  /* Attempt various optimizations using sqrt and cbrt.  */
+  type = TREE_TYPE (arg0);
+  mode = TYPE_MODE (type);
+  sqrtfn = mathfn_built_in (type, BUILT_IN_SQRT);
+
+  if (flag_unsafe_math_optimizations && sqrtfn != NULL_TREE)
+{
+  REAL_VALUE_TYPE dconst1_4, dconst3_4;
+  tree cbrtfn;
+  bool hw_sqrt_exists;
+
+  /* Optimize pow(x,0.5) = sqrt(x).  */
+  if (REAL_VALUES_EQUAL (c, dconsthalf))
+   return build_and_insert_call (gsi, sqrtfn, arg0, &target, loc);
+
+  /* Optimize pow(x,0.25) = sqrt(sqrt(x)).  */
+  dconst1_4 = dconst1;
+  SET_REAL_EXP (&dconst1_4, REAL_EXP (&dconst1_4) - 2);
+  hw_sqrt_exists = optab_handler(sqrt_optab, mode) != CODE_FOR_nothing;
+
+  if (REAL_VALUES_EQUAL (c, dconst1_4) && hw_sqrt_exists)
+   {
+ tree sqrt_arg0;
+
+ /* sqrt(x)  */
+ sqrt_arg0 = build_and_insert_call (gsi, sqrtfn, arg0, &target, loc);
+
+ /* sqrt(sqrt(x))  */
+ return build_and_insert_call (gsi, sqrtfn, sqrt_arg0, &target, loc);
+   }
+  
+  /* Optimize pow(x,0.75) = sqrt(x) * sqrt(sqrt(x)).  */
+  real_from_integer (&dconst3_4, VOIDmode, 3, 0, 0);
+  SET_REAL_EXP (&dconst3_4, REAL_EXP (&dconst3_4) - 2);
+
+  if (optimize_function_for_speed_p (cfun)
+  && !TREE_SIDE_EFFECTS (arg0) /* TODO: is this needed?  */
+  && REAL_VALUES_EQUAL (c, dconst3_4)
+  && hw_sqrt_exists)
+   {
+ tree sqrt_arg0, sqrt_sqrt, ssa_target;
+ gimple mult_stmt;
+
+ /* sqrt(x)  */
+ sqrt_arg0 = build_and_insert_call (gsi, sqrtfn, arg0, &target, loc);
+
+ /* sqrt(sqrt(x))  

[pph] Reformat (issue4515140)

2011-05-25 Thread Lawrence Crowl
In pph_stream_read_tree and pph_stream_write_tree, reformat for style.
This step was skipped in the last patch to make diffs more sensible.

Index: gcc/cp/ChangeLog.pph

2011-05-25  Lawrence Crowl  cr...@google.com

* pph-streamer-in.c (pph_stream_read_tree): Reformat for style.
* pph-streamer-out.c (pph_stream_write_tree): Reformat for style.


Index: gcc/cp/pph-streamer-in.c
===
--- gcc/cp/pph-streamer-in.c(revision 174166)
+++ gcc/cp/pph-streamer-in.c(working copy)
@@ -820,39 +820,33 @@ pph_stream_read_tree (struct lto_input_b
 case PARM_DECL:
 case USING_DECL:
 case VAR_DECL:
-   {
   /* FIXME pph: Should we merge DECL_INITIAL into lang_specific? */
   DECL_INITIAL (expr) = pph_input_tree (stream);
- pph_stream_read_lang_specific (stream, expr);
+  pph_stream_read_lang_specific (stream, expr);
   break;
-}
 
 case FUNCTION_DECL:
-{
   DECL_INITIAL (expr) = pph_input_tree (stream);
   pph_stream_read_lang_specific (stream, expr);
-   DECL_SAVED_TREE (expr) = pph_input_tree (stream);
+  DECL_SAVED_TREE (expr) = pph_input_tree (stream);
   break;
-   }
 
 case TYPE_DECL:
-{
   DECL_INITIAL (expr) = pph_input_tree (stream);
   pph_stream_read_lang_specific (stream, expr);
-   DECL_ORIGINAL_TYPE (expr) = pph_input_tree (stream);
+  DECL_ORIGINAL_TYPE (expr) = pph_input_tree (stream);
   break;
-}
 
 case STATEMENT_LIST:
-{
-  HOST_WIDE_INT i, num_trees = pph_input_uint (stream);
-  for (i = 0; i < num_trees; i++)
-   {
- tree stmt = pph_input_tree (stream);
- append_to_statement_list (stmt, expr);
-   }
+  {
+HOST_WIDE_INT i, num_trees = pph_input_uint (stream);
+for (i = 0; i < num_trees; i++)
+ {
+   tree stmt = pph_input_tree (stream);
+   append_to_statement_list (stmt, expr);
+ }
+  }
   break;
-}
 
 case ARRAY_TYPE:
 case BOOLEAN_TYPE:
@@ -870,62 +864,48 @@ pph_stream_read_tree (struct lto_input_b
 case REFERENCE_TYPE:
 case VECTOR_TYPE:
 case VOID_TYPE:
-{
   pph_stream_read_lang_type (stream, expr);
   break;
-}
 
 case QUAL_UNION_TYPE:
 case RECORD_TYPE:
 case UNION_TYPE:
-{
-  pph_stream_read_lang_type (stream, expr);
-{
-  TYPE_BINFO (expr) = pph_input_tree (stream);
-}
+  pph_stream_read_lang_type (stream, expr);
+  TYPE_BINFO (expr) = pph_input_tree (stream);
   break;
-}
 
 case OVERLOAD:
-{
   OVL_FUNCTION (expr) = pph_input_tree (stream);
   break;
-}
 
 case IDENTIFIER_NODE:
-{
-  struct lang_identifier *id = LANG_IDENTIFIER_CAST (expr);
-  id->namespace_bindings = pph_stream_read_cxx_binding (stream);
-  id->bindings = pph_stream_read_cxx_binding (stream);
-  id->class_template_info = pph_input_tree (stream);
-  id->label_value = pph_input_tree (stream);
+  {
+struct lang_identifier *id = LANG_IDENTIFIER_CAST (expr);
+id->namespace_bindings = pph_stream_read_cxx_binding (stream);
+id->bindings = pph_stream_read_cxx_binding (stream);
+id->class_template_info = pph_input_tree (stream);
+id->label_value = pph_input_tree (stream);
+  }
   break;
-}
 
 case BASELINK:
-{
   BASELINK_BINFO (expr) = pph_input_tree (stream);
   BASELINK_FUNCTIONS (expr) = pph_input_tree (stream);
   BASELINK_ACCESS_BINFO (expr) = pph_input_tree (stream);
   break;
-}
 
 case TEMPLATE_DECL:
-{
   DECL_INITIAL (expr) = pph_input_tree (stream);
   pph_stream_read_lang_specific (stream, expr);
   DECL_TEMPLATE_RESULT (expr) = pph_input_tree (stream);
   DECL_TEMPLATE_PARMS (expr) = pph_input_tree (stream);
   DECL_CONTEXT (expr) = pph_input_tree (stream);
   break;
-}
 
 case TEMPLATE_INFO:
-{
   TI_TYPEDEFS_NEEDING_ACCESS_CHECKING (expr)
   = pph_stream_read_qual_use_vec (stream);
   break;
-}
 
 case TREE_LIST:
 case TREE_BINFO:
Index: gcc/cp/pph-streamer-out.c
===
--- gcc/cp/pph-streamer-out.c   (revision 174166)
+++ gcc/cp/pph-streamer-out.c   (working copy)
@@ -821,45 +821,39 @@ pph_stream_write_tree (struct output_blo
 case PARM_DECL:
 case USING_DECL:
 case VAR_DECL:
-   {
   /* FIXME pph: Should we merge DECL_INITIAL into lang_specific? */
   pph_output_tree_or_ref_1 (stream, DECL_INITIAL (expr), ref_p, 3);
- pph_stream_write_lang_specific (stream, expr, ref_p);
+  pph_stream_write_lang_specific (stream, expr, ref_p);
   break;
-}
 
 case FUNCTION_DECL:
-{
   pph_output_tree_or_ref_1 (stream, DECL_INITIAL (expr), ref_p, 3);
   pph_stream_write_lang_specific (stream, expr, ref_p);
-  

[v3] Small tweak to std::random_device

2011-05-25 Thread Paolo Carlini

Hi,

committed to mainline.

Thanks,
Paolo.

/
2011-05-25  Paolo Carlini  paolo.carl...@oracle.com

* include/bits/random.h (random_device::min, max): Specify constexpr.
Index: include/bits/random.h
===
--- include/bits/random.h   (revision 174216)
+++ include/bits/random.h   (working copy)
@@ -1544,12 +1544,12 @@
 
 #endif
 
-result_type
-min() const
+static constexpr result_type
+min()
 { return std::numeric_limits<result_type>::min(); }
 
-result_type
-max() const
+static constexpr result_type
+max()
 { return std::numeric_limits<result_type>::max(); }
 
 double


Re: [testsuite] remove XFAIL for all but ia64 for g++.dg/tree-ssa/pr43411.C

2011-05-25 Thread Mike Stump
On May 25, 2011, at 1:38 AM, Rainer Orth wrote:
 Janis Johnson jani...@codesourcery.com writes:
 
 Archived test results for 4.7.0 for most processors with C++ results have:
 
 XPASS: g++.dg/tree-ssa/pr43411.C scan-tree-dump-not optimized OBJ_TYPE_REF
 
 The only failures I could find were for ia64-linux and ia64-hpux.  This
 patch changes the xfail so it only applies to ia64-*-*.  OK for trunk?
 
 Richard rejected a similar patch:

I see the two issues as orthogonal.  One issue is to have an accurate 
expectation for the actual testcase on actual targets.  The other is to modify 
the testcase to test something else.  While one can use the XPASS as a way of 
keeping track of the issue of improving the testcase, I'd rather approve the 
fix to fix the expected state and have people that want to track the other 
issue, instead of using XPASS to track that state, to use a PR instead.

I think it would be nice to go even farther, and that would be to set the 
expected state on all testcases on 6 platforms at the time of release, to 
expected, filing PRs for all failures (any unexpected result) so marked and to 
actually gate the release on no unexpected results.


Re: [PATCH PING] unreviewed tree-slimming patches

2011-05-25 Thread Tom Tromey
 Nathan == Nathan Froyd froy...@codesourcery.com writes:

Nathan   (C, Java, middle-end)
Nathan   [PATCH 18/18] make TS_BLOCK a substructure of TS_BASE
Nathan   http://gcc.gnu.org/ml/gcc-patches/2011-03/msg00564.html

The Java parts are ok.

I think these sorts of changes should be obvious once approved from a
middle-end perspective, at least assuming that there are no regressions.
I say this because I think that once the core change has been decided
on, there is often just one way to go about fixing up the users; at
least, in a case like this where the consequence amounts to deleting
assignments.

I mentioned this idea before but I didn't see any discussion of it.  I
am happy to continue looking at patches like this if that is what the
more active maintainers would prefer.

Tom


Re: [PATCH PING] unreviewed tree-slimming patches

2011-05-25 Thread Nathan Froyd
On 05/25/2011 02:06 PM, Tom Tromey wrote:
 Nathan == Nathan Froyd froy...@codesourcery.com writes:
 
 Nathan   (C, Java, middle-end)
 Nathan   [PATCH 18/18] make TS_BLOCK a substructure of TS_BASE
 Nathan   http://gcc.gnu.org/ml/gcc-patches/2011-03/msg00564.html
 
 The Java parts are ok.
 
 I think these sorts of changes should be obvious once approved from a
 middle-end perspective, at least assuming that there are no regressions.
 
 I mentioned this idea before but I didn't see any discussion of it.  I
 am happy to continue looking at patches like this if that is what the
 more active maintainers would prefer.

I think Jason mentioned considering them approved after waiting a week.  If we
want to enshrine that as policy, I think that'd be reasonable.  All in favor...?

-Nathan


Patch for libobjc/38307

2011-05-25 Thread Nicola Pero
I committed to trunk this libobjc patch by Richard Frith-Macdonald and David
Ayers.  The patch fixes some rare (but serious) problems with +initialize in
multithreaded programs.  It's complicated and I refer to

  http://gcc.gnu.org/bugzilla/show_bug.cgi?id=38307

for more information.

I approved the patch, and committed it to trunk (as Richard doesn't have write 
access).

Richard/David, thanks a lot for your help! :-)

Index: sendmsg.c
===
--- sendmsg.c   (revision 174219)
+++ sendmsg.c   (working copy)
@@ -41,6 +41,7 @@ see the files COPYING3 and COPYING.RUNTIME respect
 #include "objc/thr.h"
 #include "objc-private/module-abi-8.h"
 #include "objc-private/runtime.h"
+#include "objc-private/hash.h"
 #include "objc-private/sarray.h"
 #include "objc-private/selector.h" /* For sel_is_mapped() */
 #include "runtime-info.h"
@@ -75,10 +76,14 @@ IMP (*__objc_msg_forward2) (id, SEL) = NULL;
 /* Send +initialize to class.  */
 static void __objc_send_initialize (Class);
 
-static void __objc_install_dispatch_table_for_class (Class);
+/* Forward declare some functions */
+static void __objc_install_dtable_for_class (Class cls);
+static void __objc_prepare_dtable_for_class (Class cls);
+static void __objc_install_prepared_dtable_for_class (Class cls);
 
-/* Forward declare some functions.  */
-static void __objc_init_install_dtable (id, SEL);
+static struct sarray *__objc_prepared_dtable_for_class (Class cls);
+static IMP __objc_get_prepared_imp (Class cls,SEL sel);
+  
 
 /* Various forwarding functions that are used based upon the
return type for the selector.
@@ -117,7 +122,7 @@ __objc_get_forward_imp (id rcv, SEL sel)
 {
   IMP result;
   if ((result = __objc_msg_forward (sel)) != NULL) 
-return result;
+   return result;
 }
 
   /* In all other cases, use the default forwarding functions built
@@ -210,7 +215,7 @@ __objc_resolve_instance_method (Class class, SEL s
{
  objc_mutex_lock (__objc_runtime_mutex);
  if (class->class_pointer->dtable == __objc_uninstalled_dtable)
-   __objc_install_dispatch_table_for_class (class->class_pointer);
+   __objc_install_dtable_for_class (class->class_pointer);
  objc_mutex_unlock (__objc_runtime_mutex);
}
   resolveMethodIMP = sarray_get_safe (class->class_pointer->dtable,
@@ -231,8 +236,94 @@ __objc_resolve_instance_method (Class class, SEL s
   return NULL;
 }
 
-/* Given a class and selector, return the selector's
-   implementation.  */
+/* Given a CLASS and selector, return the implementation corresponding
+   to the method of the selector.
+
+   If CLASS is a class, the instance method is returned.
+   If CLASS is a meta class, the class method is returned.
+
+   Since this requires the dispatch table to be installed, this function
+   will implicitly invoke +initialize for CLASS if it hasn't been
+   invoked yet.  This also insures that +initialize has been invoked
+   when the returned implementation is called directly.
+
+   The forwarding hooks require the receiver as an argument (if they are to
+   perform dynamic lookup in proxy objects etc), so this function has a
+   receiver argument to be used with those hooks.  */
+static inline
+IMP
+get_implementation (id receiver, Class class, SEL sel)
+{
+  void *res;
+
+  if (class->dtable == __objc_uninstalled_dtable)
+{
+  /* The dispatch table needs to be installed.  */
+  objc_mutex_lock (__objc_runtime_mutex);
+
+  /* Double-checked locking pattern: Check
+__objc_uninstalled_dtable again in case another thread
+installed the dtable while we were waiting for the lock
+to be released.  */
+  if (class->dtable == __objc_uninstalled_dtable)
+   {
+ __objc_install_dtable_for_class (class);
+   }
+
+  /* If the dispatch table is not yet installed,
+   we are still in the process of executing +initialize.
+   But the implementation pointer should be available
+   in the prepared dispatch table if it exists at all.  */
+  if (class->dtable == __objc_uninstalled_dtable)
+   {
+ assert (__objc_prepared_dtable_for_class (class) != 0);
+ res = __objc_get_prepared_imp (class, sel);
+   }
+  else
+   {
+ res = 0;
+   }
+  objc_mutex_unlock (__objc_runtime_mutex);
+  /* Call ourselves with the installed dispatch table and get
+the real method.  */
+  if (!res)
+   res = get_implementation (receiver, class, sel);
+}
+  else
+{
+  /* The dispatch table has been installed.  */
+  res = sarray_get_safe (class->dtable, (size_t) sel->sel_id);
+  if (res == 0)
+   {
+ /* The dispatch table has been installed, and the method
+is not in the dispatch table.  So the method just
+doesn't exist for the class.  */
+
+ /* Try going through the +resolveClassMethod: or
++resolveInstanceMethod: 

[PATCH, rs6000] Tidy up dumping of register/memory move cost

2011-05-25 Thread Pat Haugen
The following fixes a problem when dumping register costs, where the incorrect 
'from' value was being written out because the code modified the incoming 
parameter value. It also changes things so that register/memory costs are only 
dumped on the outermost call, eliminating intermediate output when a cost 
calculation requires going through memory or GPRs.
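
The idea can be sketched outside GCC as follows.  This is a toy model, not
the rs6000 code: register_cost, memory_cost, dbg_depth and dbg_lines are
names invented for the example, and the cost numbers are arbitrary.  A
shared depth counter is bumped on entry to either function, and the dump
happens only when the counter shows we are in the outermost call.

```c
#include <stdio.h>

static int dbg_depth;   /* current nesting depth of cost queries */
static int dbg_lines;   /* lines dumped so far, kept only for the demo */

/* Toy memory move cost; dumps only when called at top level.  */
static int
memory_cost (int rclass)
{
  int ret;

  dbg_depth++;
  ret = 4 + rclass;
  if (dbg_depth == 1)
    {
      fprintf (stderr, "memory_cost: ret=%d, rclass=%d\n", ret, rclass);
      dbg_lines++;
    }
  dbg_depth--;
  return ret;
}

/* Toy register move cost that goes through memory.  FROM and TO are
   never modified, so the dump reports the caller's values.  */
static int
register_cost (int from, int to)
{
  int ret;

  dbg_depth++;
  /* The nested calls see dbg_depth == 2 and stay silent.  */
  ret = memory_cost (from) + memory_cost (to);
  if (dbg_depth == 1)
    {
      fprintf (stderr, "register_cost: ret=%d, from=%d, to=%d\n",
               ret, from, to);
      dbg_lines++;
    }
  dbg_depth--;
  return ret;
}
```

A call such as register_cost (1, 2) then produces one line of output
instead of three.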


Bootstrap/regtest on powerpc64-linux with no new regressions. Ok for trunk?

-Pat


2011-05-25  Pat Haugen pthau...@us.ibm.com

* config/rs6000/rs6000.c (rs6000_register_move_cost): Preserve from
parameter value for dump. Dump cost on outermost call only.
(rs6000_memory_move_cost): Dump cost on outermost call only.

Index: gcc/config/rs6000/rs6000.c
===
--- gcc/config/rs6000/rs6000.c	(revision 174138)
+++ gcc/config/rs6000/rs6000.c	(working copy)
@@ -189,6 +189,8 @@ enum reg_class rs6000_regno_regclass[FIR
 /* Reload functions based on the type and the vector unit.  */
 static enum insn_code rs6000_vector_reload[NUM_MACHINE_MODES][2];
 
+static int dbg_cost_ctrl;
+
 /* Built in types.  */
 tree rs6000_builtin_types[RS6000_BTI_MAX];
 tree rs6000_builtin_decls[RS6000_BUILTIN_COUNT];
@@ -26428,26 +26430,31 @@ rs6000_register_move_cost (enum machine_
 {
   int ret;
 
+  if (TARGET_DEBUG_COST)
+dbg_cost_ctrl++;
+
   /*  Moves from/to GENERAL_REGS.  */
   if (reg_classes_intersect_p (to, GENERAL_REGS)
   || reg_classes_intersect_p (from, GENERAL_REGS))
 {
+  reg_class_t rclass = from;
+
   if (! reg_classes_intersect_p (to, GENERAL_REGS))
-	from = to;
+	rclass = to;
 
-  if (from == FLOAT_REGS || from == ALTIVEC_REGS || from == VSX_REGS)
-	ret = (rs6000_memory_move_cost (mode, from, false)
+  if (rclass == FLOAT_REGS || rclass == ALTIVEC_REGS || rclass == VSX_REGS)
+	ret = (rs6000_memory_move_cost (mode, rclass, false)
 	   + rs6000_memory_move_cost (mode, GENERAL_REGS, false));
 
   /* It's more expensive to move CR_REGS than CR0_REGS because of the
 	 shift.  */
-  else if (from == CR_REGS)
+  else if (rclass == CR_REGS)
 	ret = 4;
 
   /* Power6 has slower LR/CTR moves so make them more expensive than
 	 memory in order to bias spills to memory .*/
   else if (rs6000_cpu == PROCESSOR_POWER6
-	   && reg_classes_intersect_p (from, LINK_OR_CTR_REGS))
+	   && reg_classes_intersect_p (rclass, LINK_OR_CTR_REGS))
 ret = 6 * hard_regno_nregs[0][mode];
 
   else
@@ -26471,10 +26478,14 @@ rs6000_register_move_cost (enum machine_
 	   + rs6000_register_move_cost (mode, from, GENERAL_REGS));
 
   if (TARGET_DEBUG_COST)
-fprintf (stderr,
-	 "rs6000_register_move_cost:, ret=%d, mode=%s, from=%s, to=%s\n",
-	 ret, GET_MODE_NAME (mode), reg_class_names[from],
-	 reg_class_names[to]);
+{
+  if (dbg_cost_ctrl == 1)
+	fprintf (stderr,
+		 "rs6000_register_move_cost:, ret=%d, mode=%s, from=%s, to=%s\n",
+		 ret, GET_MODE_NAME (mode), reg_class_names[from],
+		 reg_class_names[to]);
+  dbg_cost_ctrl--;
+}
 
   return ret;
 }
@@ -26488,6 +26499,9 @@ rs6000_memory_move_cost (enum machine_mo
 {
   int ret;
 
+  if (TARGET_DEBUG_COST)
+dbg_cost_ctrl++;
+
   if (reg_classes_intersect_p (rclass, GENERAL_REGS))
 ret = 4 * hard_regno_nregs[0][mode];
   else if (reg_classes_intersect_p (rclass, FLOAT_REGS))
@@ -26498,9 +26512,13 @@ rs6000_memory_move_cost (enum machine_mo
 ret = 4 + rs6000_register_move_cost (mode, rclass, GENERAL_REGS);
 
   if (TARGET_DEBUG_COST)
-fprintf (stderr,
-	 "rs6000_memory_move_cost: ret=%d, mode=%s, rclass=%s, in=%d\n",
-	 ret, GET_MODE_NAME (mode), reg_class_names[rclass], in);
+{
+  if (dbg_cost_ctrl == 1)
+	fprintf (stderr,
+		 "rs6000_memory_move_cost: ret=%d, mode=%s, rclass=%s, in=%d\n",
+		 ret, GET_MODE_NAME (mode), reg_class_names[rclass], in);
+  dbg_cost_ctrl--;
+}
 
   return ret;
 }


Re: PATCH: PR target/49142: Invalid 8bit register operand

2011-05-25 Thread Uros Bizjak
On Wed, May 25, 2011 at 6:20 PM, H.J. Lu hjl.to...@gmail.com wrote:

 We are working on a new optimization, which turns off TARGET_MOVX.
 GCC generates:

 movb %ah, %dil

 But %ah can only be used with %[abcd][hl].  This patch adds QIreg_operand
 and uses it in *movqi_extv_1_rex64/*movqi_extzv_2_rex64.  OK for trunk
 if there is no regression? and Replace
       q_regs_operand with QIreg_operand.
       (

 If this is the case, then please change q_regs_operand predicate to
 accept just QI_REG_P registers.


 I thought about it.  It is a problem only with %[abcd]h.  I am not sure if
 changing q_regs_operand to  accept just QI_REG_P registers will negatively
 impact

 I see. The patch is OK then, but for consistency, please change the
 predicate of *movqi_extv_1/*movqi_extzv_2 as well. Oh, and the
 register_operand check in type calculation can be removed.

 Thanks,
 Uros.


 This is what I checked in.

 Thanks.

 --
 H.J.
 ---
 2011-05-25  H.J. Lu  hongjiu...@intel.com

        PR target/49142
        * config/i386/i386.md (*movqi_extv_1_rex64): Remove
        register_operand check and replace q_regs_operand with
        QIreg_operand in type calculation.
        (*movqi_extv_1): Likewise.
        (*movqi_extzv_2_rex64): Likewise.
        (*movqi_extzv_2): Likewise.

Er, I didn't mean to remove register_operand check from 32bit
patterns... there, operand 0 can also be memory operand due to
nonimmediate_operand constraint.

Uros.


Re: PATCH: Add pause intrinsic

2011-05-25 Thread Basile Starynkevitch
On Wed, 25 May 2011 11:26:51 +0100
Andrew Haley a...@redhat.com wrote:

 On 05/24/2011 07:28 PM, H.J. Lu wrote:
 
  This patch implements pause intrinsic suggested by Andi.  OK
  for trunk?
 
 What does full memory barrier here mean?
 
 +@table @code
 +@item void __builtin_ia32_pause (void)
 +Generates the @code{pause} machine instruction with full memory barrier.
 +@end table
 
 There a memory clobber, but no barrier instruction AFAICS.  The
 doc needs to explain it a bit better.

Perhaps the doc might explain why it is necessary to have a builtin for
two independent roles: first, the full compiler memory barrier (which
probably means to spill all the registers on the stack - definitely a
task for a compiler); second, to pause the processor (which might
also mean to flush or invalidate some data caches). In particular, I
would naively imagine that we might have a more generic builtin for the
compiler memory barrier (which probably could be independent of the
particular ia32 target), and in that case why can't we just implement
the pause ia32 builtin as builtin_compiler_barrier(); asm ("pause")?

I find the above documentation too short and (being a non-native
English speaker) I would prefer it to be much longer. I am not able to
suggest better phrasing (because I still do not entirely understand
why that builtin_ia32_pause is useful or needed).

And if there was a builtin_compiler_barrier () I would believe it can
have a lot of other uses. Any generated C code which wants some
introspection or some garbage collection write barrier might want it
too! [perhaps even I might find later such thing useful in C code
generated by MELT]
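
A minimal sketch of the split suggested above, assuming GCC-style inline
asm.  The names compiler_barrier, cpu_pause, pause_with_barrier and
spin_down are hypothetical helpers made up for illustration; none of them
is an existing builtin.  The "memory" clobber only tells the compiler
that memory may have changed, so values cached from memory must be
reloaded; it emits no fence instruction.

```cpp
#include <cassert>

// Hypothetical helper (not a GCC builtin): a compiler-only barrier.
// The empty asm emits no instruction; the "memory" clobber stops the
// compiler from keeping memory values in registers across this point
// or moving memory accesses over it.
static inline void compiler_barrier()
{
  __asm__ __volatile__ ("" : : : "memory");
}

// Hypothetical helper: the x86 spin-wait hint, with no clobber at all.
// On other targets it expands to nothing.
static inline void cpu_pause()
{
#if defined(__i386__) || defined(__x86_64__)
  __asm__ __volatile__ ("pause");
#endif
}

// The composition asked about in the message: barrier, then pause.
static inline void pause_with_barrier()
{
  compiler_barrier();
  cpu_pause();
}

// Counts a plain (non-volatile) counter down to zero, pausing on each
// step.  Returns the number of iterations performed.
int spin_down(int *counter)
{
  assert(counter != nullptr);
  int spins = 0;
  while (*counter > 0)
    {
      --*counter;
      pause_with_barrier();
      ++spins;
    }
  return spins;
}
```

Keeping the two roles in separate helpers makes it visible which part is
target independent (the barrier) and which part is ia32 specific (the
pause instruction).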

Regards.

-- 
Basile STARYNKEVITCH http://starynkevitch.net/Basile/
email: basileatstarynkevitchdotnet mobile: +33 6 8501 2359
8, rue de la Faiencerie, 92340 Bourg La Reine, France
*** opinions {are only mine, sont seulement les miennes} ***


Re: Create common hooks structure shared between driver and cc1

2011-05-25 Thread Joseph S. Myers
Here is a revised version of my patch
http://gcc.gnu.org/ml/gcc-patches/2011-05/msg01779.html to create
the common hooks structure.  Tested in the same way as the original
patch.  OK to commit?

In the course of working on moving hooks to the new structure, I found
that every target architecture except for moxie has at least one of
the hooks that will move.  I didn't want to create 35 separate
makefile rules with manually maintained dependencies to build the
associated new source files for those hooks (with, probably, many new
t-* files to contain those rules and dependencies).  So this patch
makes $arch-common.o follow a similar approach to $arch.o: have a
single common_out_file and common_out_object_file instead of the
previous more general extra_common_objs, and have a shared makefile
rule, with a standard set of dependencies, in gcc/Makefile.in.  (This
means the patch no longer touches pa/t-pa, which is again an unused
file along with i386/t-crtpic and i386/t-svr3dbx; note that
$arch/t-$arch is *only* automatically used if config.gcc leaves
tmake_file completely empty, otherwise config.gcc needs to add such a
file explicitly to tmake_file to cause it to be used.)

There are actually two separate things I wanted to avoid replicating
35-fold: manually maintained rules, and manually maintained
dependencies.  For the latter, I looked again at Tom's reverted patch
from March 2008 to use automatic dependency generation.  Although
there is now a fixed GNU make release (since last July), and although
I'd like to see automatic dependency generation go in (I hope Paolo
may follow up on it as per
http://gcc.gnu.org/ml/gcc-patches/2008-03/msg01721.html), actually
getting it working looks like a potential rathole and I couldn't
figure out from the 2008 thread what the makefile feature was that
caused problems with a GNU make bug and whether it would be possible
to disable use of that feature (instead having stupid dependencies of
all objects on all headers) when using older versions of make (which
would I think be desirable, to avoid forcing everyone to upgrade make
immediately).

For the former (avoided in this patch by having one new rule instead
of 35), as far as I can see the main reason extra rules are involved
for target-specific files is just to build them directly in the gcc/
object directory, instead of with a .o name matching the path to the
source directory ($objdir/config/i386/i386.o, etc.).  So most of them
could probably be eliminated (leaving only dependencies) by putting
all objects in subdirectories (and making configure.ac create all
those directories).  Some involve extra compiler options, which the
dependencies patch dealt with by using GNU make variable settings
applying to individual makefile targets, and I suppose it might be
possible to use that feature separately from automatic dependency
generation - though the way those settings apply to dependencies of
the target being built could cause problems, so maybe we should
instead put $($@-CPPFLAGS) or similar in CPPFLAGS - though if it were
that simple to avoid the replication of compilation commands, I'd
expect this to be done already.

2011-05-25  Joseph Myers  jos...@codesourcery.com

* common/common-target-def.h, common/common-target.def,
common/common-target.h, common/config/default-common.c,
common/config/pa/pa-common.c: New files.
* Makefile.in (common_out_file, common_out_object_file,
COMMON_TARGET_H, COMMON_TARGET_DEF_H): New.
(OBJS-libcommon-target): Include $(common_out_object_file).
(prefix.o): Update dependencies.
($(common_out_object_file), common/common-target-hooks-def.h,
s-common-target-hooks-def-h): New.
(s-tm-texi): Also check timestamp on common-target.def.
(build/genhooks.o): Update dependencies.
* config.gcc (common_out_file, target_has_targetm_common): Define.
* config/pa/som.h (ALWAYS_STRIP_DOTDOT): Replace with
TARGET_ALWAYS_STRIP_DOTDOT.
* configure.ac (common_out_object_file): Define.
(common_out_file, common_out_object_file): Substitute.
(common): Create directory.
* configure: Regenerate.
* doc/tm.texi.in (targetm_common): Document.
(TARGET_ALWAYS_STRIP_DOTDOT): Add @hook entry.
* doc/tm.texi: Regenerate.
* genhooks.c (hook_array): Also include common/common-target.def.
* prefix.c (tm.h): Don't include.
(common/common-target.h): Include.
(ALWAYS_STRIP_DOTDOT): Don't define.
(update_path): Use targetm_common.always_strip_dotdot instead of
ALWAYS_STRIP_DOTDOT.
* system.h (ALWAYS_STRIP_DOTDOT): Poison.

Index: gcc/doc/tm.texi
===
--- gcc/doc/tm.texi (revision 174109)
+++ gcc/doc/tm.texi (working copy)
@@ -99,6 +99,16 @@ initializer @code{TARGETCM_INITIALIZER} 
 themselves, they should set @code{target_has_targetcm=yes} in
 

Re: [PATCH][4.6] detect C++ errors to fix 2288 and 18770

2011-05-25 Thread Nathan Froyd
On Sun, May 22, 2011 at 03:25:41PM -0700, H.J. Lu wrote:
 FWIW, I tried Janis's patch on 4.6 branch and I got
 
 /export/gnu/import/git/gcc/gcc/testsuite/g++.dg/parse/pr18770.C: In
 function 'void e1()':^M
 /export/gnu/import/git/gcc/gcc/testsuite/g++.dg/parse/pr18770.C:29:11:
 error: redeclaration of 'int k'^M
 /export/gnu/import/git/gcc/gcc/testsuite/g++.dg/parse/pr18770.C:27:12:
 error: 'int k' previously declared here^M
 /export/gnu/import/git/gcc/gcc/testsuite/g++.dg/parse/pr18770.C: In
 function 'void e4()':^M
 /export/gnu/import/git/gcc/gcc/testsuite/g++.dg/parse/pr18770.C:63:11:
 error: redeclaration of 'int i'^M
 /export/gnu/import/git/gcc/gcc/testsuite/g++.dg/parse/pr18770.C:61:14:
 error: 'int i' previously declared here^M
 
 FAIL: g++.dg/parse/pr18770.C prev (test for errors, line 14)
 FAIL: g++.dg/parse/pr18770.C redecl (test for errors, line 17)
 PASS: g++.dg/parse/pr18770.C prev (test for errors, line 27)
 PASS: g++.dg/parse/pr18770.C redecl (test for errors, line 29)
 FAIL: g++.dg/parse/pr18770.C prev (test for errors, line 37)
 FAIL: g++.dg/parse/pr18770.C redecl (test for errors, line 39)
 FAIL: g++.dg/parse/pr18770.C prev (test for errors, line 47)
 FAIL: g++.dg/parse/pr18770.C redecl (test for errors, line 53)
 PASS: g++.dg/parse/pr18770.C prev (test for errors, line 61)
 PASS: g++.dg/parse/pr18770.C redecl (test for errors, line 63)
 FAIL: g++.dg/parse/pr18770.C prev (test for errors, line 71)
 FAIL: g++.dg/parse/pr18770.C redecl (test for errors, line 73)
 PASS: g++.dg/parse/pr18770.C (test for excess errors)
 
 /export/gnu/import/git/gcc/gcc/testsuite/g++.old-deja/g++.jason/cond.C:
 In function 'int main()':^M
 /export/gnu/import/git/gcc/gcc/testsuite/g++.old-deja/g++.jason/cond.C:22:11:
 error: redeclaration of 'int i'^M
 /export/gnu/import/git/gcc/gcc/testsuite/g++.old-deja/g++.jason/cond.C:20:14:
 error: 'int i' previously declared here^M
 /export/gnu/import/git/gcc/gcc/testsuite/g++.old-deja/g++.jason/cond.C:27:11:
 error: redeclaration of 'int i'^M
 /export/gnu/import/git/gcc/gcc/testsuite/g++.old-deja/g++.jason/cond.C:25:14:
 error: 'int i' previously declared here^M
 /export/gnu/import/git/gcc/gcc/testsuite/g++.old-deja/g++.jason/cond.C:36:16:
 error: types may not be defined in conditions^M
 /export/gnu/import/git/gcc/gcc/testsuite/g++.old-deja/g++.jason/cond.C:39:3:
 error: 'A' was not declared in this scope^M
 /export/gnu/import/git/gcc/gcc/testsuite/g++.old-deja/g++.jason/cond.C:39:5:
 error: expected ';' before 'bar'^M
 /export/gnu/import/git/gcc/gcc/testsuite/g++.old-deja/g++.jason/cond.C:42:12:
 error: types may not be defined in conditions^M
 /export/gnu/import/git/gcc/gcc/testsuite/g++.old-deja/g++.jason/cond.C:42:40:
 error: 'one' was not declared in this scope^M
 /export/gnu/import/git/gcc/gcc/testsuite/g++.old-deja/g++.jason/cond.C:51:14:
 warning: declaration of 'int f()' has 'extern' and is initialized
 [enabled by default]^M
 /export/gnu/import/git/gcc/gcc/testsuite/g++.old-deja/g++.jason/cond.C:51:18:
 error: function 'int f()' is initialized like a variable^M
 /export/gnu/import/git/gcc/gcc/testsuite/g++.old-deja/g++.jason/cond.C:55:23:
 error: extended initializer lists only available with -std=c++0x or
 -std=gnu++0x^M
 
 FAIL: g++.old-deja/g++.jason/cond.C  (test for errors, line 9)
 FAIL: g++.old-deja/g++.jason/cond.C  (test for errors, line 11)
 FAIL: g++.old-deja/g++.jason/cond.C  (test for errors, line 16)
 PASS: g++.old-deja/g++.jason/cond.C  (test for errors, line 20)
 PASS: g++.old-deja/g++.jason/cond.C  (test for errors, line 22)
 PASS: g++.old-deja/g++.jason/cond.C  (test for errors, line 25)
 PASS: g++.old-deja/g++.jason/cond.C  (test for errors, line 27)
 FAIL: g++.old-deja/g++.jason/cond.C  (test for errors, line 30)
 FAIL: g++.old-deja/g++.jason/cond.C  (test for errors, line 33)
 PASS: g++.old-deja/g++.jason/cond.C  (test for errors, line 36)
 PASS: g++.old-deja/g++.jason/cond.C decl (test for errors, line 39)
 PASS: g++.old-deja/g++.jason/cond.C exp (test for errors, line 39)
 PASS: g++.old-deja/g++.jason/cond.C def (test for errors, line 42)
 PASS: g++.old-deja/g++.jason/cond.C expected (test for errors, line 42)
 PASS: g++.old-deja/g++.jason/cond.C extern (test for warnings, line 51)
 
 The patch no longer catches all problems.

The patch just requires some shuffling of logic to catch issues now;
below is a version that works for me on the trunk.

This new checking does require modifying g++.dg/cpp0x/range-for5.C.  The
new logic of the patch claims that:

  int i;
  for (int i : a)
  {
    int i;
  }

is incorrect (the innermost `i' is an erroneous redeclaration).  If you
apply the expansion of range-based for loops from [stmt.ranged]p1, you'd
get something like:

  for (...; ...; ...)
  {
    int i = ...;
    int i;
  }

which is bad.  I believe [basic.scope.local]p4 says much the same thing.
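
A small sketch of the distinction, under the assumption that the rule is
applied as described above: the loop variable may hide an outer `i', and
a nested block may hide the loop variable, but the outermost block of the
loop body may not redeclare it.  The function name sum_with_hiding is
made up for this illustration, and the rejected declaration is left as a
comment so that the example still compiles.

```cpp
#include <cassert>

// Sums a three-element array, using the name `i' at three scope levels.
int sum_with_hiding(const int (&a)[3])
{
  int i = 100;                  // outer i
  int total = 0;
  for (int i : a)               // OK: hides the outer i
    {
      // int i;                 // rejected: same scope as the loop variable
      {
        int i = 1;              // OK: a nested block is a new scope
        total += i;
      }
      total += i;               // the loop variable again
    }
  return total + i;             // the outer i again
}
```

For the array {1, 2, 3} each iteration adds 1 plus the element, and the
outer `i' adds 100 at the end.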

Tested with g++ testsuite on x86_64-unknown-linux-gnu; tests in progress
for libstdc++.  OK to commit?

-Nathan

gcc/cp/
2011-xx-xx  Janis 

Re: PATCH: PR target/49142: Invalid 8bit register operand

2011-05-25 Thread H.J. Lu
On Wed, May 25, 2011 at 12:11 PM, Uros Bizjak ubiz...@gmail.com wrote:
 On Wed, May 25, 2011 at 6:20 PM, H.J. Lu hjl.to...@gmail.com wrote:

 We are working on a new optimization, which turns off TARGET_MOVX.
 GCC generates:

 movb %ah, %dil

 But %ah can only be used with %[abcd][hl].  This patch adds QIreg_operand
 and uses it in *movqi_extv_1_rex64/*movqi_extzv_2_rex64.  OK for trunk
 if there is no regression? and Replace
       q_regs_operand with QIreg_operand.
       (

 If this is the case, then please change q_regs_operand predicate to
 accept just QI_REG_P registers.


 I thought about it.  It is a problem only with %[abcd]h.  I am not sure if
 changing q_regs_operand to  accept just QI_REG_P registers will negatively
 impact

 I see. The patch is OK then, but for consistency, please change the
  predicate of *movqi_extv_1/*movqi_extzv_2 as well. Oh, and the
 register_operand check in type calculation can be removed.

 Thanks,
 Uros.


 This is what I checked in.

 Thanks.

 --
 H.J.
 ---
 2011-05-25  H.J. Lu  hongjiu...@intel.com

        PR target/49142
        * config/i386/i386.md (*movqi_extv_1_rex64): Remove
        register_operand check and replace q_regs_operand with
        QIreg_operand in type calculation.
        (*movqi_extv_1): Likewise.
        (*movqi_extzv_2_rex64): Likewise.
        (*movqi_extzv_2): Likewise.

 Er, I didn't mean to remove register_operand check from 32bit
 patterns... there, operand 0 can also be memory operand due to
 nonimmediate_operand constraint.

Ooops.  I am checking in this.

Thanks.

-- 
H.J.
diff --git a/gcc/ChangeLog b/gcc/ChangeLog
index ed1834f..1afef8e 100644
--- a/gcc/ChangeLog
+++ b/gcc/ChangeLog
@@ -1,5 +1,11 @@
 2011-05-25  H.J. Lu  hongjiu...@intel.com

+   * config/i386/i386.md (*movqi_extv_1): Put back
+   register_operand check in type calculation.
+   (*movqi_extzv_2): Likewise.
+
+2011-05-25  H.J. Lu  hongjiu...@intel.com
+
* doc/extend.texi (X86 Built-in Functions): Update pause
intrinsic.

diff --git a/gcc/config/i386/i386.md b/gcc/config/i386/i386.md
index 1cdbe7e..13a1cde 100644
--- a/gcc/config/i386/i386.md
+++ b/gcc/config/i386/i386.md
@@ -2514,9 +2514,10 @@
 }
 }
   [(set (attr "type")
- (if_then_else (ior (not (match_operand:QI 0 "QIreg_operand" ""))
-   (ne (symbol_ref "TARGET_MOVX")
-   (const_int 0)))
+ (if_then_else (and (match_operand:QI 0 "register_operand" "")
+   (ior (not (match_operand:QI 0 "QIreg_operand" ""))
+(ne (symbol_ref "TARGET_MOVX")
+(const_int 0))))
(const_string "imovx")
(const_string "imov")))
(set (attr "mode")
@@ -2578,9 +2579,10 @@
 }
 }
   [(set (attr "type")
- (if_then_else (ior (not (match_operand:QI 0 "QIreg_operand" ""))
-   (ne (symbol_ref "TARGET_MOVX")
-   (const_int 0)))
+ (if_then_else (and (match_operand:QI 0 "register_operand" "")
+   (ior (not (match_operand:QI 0 "QIreg_operand" ""))
+(ne (symbol_ref "TARGET_MOVX")
+(const_int 0))))
(const_string "imovx")
(const_string "imov")))
(set (attr "mode")


Re: PATCH: Add pause intrinsic

2011-05-25 Thread H.J. Lu
On Wed, May 25, 2011 at 12:17 PM, Basile Starynkevitch
bas...@starynkevitch.net wrote:
 On Wed, 25 May 2011 11:26:51 +0100
 Andrew Haley a...@redhat.com wrote:

 On 05/24/2011 07:28 PM, H.J. Lu wrote:

  This patch implements pause intrinsic suggested by Andi.  OK
  for trunk?

 What does full memory barrier here mean?

 +@table @code
 +@item void __builtin_ia32_pause (void)
 +Generates the @code{pause} machine instruction with full memory barrier.
 +@end table

 There's a memory clobber, but no barrier instruction AFAICS.  The
 doc needs to explain it a bit better.

 Perhaps the doc might explain why it is necessary to have a builtin for
 two independent roles: first, the full compiler memory barrier (which
 probably means to spill all the registers on the stack - definitely a
 task for a compiler); second, to pause the processor (which might
 also mean to flush or invalidate some data caches). In particular, I
 would naively imagine that we might have a more generic builtin for the
 compiler memory barrier (which probably could be independent of the
 particular ia32 target), and in that case why can't we just implement
 the pause ia32 builtin as builtin_compiler_barrier(); asm ("pause")?


We may need

  builtin_compiler_barrier();
  asm ("pause");
  builtin_compiler_barrier();



-- 
H.J.


C++ PATCH for c++/46696 (error with defaulted op= and arrays)

2011-05-25 Thread Jason Merrill
Another case where we now need to check DECL_DEFAULTED_FN rather than 
DECL_ARTIFICIAL.


Tested x86_64-pc-linux-gnu, applying to trunk.
commit 3ac89bd9f5f81b4d3ff293b337e7e9163d3402dd
Author: Jason Merrill ja...@redhat.com
Date:   Wed May 25 12:05:03 2011 -0400

	PR c++/46696
	* typeck.c (cp_build_modify_expr): Check DECL_DEFAULTED_FN.

diff --git a/gcc/cp/typeck.c b/gcc/cp/typeck.c
index 69b25d3..5fbb765 100644
--- a/gcc/cp/typeck.c
+++ b/gcc/cp/typeck.c
@@ -6748,7 +6748,7 @@ cp_build_modify_expr (tree lhs, enum tree_code modifycode, tree rhs,
 
   /* Allow array assignment in compiler-generated code.  */
   else if (!current_function_decl
-	   || !DECL_ARTIFICIAL (current_function_decl))
+	   || !DECL_DEFAULTED_FN (current_function_decl))
 	{
   /* This routine is used for both initialization and assignment.
  Make sure the diagnostic message differentiates the context.  */
diff --git a/gcc/testsuite/g++.dg/cpp0x/defaulted29.C b/gcc/testsuite/g++.dg/cpp0x/defaulted29.C
new file mode 100644
index 000..5fcf5b0
--- /dev/null
+++ b/gcc/testsuite/g++.dg/cpp0x/defaulted29.C
@@ -0,0 +1,20 @@
+// PR c++/46696
+// { dg-options -std=c++0x }
+
+struct A
+{
+  A& operator= (A const&);
+};
+
+struct B
+{
+  A ar[1];
+  B& operator= (B const&) = default;
+};
+
+int main()
+{
+  B x;
+  B y;
+  y = x;
+}


C++ PATCH for c++/47184 (list-initialized temporary in parenthesized initializer)

2011-05-25 Thread Jason Merrill

cp_parser_parameter_declaration is clever enough to tell that when we see

Type1 id(Type2

if the next token doesn't indicate a cast, we're dealing with a function 
declarator.  But it was only checking for '('; now it needs to check for 
'{' as well.


After making that fix, I needed to change cp_parser_direct_declarator to 
not assume that we successfully parsed a parameter list until we see the 
closing ')'.


Tested x86_64-pc-linux-gnu, applying to trunk.
commit 365eff32e0004b7e3ac0794a2fbb5d6585f4b4d7
Author: Jason Merrill ja...@redhat.com
Date:   Wed May 25 11:44:48 2011 -0400

	PR c++/47184
	* parser.c (cp_parser_parameter_declaration): Recognize
	list-initialization.

diff --git a/gcc/cp/parser.c b/gcc/cp/parser.c
index db2cb96..004ff05 100644
--- a/gcc/cp/parser.c
+++ b/gcc/cp/parser.c
@@ -14901,6 +14901,9 @@ cp_parser_direct_declarator (cp_parser* parser,
 	  parser->num_template_parameter_lists
 		= saved_num_template_parameter_lists;
 
+	  /* Consume the `)'.  */
+	  cp_parser_require (parser, CPP_CLOSE_PAREN, RT_CLOSE_PAREN);
+
 	  /* If all went well, parse the cv-qualifier-seq and the
 		 exception-specification.  */
 	  if (member_p || cp_parser_parse_definitely (parser))
@@ -14915,8 +14918,6 @@ cp_parser_direct_declarator (cp_parser* parser,
 		  if (ctor_dtor_or_conv_p)
 		*ctor_dtor_or_conv_p = *ctor_dtor_or_conv_p < 0;
 		  first = false;
-		  /* Consume the `)'.  */
-		  cp_parser_require (parser, CPP_CLOSE_PAREN, RT_CLOSE_PAREN);
 
 		  /* Parse the cv-qualifier-seq.  */
 		  cv_quals = cp_parser_cv_qualifier_seq_opt (parser);
@@ -16053,6 +16054,7 @@ cp_parser_parameter_declaration (cp_parser *parser,
 	 of some object of type char to int.  */
 	  && !parser->in_type_id_in_expr_p
 	  && cp_parser_uncommitted_to_tentative_parse_p (parser)
+	  && cp_lexer_next_token_is_not (parser->lexer, CPP_OPEN_BRACE)
 	  && cp_lexer_next_token_is_not (parser->lexer, CPP_OPEN_PAREN))
 	cp_parser_commit_to_tentative_parse (parser);
   /* Parse the declarator.  */
diff --git a/gcc/testsuite/g++.dg/cpp0x/initlist51.C b/gcc/testsuite/g++.dg/cpp0x/initlist51.C
new file mode 100644
index 000..9163dd3
--- /dev/null
+++ b/gcc/testsuite/g++.dg/cpp0x/initlist51.C
@@ -0,0 +1,15 @@
+// PR c++/47184
+// { dg-options -std=c++0x }
+
+struct S
+{
+  int a;
+};
+struct T
+{
+  T(S s) {}
+};
+int main()
+{
+  T t(S{1});
+}


C++ PATCHes for c++/46245 and c++/46145 (auto issues)

2011-05-25 Thread Jason Merrill
In 46245, we were complaining too soon about an auto parameter; we need 
to wait until after we splice in a late-specified return type.  In 
46145, we were failing to complain about an auto typedef.
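
As a hedged illustration of why the diagnostic has to wait: with a
late-specified return type, `auto' is only a placeholder until the type
after `->' has been spliced in, so complaining about `auto' before that
point is premature.  The functions add and sum below are made up for this
sketch; the constructs the patches diagnose are shown only as comments,
since they are rejected in C++0x mode.

```cpp
#include <cassert>

// Late-specified return type: the declared type is int, not `auto'.
auto add (int a, int b) -> int
{
  return a + b;
}

// The trailing form can refer to the parameters, which is one reason
// the real type is only known after the parameter list has been seen.
template <typename T, typename U>
auto sum (T t, U u) -> decltype (t + u)
{
  return t + u;
}

// Rejected in C++0x mode:
//   typedef auto autot;        // PR c++/46145: typedef declared 'auto'
//   void qq (auto);            // parameter declared 'auto'
```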


Tested x86_64-pc-linux-gnu, applying to trunk.
commit 0ca632627d749d168b602675ca48df9e88a1eac5
Author: Jason Merrill ja...@redhat.com
Date:   Wed May 25 13:03:13 2011 -0400

	PR c++/46145
	* decl.c (grokdeclarator): Complain about auto typedef.

diff --git a/gcc/cp/decl.c b/gcc/cp/decl.c
index 68dc999..db52184 100644
--- a/gcc/cp/decl.c
+++ b/gcc/cp/decl.c
@@ -9503,6 +9503,12 @@ grokdeclarator (const cp_declarator *declarator,
   memfn_quals = TYPE_UNQUALIFIED;
 }
 
+  if (type_uses_auto (type))
+	{
+	  error ("typedef declared %<auto%>");
+	  type = error_mark_node;
+	}
+
   if (decl_context == FIELD)
 	decl = build_lang_decl (TYPE_DECL, unqualified_id, type);
   else
diff --git a/gcc/testsuite/g++.dg/cpp0x/auto9.C b/gcc/testsuite/g++.dg/cpp0x/auto9.C
index 142ef90..190bfa6 100644
--- a/gcc/testsuite/g++.dg/cpp0x/auto9.C
+++ b/gcc/testsuite/g++.dg/cpp0x/auto9.C
@@ -119,3 +119,6 @@ Hauto h;	// { dg-error invalid }
 
 void qq (auto);			// { dg-error auto }
 void qr (auto*);		// { dg-error auto }
+
+// PR c++/46145
+typedef auto autot;		// { dg-error auto }
commit 2ab4982d07fd89b0a7bc42868aa655173a132af7
Author: Jason Merrill ja...@redhat.com
Date:   Wed May 25 12:22:13 2011 -0400

	PR c++/46245
	* decl.c (grokdeclarator): Complain later for auto parameter.
	* pt.c (splice_late_return_type): Handle use in a template
	type-parameter.

diff --git a/gcc/cp/decl.c b/gcc/cp/decl.c
index cc09c1d..68dc999 100644
--- a/gcc/cp/decl.c
+++ b/gcc/cp/decl.c
@@ -8763,12 +8763,6 @@ grokdeclarator (const cp_declarator *declarator,
 	   || thread_p)
 	error ("storage class specifiers invalid in parameter declarations");
 
-  if (type_uses_auto (type))
-	{
-	  error ("parameter declared %<auto%>");
-	  type = error_mark_node;
-	}
-
   /* Function parameters cannot be constexpr.  If we saw one, moan
  and pretend it wasn't there.  */
   if (constexpr_p)
@@ -9749,6 +9743,12 @@ grokdeclarator (const cp_declarator *declarator,
   if (ctype || in_namespace)
 	error ("cannot use %<::%> in parameter declaration");
 
+  if (type_uses_auto (type))
+	{
+	  error ("parameter declared %<auto%>");
+	  type = error_mark_node;
+	}
+
   /* A parameter declared as an array of T is really a pointer to T.
 	 One declared as a function is really a pointer to a function.
 	 One declared as a member is really a pointer to member.  */
diff --git a/gcc/cp/pt.c b/gcc/cp/pt.c
index bb4515b..c3c759e 100644
--- a/gcc/cp/pt.c
+++ b/gcc/cp/pt.c
@@ -19315,7 +19315,12 @@ splice_late_return_type (tree type, tree late_return_type)
 return type;
   argvec = make_tree_vec (1);
   TREE_VEC_ELT (argvec, 0) = late_return_type;
-  if (processing_template_decl)
+  if (processing_template_parmlist)
+/* For a late-specified return type in a template type-parameter, we
+   need to add a dummy argument level for its parmlist.  */
+argvec = add_to_template_args
+  (make_tree_vec (processing_template_parmlist), argvec);
+  if (current_template_parms)
 argvec = add_to_template_args (current_template_args (), argvec);
   return tsubst (type, argvec, tf_warning_or_error, NULL_TREE);
 }
diff --git a/gcc/testsuite/g++.dg/cpp0x/auto23.C b/gcc/testsuite/g++.dg/cpp0x/auto23.C
new file mode 100644
index 000..49b5a0e
--- /dev/null
+++ b/gcc/testsuite/g++.dg/cpp0x/auto23.C
@@ -0,0 +1,4 @@
+// PR c++/46245
+// { dg-options -std=c++0x }
+
+template<auto f()->int> struct A { };
diff --git a/gcc/testsuite/g++.dg/cpp0x/auto9.C b/gcc/testsuite/g++.dg/cpp0x/auto9.C
index ab90be5..142ef90 100644
--- a/gcc/testsuite/g++.dg/cpp0x/auto9.C
+++ b/gcc/testsuite/g++.dg/cpp0x/auto9.C
@@ -79,10 +79,10 @@ enum struct D : auto * { FF = 0 };		// { dg-error must be an integral type|decl
 void
 bar ()
 {
-  try { } catch (auto i) { }			// { dg-error "invalid use of" }
-  try { } catch (auto) { }			// { dg-error "invalid use of" }
-  try { } catch (auto *i) { }			// { dg-error "invalid use of" }
-  try { } catch (auto *) { }			// { dg-error "invalid use of" }
+  try { } catch (auto i) { }			// { dg-error "parameter declared" }
+  try { } catch (auto) { }			// { dg-error "parameter declared" }
+  try { } catch (auto *i) { }			// { dg-error "parameter declared" }
+  try { } catch (auto *) { }			// { dg-error "parameter declared" }
 }
 
 void


C++ PATCH for c++/45698 (crash with variadics)

2011-05-25 Thread Jason Merrill
45698 was actually fixed in 4.5.0, but before I closed it I checked to 
see how the testcase was doing with the current compiler, and found that 
it was crashing again.  This turned out to be because of Nathan's recent 
tree-slimming work; ARGUMENT_PACK_SELECT doesn't have TREE_TYPE, so we 
crash when we try to look at it in value_dependent_expression_p.  But we 
shouldn't be treating it as an expression in the first place, since it 
could be either a type or value argument.


Fixed by looking through ARGUMENT_PACK_SELECT before we decide what sort 
of template argument we're dealing with.


While looking at this, I also noticed that print_node expects everything 
to have TREE_TYPE, which is no longer correct.  And I made print_node 
more useful for ARGUMENT_PACK_SELECT.


Tested x86_64-pc-linux-gnu, applying to trunk.
commit 0b5532a57ea85765d6baed5eff0abaaabac1aaaf
Author: Jason Merrill ja...@redhat.com
Date:   Wed May 25 13:24:47 2011 -0400

	PR c++/45698
	* pt.c (dependent_template_arg_p): See through ARGUMENT_PACK_SELECT.

diff --git a/gcc/cp/pt.c b/gcc/cp/pt.c
index c3c759e..c9c25cd 100644
--- a/gcc/cp/pt.c
+++ b/gcc/cp/pt.c
@@ -18759,6 +18759,9 @@ dependent_template_arg_p (tree arg)
   if (arg == error_mark_node)
 return true;
 
+  if (TREE_CODE (arg) == ARGUMENT_PACK_SELECT)
+arg = ARGUMENT_PACK_SELECT_ARG (arg);
+
   if (TREE_CODE (arg) == TEMPLATE_DECL
   || TREE_CODE (arg) == TEMPLATE_TEMPLATE_PARM)
 return dependent_template_p (arg);
diff --git a/gcc/testsuite/g++.dg/cpp0x/variadic110.C b/gcc/testsuite/g++.dg/cpp0x/variadic110.C
new file mode 100644
index 000..86f1bb1
--- /dev/null
+++ b/gcc/testsuite/g++.dg/cpp0x/variadic110.C
@@ -0,0 +1,15 @@
+// PR c++/45698
+// { dg-options -std=c++0x }
+
+template <class... Ts> struct tuple { };
+
+template<class... Ts>
+struct A {
+  template<typename T> struct N { };
+  tuple<N<Ts>...> tup;
+};
+
+int main()
+{
+  A<int, double> a;
+}
commit 46cccd60afea40407a278f6d937373e0121c24ee
Author: Jason Merrill ja...@redhat.com
Date:   Wed May 25 13:32:05 2011 -0400

	* print-tree.c (print_node): Only look at TREE_TYPE if TS_TYPED.
	* cp/ptree.c (cxx_print_xnode): Handle ARGUMENT_PACK_SELECT.

diff --git a/gcc/cp/ptree.c b/gcc/cp/ptree.c
index a4c3ed5..5c9626e 100644
--- a/gcc/cp/ptree.c
+++ b/gcc/cp/ptree.c
@@ -221,6 +221,12 @@ cxx_print_xnode (FILE *file, tree node, int indent)
 	  fprintf (file, "pending_template");
 	}
   break;
+case ARGUMENT_PACK_SELECT:
+  print_node (file, "pack", ARGUMENT_PACK_SELECT_FROM_PACK (node),
+		  indent+4);
+  indent_to (file, indent + 3);
+  fprintf (file, "index %d", ARGUMENT_PACK_SELECT_INDEX (node));
+  break;
 default:
   break;
 }
diff --git a/gcc/print-tree.c b/gcc/print-tree.c
index 3b5edeb..58c9613 100644
--- a/gcc/print-tree.c
+++ b/gcc/print-tree.c
@@ -321,7 +321,7 @@ print_node (FILE *file, const char *prefix, tree node, int indent)
   if (indent <= 4)
 	print_node_brief (file, "type", TREE_TYPE (node), indent + 4);
 }
-  else
+  else if (CODE_CONTAINS_STRUCT (code, TS_TYPED))
 {
   print_node (file, "type", TREE_TYPE (node), indent + 4);
   if (TREE_TYPE (node))


[PATCH, testsuite] Additional tests for PR46728 (PR46728 patch 4)

2011-05-25 Thread William J. Schmidt
Since I'm in process of moving the lowering of pow and powi calls from
expand into gimple, I wrote some tests to improve coverage in this area.
Most of these look for specific code generation patterns in PowerPC
assembly where the existence of a hardware floating square root can be
guaranteed.
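
The algebraic identities that the run tests rely on can be sketched as
follows.  This is only an illustration, not part of the patch: the helper
names (nearly_equal, pow_three_quarters, pow_one_sixth, pow_half_integer)
are made up, and a hand-written loop stands in for __builtin_powi.  The
rewritten forms are mathematically equal to pow but can round
differently, which is presumably why the tests are built with
-ffast-math.

```cpp
#include <cassert>
#include <cmath>

// Relative comparison, since the rewritten forms may differ from pow
// in the last few bits.
static bool nearly_equal(double a, double b)
{
  return std::fabs(a - b) <= 1e-12 * std::fabs(b);
}

// x^(3/4) = sqrt(x) * sqrt(sqrt(x))
double pow_three_quarters(double x)
{
  double root = std::sqrt(x);
  return root * std::sqrt(root);
}

// x^(1/6) = cbrt(sqrt(x))
double pow_one_sixth(double x)
{
  return std::cbrt(std::sqrt(x));
}

// x^(n + 1/2) = sqrt(x) * x^n for integer n; a negative n multiplies
// by the reciprocal instead.
double pow_half_integer(double x, int n)
{
  double result = std::sqrt(x);
  double factor = n < 0 ? 1.0 / x : x;
  for (int k = n < 0 ? -n : n; k > 0; --k)
    result *= factor;
  return result;
}
```

For example, pow_half_integer (x, 2) corresponds to pow (x, 2.5) and
pow_half_integer (x, -1) to pow (x, -0.5).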

This patch is conditional on patch 3 of the PR46728 series; without it,
test pr46728-16.c will fail, since the FMA will not be generated.  All
other tests currently pass.

OK to add to test suite on trunk?

Thanks,
Bill


2011-05-25  Bill Schmidt  wschm...@linux.vnet.ibm.com

* gcc.target/powerpc/pr46728-1.c: New.
* gcc.target/powerpc/pr46728-2.c: New.
* gcc.target/powerpc/pr46728-3.c: New.
* gcc.target/powerpc/pr46728-4.c: New.
* gcc.target/powerpc/pr46728-5.c: New.
* gcc.dg/pr46728-6.c: New.
* gcc.target/powerpc/pr46728-7.c: New.
* gcc.target/powerpc/pr46728-8.c: New.
* gcc.dg/pr46728-9.c: New.
* gcc.target/powerpc/pr46728-10.c: New.
* gcc.target/powerpc/pr46728-11.c: New.
* gcc.dg/pr46728-12.c: New.
* gcc.target/powerpc/pr46728-13.c: New.
* gcc.target/powerpc/pr46728-14.c: New.
* gcc.target/powerpc/pr46728-15.c: New.
* gcc.target/powerpc/pr46728-16.c: New.

Index: gcc/testsuite/gcc.target/powerpc/pr46728-13.c
===
--- gcc/testsuite/gcc.target/powerpc/pr46728-13.c   (revision 0)
+++ gcc/testsuite/gcc.target/powerpc/pr46728-13.c   (revision 0)
@@ -0,0 +1,27 @@
+/* { dg-do run } */
+/* { dg-options "-O2 -ffast-math -fno-inline -fno-unroll-loops -lm -mpowerpc-gpopt" } */
+
+#include <math.h>
+
+extern void abort (void);
+
+#define NVALS 6
+
+static double
+convert_it (double x)
+{
+  return pow (x, 1.0 / 6.0);
+}
+
+int
+main (int argc, char *argv[])
+{
+  double values[NVALS] = { 3.0, 1.95, 2.227, 729.0, 64.0, .0008797 };
+  unsigned i;
+
+  for (i = 0; i < NVALS; i++)
+if (convert_it (values[i]) != cbrt (sqrt (values[i])))
+  abort ();
+
+  return 0;
+}
Index: gcc/testsuite/gcc.target/powerpc/pr46728-3.c
===
--- gcc/testsuite/gcc.target/powerpc/pr46728-3.c(revision 0)
+++ gcc/testsuite/gcc.target/powerpc/pr46728-3.c(revision 0)
@@ -0,0 +1,31 @@
+/* { dg-do compile } */
+/* { dg-options "-O2 -ffast-math -fno-inline -fno-unroll-loops -lm -mpowerpc-gpopt" } */
+
+#include <math.h>
+
+extern void abort (void);
+
+#define NVALS 6
+
+static double
+convert_it (double x)
+{
+  return pow (x, 0.75);
+}
+
+int
+main (int argc, char *argv[])
+{
+  double values[NVALS] = { 3.0, 1.95, 2.227, 4.0, 256.0, .0008797 };
+  unsigned i;
+
+  for (i = 0; i < NVALS; i++)
+if (convert_it (values[i]) != sqrt(values[i]) * sqrt (sqrt (values[i])))
+  abort ();
+
+  return 0;
+}
+
+
+/* { dg-final { scan-assembler-times sqrt 4 { target powerpc*-*-* } } } */
+/* { dg-final { scan-assembler-not pow { target powerpc*-*-* } } } */
Index: gcc/testsuite/gcc.target/powerpc/pr46728-14.c
===
--- gcc/testsuite/gcc.target/powerpc/pr46728-14.c   (revision 0)
+++ gcc/testsuite/gcc.target/powerpc/pr46728-14.c   (revision 0)
@@ -0,0 +1,78 @@
+/* { dg-do run } */
+/* { dg-options "-O2 -ffast-math -fno-inline -fno-unroll-loops -lm -mpowerpc-gpopt" } */
+
+#include <math.h>
+
+extern void abort (void);
+
+#define NVALS 6
+
+static double
+convert_it_1 (double x)
+{
+  return pow (x, 1.5);
+}
+
+static double
+convert_it_2 (double x)
+{
+  return pow (x, 2.5);
+}
+
+static double
+convert_it_3 (double x)
+{
+  return pow (x, -0.5);
+}
+
+static double
+convert_it_4 (double x)
+{
+  return pow (x, 10.5);
+}
+
+static double
+convert_it_5 (double x)
+{
+  return pow (x, -3.5);
+}
+
+int
+main (int argc, char *argv[])
+{
+  double values[NVALS] = { 3.0, 1.95, 2.227, 4.0, 256.0, .0008797 };
+  double PREC = .99;
+  unsigned i;
+
+  for (i = 0; i < NVALS; i++)
+{
+  volatile double x, y;
+
+  x = sqrt (values[i]);
+  y = __builtin_powi (values[i], 1);
+  if (fabs (convert_it_1 (values[i]) / (x * y)) < PREC)
+   abort ();
+
+  x = sqrt (values[i]);
+  y = __builtin_powi (values[i], 2);
+  if (fabs (convert_it_2 (values[i]) / (x * y)) < PREC)
+   abort ();
+
+  x = sqrt (values[i]);
+  y = __builtin_powi (values[i], -1);
+  if (fabs (convert_it_3 (values[i]) / (x * y)) < PREC)
+   abort ();
+
+  x = sqrt (values[i]);
+  y = __builtin_powi (values[i], 10);
+  if (fabs (convert_it_4 (values[i]) / (x * y)) < PREC)
+   abort ();
+
+  x = sqrt (values[i]);
+  y = __builtin_powi (values[i], -4);
+  if (fabs (convert_it_5 (values[i]) / (x * y)) < PREC)
+   abort ();
+}
+
+  return 0;
+}
Index: gcc/testsuite/gcc.target/powerpc/pr46728-4.c
===
--- 

Re: PATCH: Add pause intrinsic

2011-05-25 Thread Basile Starynkevitch
On Wed, 25 May 2011 12:31:17 -0700
H.J. Lu hjl.to...@gmail.com wrote:

 On Wed, May 25, 2011 at 12:17 PM, Basile Starynkevitch
 bas...@starynkevitch.net wrote:
  Perhaps the doc might explain why it is necessary to have a builtin for
  two independent roles: first, the full compiler memory barrier (which
  probably means to spill all the registers on the stack - definitely a
  task for a compiler); second, to pause the processor (which might
  also mean to flush or invalidate some data caches). In particular, I
  would naively imagine that we might have a more generic builtin for the
  compiler memory barrier (which probably could be independent of the
  particular ia32 target), and in that case why can't we just implement
  the pause ia32 builtin as builtin_compiler_barrier(); asm ("pause")?
 
 
 We may need
 
   builtin_compiler_barrier();
   asm ("pause");
   builtin_compiler_barrier();

I don't understand why the second builtin_compiler_barrier() after the
asm ("pause") would be needed. Could you please explain why we should
need it? My feeling was that after the first builtin_compiler_barrier()
and hence after the asm ("pause") no register would contain valid
data, and the compiler would have to reload everything from memory. So
why do you think the second is needed?
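
One possible reading, sketched under the assumption of GCC-style inline
asm: if the pause is emitted as a single asm statement that itself
carries a "memory" clobber, the compiler may not move memory accesses
across it in either direction, so no separate barrier is needed before or
after it.  Bracketing with two barriers may matter only when the asm has
no clobber of its own, because a later load could otherwise be hoisted
above the pause.  The SpinLock type and pause_hint helper are
hypothetical names used only for this illustration.

```cpp
#include <atomic>
#include <cassert>

// Hypothetical helper: spin-wait hint plus compiler barrier in one asm.
// On non-x86 targets only the compiler barrier remains.
static inline void pause_hint()
{
#if defined(__i386__) || defined(__x86_64__)
  __asm__ __volatile__ ("pause" : : : "memory");
#else
  __asm__ __volatile__ ("" : : : "memory");
#endif
}

// Hypothetical test-and-set spin lock that pauses while it waits.
struct SpinLock
{
  std::atomic_flag flag = ATOMIC_FLAG_INIT;

  // Returns true if the lock was free and is now held by the caller.
  bool try_lock()
  {
    return !flag.test_and_set(std::memory_order_acquire);
  }

  // Spins, issuing the pause hint, until the lock is acquired.
  void lock()
  {
    while (!try_lock())
      pause_hint();
  }

  void unlock()
  {
    flag.clear(std::memory_order_release);
  }
};
```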

Or perhaps I misunderstood completely all the issues!


-- 
Basile STARYNKEVITCH http://starynkevitch.net/Basile/
email: basileatstarynkevitchdotnet mobile: +33 6 8501 2359
8, rue de la Faiencerie, 92340 Bourg La Reine, France
*** opinions {are only mine, sont seulement les miennes} ***

