date:20110330

Re: [patch, ARM] Fix PR48250, adjust DImode reload address legitimizing

2011-03-30 Thread Chung-Lin Tang

On 2011/3/30 上午 12:23, Richard Earnshaw wrote:
 
 On Tue, 2011-03-29 at 22:53 +0800, Chung-Lin Tang wrote:
 On 2011/3/29 下午 10:26, Richard Earnshaw wrote:
 On Tue, 2011-03-29 at 18:25 +0800, Chung-Lin Tang wrote:
 On 2011/3/24 06:51 PM, Richard Earnshaw wrote:

 On Thu, 2011-03-24 at 12:56 +0900, Chung-Lin Tang wrote:
 Hi,
 PR48250 happens under TARGET_NEON, where DImode is included within the
 valid NEON modes. This turns the range of legitimate constant indexes to
 step-4 (coproc load/store), thus arm_legitimize_reload_address() when
 trying to decompose the [reg+index] reload address into
 [(reg+index_high)+index_low], can cause an ICE later when 'index_low'
 part is not aligned to 4.

 I'm not sure why the current DImode index is computed as:
 low = ((val  0xf) ^ 0x8) - 0x8;  the sign-extending into negative
 values, then subtracting back, actually creates further off indexes.
 e.g. in the supplied testcase, [sp+13] was turned into [(sp+16)-3].


 Hysterical Raisins... the code there was clearly written for the days
 before we had LDRD in the architecture.  At that time the most efficient
 way to load a 64-bit object was to use the LDM{ia,ib,da,db}
 instructions.  The computation here was (I think), intended to try and
 make the most efficient use of an add/sub instruction followed by
 LDM/STM offsetting.  At that time the architecture had no unaligned
 access either, so dealing with 64-bit that were less than 32-bit aligned
 (in those days 32-bit was the maximum alignment) probably wasn't
 considered, or couldn't even get through to reload.


 I see it now. The code in output_move_double() returning assembly for
 ldm/stm(db/da/ib) for offsets -8/-4/+4 seems to confirm this.

 I have changed the patch to let the new code handle the TARGET_LDRD case
 only.  The pre-LDRD case is still handled by the original code, with an
 additional  ~0x3 for aligning the offset to 4.

 I've also added a comment for the pre-TARGET_LDRD case. Please see if
 the description is accurate enough.

 My patch changes the index decomposing to a more straightforward way; it
 also sort of outlines the way the other reload address indexes are
 broken by using and-masks, is not the most effective.  The address is
 computed by addition, subtracting away the parts to obtain low+high
 should be the optimal way of giving the largest computable index range.

 I have included a few Thumb-2 bits in the patch; I know currently
 arm_legitimize_reload_address() is only used under TARGET_ARM, but I
 guess it might eventually be turned into TARGET_32BIT.


 I think this needs to be looked at carefully on ARMv4/ARMv4T to check
 that it doesn't cause regressions there when we don't have LDRD in the
 instruction set.

 I'll be testing the modified patch under an ARMv4/ARMv4T configuration.
 Okay for trunk if no regressions?

 Thanks,
 Chung-Lin

PR target/48250
* config/arm/arm.c (arm_legitimize_reload_address): Adjust
DImode constant index decomposing under TARGET_LDRD. Clear
lower two bits for NEON, Thumb-2, and !TARGET_LDRD. Add
comment for !TARGET_LDRD case.

 This looks technically correct, but I can't help feeling that trying to
 deal with the bottom two bits being non-zero is not really worthwhile.
 This hook is an optimization that's intended to generate better code
 than the default mechanisms that reload provides.  It is allowed to
 fail, but it must say so if it does (by returning false).

 What this hook is trying to do for DImode is to take an address of the
 form (reg + TOO_BIG_CONST) and turn it into two instructions that are
 legitimate:

 tmp = (REG + LEGAL_BIG_CONST)
 some_use_of (mem (tmp + SMALL_LEGAL_CONST))

 The idea behind the optimization is that LEGAL_BIG_CONST will need fewer
 instructions to generate than TOO_BIG_CONST.  It's unlikely (probably
 impossible in ARM state) that pushing the bottom two bits of the address
 into LEGAL_BIG_CONST part of the offset is going to lead to a better
 code sequence.  

 So I think it would be more sensible to just return false if we have a
 DImode address with the bottom 2 bits non-zero and then let the normal
 reload recovery mechanism take over.

 It is supposed to provide better utilization of the full range of
 LEGAL_BIG_CONST+SMALL_LEGAL_CONST.  I am not sure, but I guess reload
 will resolve it with the reg+LEGAL_BIG_CONST part only, using only (mem
 (reg)) for the load/store (correct me if I'm wrong).

 Also, the new code slighty improves the reloading, for example currently
 [reg+64] is broken into [(reg+72)-8], creating an additional unneeded
 reload, which is certainly not good when we have ldrd/strd available.

 
 Sorry, didn't mean to suggest that we don't want to do something better
 for addresses that are a multiple of 4, just that for addresses that
 aren't (at least) word-aligned that we should just bail as the code in
 that case won't benefit from the optimization.  So something like
 
if (mode == DImode || (mode == DFmode

Re: Cleaning up expand optabs code

2011-03-30 Thread Richard Sandiford

Richard Henderson r...@redhat.com writes:
 On 03/29/2011 06:21 AM, Richard Sandiford wrote:
 -  enum machine_mode mode0 = insn_data[(int) icode].operand[1].mode;
 -  enum machine_mode mode1 = insn_data[(int) icode].operand[2].mode;
 -  enum machine_mode tmp_mode;
 +  enum machine_mode xmode0 = insn_data[(int) icode].operand[1].mode;
 +  enum machine_mode xmode1 = insn_data[(int) icode].operand[2].mode;
 +  enum machine_mode mode0, mode1, tmp_mode;
 ...
 -  if (GET_MODE (xop0) != mode0  mode0 != VOIDmode)
 -xop0 = convert_modes (mode0,
 -  GET_MODE (xop0) != VOIDmode
 -  ? GET_MODE (xop0)
 -  : mode,
 -  xop0, unsignedp);
 -
 -  if (GET_MODE (xop1) != mode1  mode1 != VOIDmode)
 -xop1 = convert_modes (mode1,
 -  GET_MODE (xop1) != VOIDmode
 -  ? GET_MODE (xop1)
 -  : mode,
 -  xop1, unsignedp);
 +  mode0 = GET_MODE (xop0) != VOIDmode ? GET_MODE (xop0) : mode;
 +  if (xmode0 != VOIDmode  xmode0 != mode0)
 +{
 +  xop0 = convert_modes (xmode0, mode0, xop0, unsignedp);
 +  mode0 = xmode0;
 +}
 +
 +  mode1 = GET_MODE (xop1) != VOIDmode ? GET_MODE (xop1) : mode;
 +  if (xmode1 != VOIDmode  xmode1 != mode1)
 +{
 +  xop1 = convert_modes (xmode1, mode1, xop1, unsignedp);
 +  mode1 = xmode1;
 +}

 This smells like a target bug, but I can't quite put my finger
 on exactly what's wrong, and I haven't debugged the original.

 The backend has said xmode[01] is what it expects.  The only
 reason you wouldn't write xmode[01] in the create_input_operand
 line is if xmode[01] are VOIDmode.

Right.  Sorry, I should have said, but the failure was actually from:

case EXPAND_INPUT:
input:
  gcc_assert (mode != VOIDmode);

expand_binop_direct passed the match_operand's mode to create_input_operand,
which was really the opposite of the intention.  The caller should be
passing the mode of the rtx, which it should always know.

 That seems like mistake number one, particularly for a binop,
 but I'll accept that for the nonce.  Which pattern triggered
 the problem in the target?

It was ashrdi3:

(define_expand ashrdi3
  [(set (match_operand:DI 0 register_operand )
(ashiftrt:DI (match_operand:DI 1 register_operand )
 (match_operand 2 const_int_operand )))]
  !TARGET_COLDFIRE

Which seems reasonable in this context, since the shift count isn't
really DImode.

 Then we've got the conditionalization in the init of mode[01],
 which is presumably to handle CONST_INT as an input.

 What about something like

   xmode0 = insn_data...
   if (xmode0 == VOIDmode)
 xmode0 = mode;

   mode0 = GET_MODE (xop0);
   if (mode0 == VOIDmode)
 mode0 = mode;

   if (xmode0 != mode0)
 convert_modes

   create_input_operand(..., xmode0)

 ?  That seems more obvious than what you have.  And I *think*
 it should produce the same results.  If it doesn't, then this
 whole block of code needs a lot more explanation.

The problem is that a VOIDmode match_operand doesn't necessary imply
that mode is the right mode.  VOIDmode register-accepting operands
are only a warning, not an error, and are sometimes needed for
flag-specific variations.  They've traditionally not forced a conversion
to mode.  E.g. if you have something like this on a 32-bit target:

  unsigned long long foo (unsigned long long x, int y)
  {
return x = y;
  }

then op1 will be (reg:SI y).  Having a VOIDmode match_operand shouldn't
force that to be converted to (reg:DI y), whereas I think the sequence
above would.

Or to put it another way: as things stand, mode is only trustworthy
for CONST_INT opNs.  A VOIDmode match_operand doesn't imply that the
opN argument to expand_binop_directly is a CONST_INT, or even that
only CONST_INTs are acceptable to the target.

This comes back to the point that we really should know up-front what
modes op0 and op1 actually have.  (The thing I left as a future clean-up.)
Until then, the process implemented by yesterday's patch was supposed to be:

   - work out what mode opN actually has at this point in time
   - see if the target has specifically asked for a different mode
   - if so, convert the operand

This is directly equivalent to what create_convert_operand_from does:

case EXPAND_CONVERT_FROM:
  if (GET_MODE (op-value) != VOIDmode)
mode = GET_MODE (op-value);
  else
/* The caller must tell us what mode this value has.  */
gcc_assert (mode != VOIDmode);

  imode = insn_data[(int) icode].operand[opno].mode;
  if (imode != VOIDmode  imode != mode)
{
  op-value = convert_modes (imode, mode, op-value, op-unsigned_p);
  mode = imode;
}

But we have to break the flow there (rather than go on to coerce the operands)
because of the commutative thing.

Richard

Re: Some remodelling of the ARM vld and vst patterns

2011-03-30 Thread Richard Sandiford

Richard Sandiford richard.sandif...@linaro.org writes:
   The ??? is saying that the V8QI-derived MEM is really a 3-byte access,
   not a 4-byte (SI) access, and so on.  The comment makes the mode sound
   like a representational niceity, but really, there's no such thing as
   a conservatively wrong memory size here.  If a store's mode is too
   small, dependent loads could be deleted as dead.  If it's too big,
   unrelated live loads could be deleted as dead.

In case it isn't obvious, I meant unrelated live stores.

Richard

[RFC][patch] If-conversion of COMPONENT_REFs

2011-03-30 Thread Ira Rosen

Hi,

With this patch a data-ref is marked as unconditionally read or
written also if its adjacent field is read or written unconditionally
in the loop.
My concern is that this is not safe enough, even though the fields
have to be non-pointers and non-aggregates, and this optimization is
applied only with -ftree-loop-if-convert-stores flag.

Bootstrapped on powerpc64-suse-linux and tested on x86_64-suse-linux.

OK for trunk?

Thanks,
Ira


ChangeLog:

 * tree-if-conv.c (memrefs_read_or_written_unconditionally): Return true
 if an adjacent field of the data-ref is accessed unconditionally.

testsuite/ChangeLog:

 * gcc.dg/vect/if-cvt-stores-vect-ifcvt-18.c: New test.
 * gcc.dg/vect/vect.exp: Run if-cvt-stores-vect* tests with
 -ftree-loop-if-convert-stores.
Index: testsuite/gcc.dg/vect/if-cvt-stores-vect-ifcvt-18.c
===
--- testsuite/gcc.dg/vect/if-cvt-stores-vect-ifcvt-18.c (revision 0)
+++ testsuite/gcc.dg/vect/if-cvt-stores-vect-ifcvt-18.c (revision 0)
@@ -0,0 +1,69 @@
+/* { dg-require-effective-target vect_int } */
+
+#include stdarg.h
+#include tree-vect.h
+
+#define N 50
+
+typedef struct {
+  short a;
+  short b;
+} data;
+
+data in1[N], in2[N], out[N];
+short result[N*2] = 
{10,-7,11,-6,12,-5,13,-4,14,-3,15,-2,16,-1,17,0,18,1,19,2,20,3,21,4,22,5,23,6,24,7,25,8,26,9,27,10,28,11,29,12,30,13,31,14,32,15,33,16,34,17,35,18,36,19,37,20,38,21,39,22,40,23,41,24,42,25,43,26,44,27,45,28,46,29,47,30,48,31,49,32,50,33,51,34,52,35,53,36,54,37,55,38,56,39,57,40,58,41,59,42};
+short out1[N], out2[N];
+
+__attribute__ ((noinline)) void
+foo ()
+{
+  int i;
+  short c, d;
+
+  for (i = 0; i  N; i++)
+{
+  c = in1[i].b;
+  d = in2[i].b;
+
+  if (c = d)
+{
+  out[i].b = in1[i].a;
+  out[i].a = d + 5;
+}
+  else
+{
+  out[i].b = d - 12;
+  out[i].a = in2[i].a + d;
+}
+}
+}
+
+int
+main (void)
+{
+  int i;
+
+  check_vect ();
+
+  for (i = 0; i  N; i++)
+{
+  in1[i].a = i;
+  in1[i].b = i + 2;
+  in2[i].a = 5;
+  in2[i].b = i + 5;
+  __asm__ volatile ();
+}
+
+  foo ();
+
+  for (i = 0; i  N; i++)
+{
+  if (out[i].a != result[2*i] || out[i].b != result[2*i+1])
+abort ();
+}
+
+  return 0;
+}
+
+/* { dg-final { scan-tree-dump-times vectorized 1 loops 1 vect  { xfail { 
vect_no_align || {! vect_strided } } } } } */
+/* { dg-final { cleanup-tree-dump vect } } */
Index: testsuite/gcc.dg/vect/vect.exp
===
--- testsuite/gcc.dg/vect/vect.exp  (revision 171716)
+++ testsuite/gcc.dg/vect/vect.exp  (working copy)
@@ -210,6 +210,12 @@ lappend DEFAULT_VECTCFLAGS --param gg
 dg-runtest [lsort [glob -nocomplain $srcdir/$subdir/ggc-*.\[cS\]]]  \
  $DEFAULT_VECTCFLAGS
 
+# -ftree-loop-if-convert-stores
+set DEFAULT_VECTCFLAGS $SAVED_DEFAULT_VECTCFLAGS
+lappend DEFAULT_VECTCFLAGS -ftree-loop-if-convert-stores
+dg-runtest [lsort [glob -nocomplain 
$srcdir/$subdir/if-cvt-stores-vect-*.\[cS\]]]  \
+ $DEFAULT_VECTCFLAGS
+
 # With -O3.
 # Don't allow IPA cloning, because it throws our counts out of whack.
 set DEFAULT_VECTCFLAGS $SAVED_DEFAULT_VECTCFLAGS
Index: tree-if-conv.c
===
--- tree-if-conv.c  (revision 171716)
+++ tree-if-conv.c  (working copy)
@@ -461,11 +461,11 @@ struct ifc_dr {
 #define DR_WRITTEN_AT_LEAST_ONCE(DR) (IFC_DR (DR)-written_at_least_once)
 #define DR_RW_UNCONDITIONALLY(DR) (IFC_DR (DR)-rw_unconditionally)
 
-/* Returns true when the memory references of STMT are read or written
-   unconditionally.  In other words, this function returns true when
-   for every data reference A in STMT there exist other accesses to
-   the same data reference with predicates that add up (OR-up) to the
-   true predicate: this ensures that the data reference A is touched
+/* Returns true when the memory references of STMT (or their adjacent fields)
+   are read or writtenunconditionally.  In other words, this function
+   returns true when for every data reference A in STMT there exist other
+   accesses to the same data reference with predicates that add up (OR-up) to
+   the true predicate: this ensures that the data reference A is touched
(read or written) on every iteration of the if-converted loop.  */
 
 static bool
@@ -479,7 +479,7 @@ memrefs_read_or_written_unconditionally 
   for (i = 0; VEC_iterate (data_reference_p, drs, i, a); i++)
 if (DR_STMT (a) == stmt)
   {
-   bool found = false;
+   bool found = false, same_comp_ref = false;
int x = DR_RW_UNCONDITIONALLY (a);
 
if (x == 0)
@@ -489,22 +489,45 @@ memrefs_read_or_written_unconditionally 
  continue;
 
for (j = 0; VEC_iterate (data_reference_p, drs, j, b); j++)
- if (DR_STMT (b) != stmt
-  same_data_refs (a,

Re: [patch, ARM] Fix PR48250, adjust DImode reload address legitimizing

2011-03-30 Thread Richard Earnshaw


On Wed, 2011-03-30 at 15:35 +0800, Chung-Lin Tang wrote:
 On 2011/3/30 上午 12:23, Richard Earnshaw wrote:
  
  On Tue, 2011-03-29 at 22:53 +0800, Chung-Lin Tang wrote:
  On 2011/3/29 下午 10:26, Richard Earnshaw wrote:
  On Tue, 2011-03-29 at 18:25 +0800, Chung-Lin Tang wrote:
  On 2011/3/24 06:51 PM, Richard Earnshaw wrote:
 
  On Thu, 2011-03-24 at 12:56 +0900, Chung-Lin Tang wrote:
  Hi,
  PR48250 happens under TARGET_NEON, where DImode is included within the
  valid NEON modes. This turns the range of legitimate constant indexes 
  to
  step-4 (coproc load/store), thus arm_legitimize_reload_address() when
  trying to decompose the [reg+index] reload address into
  [(reg+index_high)+index_low], can cause an ICE later when 'index_low'
  part is not aligned to 4.
 
  I'm not sure why the current DImode index is computed as:
  low = ((val  0xf) ^ 0x8) - 0x8;  the sign-extending into negative
  values, then subtracting back, actually creates further off indexes.
  e.g. in the supplied testcase, [sp+13] was turned into [(sp+16)-3].
 
 
  Hysterical Raisins... the code there was clearly written for the days
  before we had LDRD in the architecture.  At that time the most efficient
  way to load a 64-bit object was to use the LDM{ia,ib,da,db}
  instructions.  The computation here was (I think), intended to try and
  make the most efficient use of an add/sub instruction followed by
  LDM/STM offsetting.  At that time the architecture had no unaligned
  access either, so dealing with 64-bit that were less than 32-bit aligned
  (in those days 32-bit was the maximum alignment) probably wasn't
  considered, or couldn't even get through to reload.
 
 
  I see it now. The code in output_move_double() returning assembly for
  ldm/stm(db/da/ib) for offsets -8/-4/+4 seems to confirm this.
 
  I have changed the patch to let the new code handle the TARGET_LDRD case
  only.  The pre-LDRD case is still handled by the original code, with an
  additional  ~0x3 for aligning the offset to 4.
 
  I've also added a comment for the pre-TARGET_LDRD case. Please see if
  the description is accurate enough.
 
  My patch changes the index decomposing to a more straightforward way; 
  it
  also sort of outlines the way the other reload address indexes are
  broken by using and-masks, is not the most effective.  The address is
  computed by addition, subtracting away the parts to obtain low+high
  should be the optimal way of giving the largest computable index range.
 
  I have included a few Thumb-2 bits in the patch; I know currently
  arm_legitimize_reload_address() is only used under TARGET_ARM, but I
  guess it might eventually be turned into TARGET_32BIT.
 
 
  I think this needs to be looked at carefully on ARMv4/ARMv4T to check
  that it doesn't cause regressions there when we don't have LDRD in the
  instruction set.
 
  I'll be testing the modified patch under an ARMv4/ARMv4T configuration.
  Okay for trunk if no regressions?
 
  Thanks,
  Chung-Lin
 
   PR target/48250
   * config/arm/arm.c (arm_legitimize_reload_address): Adjust
   DImode constant index decomposing under TARGET_LDRD. Clear
   lower two bits for NEON, Thumb-2, and !TARGET_LDRD. Add
   comment for !TARGET_LDRD case.
 
  This looks technically correct, but I can't help feeling that trying to
  deal with the bottom two bits being non-zero is not really worthwhile.
  This hook is an optimization that's intended to generate better code
  than the default mechanisms that reload provides.  It is allowed to
  fail, but it must say so if it does (by returning false).
 
  What this hook is trying to do for DImode is to take an address of the
  form (reg + TOO_BIG_CONST) and turn it into two instructions that are
  legitimate:
 
tmp = (REG + LEGAL_BIG_CONST)
some_use_of (mem (tmp + SMALL_LEGAL_CONST))
 
  The idea behind the optimization is that LEGAL_BIG_CONST will need fewer
  instructions to generate than TOO_BIG_CONST.  It's unlikely (probably
  impossible in ARM state) that pushing the bottom two bits of the address
  into LEGAL_BIG_CONST part of the offset is going to lead to a better
  code sequence.  
 
  So I think it would be more sensible to just return false if we have a
  DImode address with the bottom 2 bits non-zero and then let the normal
  reload recovery mechanism take over.
 
  It is supposed to provide better utilization of the full range of
  LEGAL_BIG_CONST+SMALL_LEGAL_CONST.  I am not sure, but I guess reload
  will resolve it with the reg+LEGAL_BIG_CONST part only, using only (mem
  (reg)) for the load/store (correct me if I'm wrong).
 
  Also, the new code slighty improves the reloading, for example currently
  [reg+64] is broken into [(reg+72)-8], creating an additional unneeded
  reload, which is certainly not good when we have ldrd/strd available.
 
  
  Sorry, didn't mean to suggest that we don't want to do something better
  for addresses that are a multiple of 4, just that for addresses that
  aren't (at least) word-aligned that

Re: [PATCH] Another EQ/NE folder type fix (PR c/48305)

2011-03-30 Thread Richard Guenther

On Tue, Mar 29, 2011 at 5:53 PM, Jakub Jelinek ja...@redhat.com wrote:
 Hi!

 I've left these 4 out because I thought they should be fine given
 that the other operands are equal.  But the testcase shows I was wrong.

 Fixed thusly, bootstrapped/regtested on x86_64-linux and i686-linux, ok
 for trunk?

Ok.

Thanks,
Richard.

 2011-03-29  Jakub Jelinek  ja...@redhat.com

        PR c/48305
        * fold-const.c (fold_binary_loc) case EQ_EXPR, NE_EXPR: Make sure
        arg10/arg11 in (X ^ Y) == (Z ^ W) are always fold converted to
        matching arg00/arg01 types.

        * gcc.c-torture/compile/pr48305.c: New test.

 --- gcc/fold-const.c.jj 2011-03-25 07:56:32.0 +0100
 +++ gcc/fold-const.c    2011-03-28 12:45:58.0 +0200
 @@ -12625,13 +12625,21 @@ fold_binary_loc (location_t loc,
             operand_equal_p guarantees no side-effects so we don't need
             to use omit_one_operand on Z.  */
          if (operand_equal_p (arg01, arg11, 0))
 -           return fold_build2_loc (loc, code, type, arg00, arg10);
 +           return fold_build2_loc (loc, code, type, arg00,
 +                                   fold_convert_loc (loc, TREE_TYPE (arg00),
 +                                                     arg10));
          if (operand_equal_p (arg01, arg10, 0))
 -           return fold_build2_loc (loc, code, type, arg00, arg11);
 +           return fold_build2_loc (loc, code, type, arg00,
 +                                   fold_convert_loc (loc, TREE_TYPE (arg00),
 +                                                     arg11));
          if (operand_equal_p (arg00, arg11, 0))
 -           return fold_build2_loc (loc, code, type, arg01, arg10);
 +           return fold_build2_loc (loc, code, type, arg01,
 +                                   fold_convert_loc (loc, TREE_TYPE (arg01),
 +                                                     arg10));
          if (operand_equal_p (arg00, arg10, 0))
 -           return fold_build2_loc (loc, code, type, arg01, arg11);
 +           return fold_build2_loc (loc, code, type, arg01,
 +                                   fold_convert_loc (loc, TREE_TYPE (arg01),
 +                                                     arg11));

          /* Optimize (X ^ C1) op (Y ^ C2) as (X ^ (C1 ^ C2)) op Y.  */
          if (TREE_CODE (arg01) == INTEGER_CST
 --- gcc/testsuite/gcc.c-torture/compile/pr48305.c.jj    2011-03-28 
 12:48:02.0 +0200
 +++ gcc/testsuite/gcc.c-torture/compile/pr48305.c       2011-03-28 
 12:47:49.0 +0200
 @@ -0,0 +1,7 @@
 +/* PR c/48305 */
 +
 +int
 +foo (int x)
 +{
 +  return (x ^ 1) == (x ^ 1U);
 +}

        Jakub

Re: [cxx-mem-model] bitfield tests

2011-03-30 Thread Richard Guenther

On Tue, Mar 29, 2011 at 7:00 PM, Aldy Hernandez al...@redhat.com wrote:
 [Language lawyers, please correct me if I have mis-interpreted the upcoming
 standard in any way.]

 In the C++ memory model, contiguous bitfields comprise a single memory
 location, so it's fair game to bit twiddle them when setting them.  For
 example:

        struct {
                unsigned int a : 4;
                unsigned int b : 4;
                unsigned int c : 4;
        };

 In the above example, you can touch b and c while setting a.  No race
 there.

 However, non contiguous bitfields are a different story:

        struct {
                unsigned int a : 4;
                char b;
                unsigned int c : 6;
        };

 Here we have 3 distinct memory locations, so you can't touch b or c
 while setting a.  No bit twiddling allowed.

 Similarly for bitfields separated by a zero-length bitfield:

        struct {
                unsigned int a : 4;
                int : 0;
                unsigned int c : 6;
        };

 In the above example, a and c are distinct memory locations.

 Also, a structure/union boundary will also separate previously contiguous
 bit sequences:

        struct {
                unsigned int a : 4;
                struct { unsigned int b : 4 } BBB;
                unsigned int c : 4;
        };

 Here we have 3 distinct memory locations, so again, we can't clobber b or
 c while setting a.

 The patch below adds a non-contiguous bit test (bitfields-2.C) which passes
 on x86, but upon assembly inspection, fails on PPC64, s390, and Alpha.
  These 3 architectures bit-twiddle their way out of the problem.

The memory model is not implementable on strict-alignment targets
that do not have a byte store operation.  But we previously said that ;)

Also consider global vars

char a;
char b;

accessing them on strict-align targets may access adjacent globals
(that's a problem anyway, also with alias analysis).

Richard.

 There is also a similar test already in the testsuite (bitfields.C) which is
 similar except one field is a volatile.  This test fails on x86-64 as well
 and is the subject of PR48128 which Jakub is currently tackling.

 As soon as Jakub finishes with PR48128, I will be working on getting these
 bitfield tests working in a C++ memory model fashion.

 Committing to branch.

Re: RFA: patch to solve IRA PR48336, PR48342, PR48345

2011-03-30 Thread Richard Sandiford

Hi Vlad,

First, I want to echo H-P's thanks for tackling this area.  I just wondered:

Vladimir Makarov vmaka...@redhat.com writes:
 The following patch is to solve PR48336, PR48342, PR48345.  The 
 profitable hard regs exclude hard regs which are prohibited for the 
 corresponding allocno mode. It is true for primary allocation and it is 
 important for better colorability criteria.  Function assign_hard_reg is 
 also based on this assumption.  Unfortunately, it is not true for 
 secondary allocation (after IRA IR flattening or during reload).  The 
 following patch solves this problem.

 The patch should be very safe but I am still testing it on x86/x86-64 
 bootstrap.
[...]
 Index: ira-color.c
 ===
 --- ira-color.c   (revision 171699)
 +++ ira-color.c   (working copy)
 @@ -1447,7 +1447,9 @@ update_conflict_hard_regno_costs (int *c
  }
  
  /* Set up conflicting and profitable regs (through CONFLICT_REGS and
 -   PROFITABLE_REGS) for each object of allocno A.  */
 +   PROFITABLE_REGS) for each object of allocno A.  Remember that the
 +   profitable regs exclude hard regs which can not hold value of mode
 +   of allocno A.  */
  static inline void
  setup_conflict_profitable_regs (ira_allocno_t a, bool retry_p,
   HARD_REG_SET *conflict_regs,
 @@ -1463,8 +1465,13 @@ setup_conflict_profitable_regs (ira_allo
COPY_HARD_REG_SET (conflict_regs[i],
OBJECT_TOTAL_CONFLICT_HARD_REGS (obj));
if (retry_p)
 - COPY_HARD_REG_SET (profitable_regs[i],
 -reg_class_contents[ALLOCNO_CLASS (a)]);
 + {
 +   COPY_HARD_REG_SET (profitable_regs[i],
 +  reg_class_contents[ALLOCNO_CLASS (a)]);
 +   AND_COMPL_HARD_REG_SET (profitable_regs[i],
 +   ira_prohibited_class_mode_regs
 +   [ALLOCNO_CLASS (a)][ALLOCNO_MODE (a)]);
 + }
else
   COPY_HARD_REG_SET (profitable_regs[i],
  OBJECT_COLOR_DATA (obj)-profitable_hard_regs);

ira_prohibited_class_mode_regs is partly based on HARD_REGNO_MODE_OK,
which is really a property of the first register in a multi-register
group (rather than of every register in that group).  So is
ira_prohibited_class_mode_regs defined in the same way?
At the moment, check_hard_reg_p and setup_allocno_available_regs_num
test profitability for every register in the group:

/* Checking only profitable hard regs.  */
if (TEST_HARD_REG_BIT (conflict_regs[k], hard_regno + j)
|| ! TEST_HARD_REG_BIT (profitable_regs[k], hard_regno + j))
  break;
[...]
  for (j = 0; j  nregs; j++)
{
[...]
  /* Checking only profitable hard regs.  */
  if (TEST_HARD_REG_BIT (OBJECT_TOTAL_CONFLICT_HARD_REGS (obj),
 hard_regno + j)
  || ! TEST_HARD_REG_BIT (obj_data-profitable_hard_regs,
  hard_regno + j))
break;

So if you have a target in which double-word values have to start in
even registers, I think every odd-numbered bit of profitable_hard_regs
will be clear, and no register will seem profitable.  (I'm seeing this
on ARM with some VFP tests.)

Restricting the test to the first register fixes things for me,
but do we need to check something else for the j != 0 case?

Richard


gcc/
* ira-color.c (check_hard_reg_p): Restrict the profitability check
to the first register.
(setup_allocno_available_regs_num): Likewise.

Index: gcc/ira-color.c
===
--- gcc/ira-color.c (revision 171653)
+++ gcc/ira-color.c (working copy)
@@ -1497,7 +1504,8 @@ check_hard_reg_p (ira_allocno_t a, int h
   for (k = set_to_test_start; k  set_to_test_end; k++)
/* Checking only profitable hard regs.  */
if (TEST_HARD_REG_BIT (conflict_regs[k], hard_regno + j)
-   || ! TEST_HARD_REG_BIT (profitable_regs[k], hard_regno + j))
+   || (j == 0
+! TEST_HARD_REG_BIT (profitable_regs[k], hard_regno)))
  break;
   if (k != set_to_test_end)
break;
@@ -2226,8 +2234,9 @@ setup_allocno_available_regs_num (ira_al
  /* Checking only profitable hard regs.  */
  if (TEST_HARD_REG_BIT (OBJECT_TOTAL_CONFLICT_HARD_REGS (obj),
 hard_regno + j)
- || ! TEST_HARD_REG_BIT (obj_data-profitable_hard_regs,
- hard_regno + j))
+ || (j == 0
+  ! TEST_HARD_REG_BIT (obj_data-profitable_hard_regs,
+ hard_regno)))
break;
}
  if (k != set_to_test_end)

PATCH: PR target/48349: FLOAT_SSE_REGS typo in i386.h

2011-03-30 Thread H.J. Lu

Hi,

I checked this pre-approved patch into trunk/4.6/4.5/4.4.


H.J.
---
Index: ChangeLog
===
--- ChangeLog   (revision 171717)
+++ ChangeLog   (working copy)
@@ -1,3 +1,9 @@
+2011-03-30  H.J. Lu  hongjiu...@intel.com
+
+   PR target/48349
+   * config/i386/i386.h (REG_CLASS_CONTENTS): Fix a typo in
+   FLOAT_SSE_REGS.
+
 2011-03-30  Joseph Myers  jos...@codesourcery.com
Rainer Orth  r...@cebitec.uni-bielefeld.de
 
Index: config/i386/i386.h
===
--- config/i386/i386.h  (revision 171717)
+++ config/i386/i386.h  (working copy)
@@ -1272,7 +1272,7 @@ enum reg_class
 { 0xe000,0x1f },   /* MMX_REGS */  \
 { 0x1fe00100,0x1fe000 },   /* FP_TOP_SSE_REG */\
 { 0x1fe00200,0x1fe000 },   /* FP_SECOND_SSE_REG */ \
-{ 0x1fe0ff00,0x3fe000 },   /* FLOAT_SSE_REGS */\
+{ 0x1fe0ff00,0x1fe000 },   /* FLOAT_SSE_REGS */\
{ 0x1,  0x1fe0 },   /* FLOAT_INT_REGS */\
 { 0x1fe100ff,0x1fffe0 },   /* INT_SSE_REGS */  \
 { 0x1fe1,0x1fffe0 },   /* FLOAT_INT_SSE_REGS */\

Re: [PATCH, ARM] PR47855 Compute attr length for some thumb2 insns

2011-03-30 Thread Richard Earnshaw


On Wed, 2011-03-30 at 12:38 +0800, Chung-Lin Tang wrote:
 On 2011/3/30 06:35 AM, Ramana Radhakrishnan wrote:
  Hi Carrot,
  
  On 26/03/11 15:14, Carrot Wei wrote:
  Index: arm.md
  ===
  --- arm.md(revision 171337)
  +++ arm.md(working copy)
  @@ -7115,7 +7115,18 @@
  @
   cmp%?\\t%0, %1
   cmn%?\\t%0, #%n1
  -  [(set_attr conds set)]
  +  [(set_attr conds set)
  +   (set (attr length)
  + (if_then_else
  +   (and (and (ne (symbol_ref TARGET_THUMB2) (const_int 0))
  + (eq (symbol_ref which_alternative) (const_int 0)))
  +(ior (ne (symbol_ref REG_P (operands[1])) (const_int 0))
  +(and (ne (symbol_ref CONST_INT_P (operands[1])) (const_int 0))
  + (and (ge (symbol_ref INTVAL (operands[1])) (const_int 0))
  +  (le (symbol_ref INTVAL (operands[1]))
  +  (const_int 255))
  +   (const_int 2)
  +   (const_int 4)))]
)
  
  How about adding an alternative only enabled for T2 that uses the
  `l' constraint and inventing new constraints for some of the constant
  values that are valid for 16 bit instructions since the `I' and `L'
  constraints have different meanings depending on whether TARGET_32BIT is
  valid or not ? We could then set the value of the length attribute
  accordingly. I don't think we can change the meaning of the I and L
  constraints very easily given the amount of churn that might be needed
  
  
 
(define_insn *cmpsi_shiftsi
  @@ -7286,7 +7297,14 @@
  return \b%d1\\t%l0\;
  
  [(set_attr conds use)
  -   (set_attr type branch)]
  +   (set_attr type branch)
  +   (set (attr length)
  +(if_then_else
  +   (and (ne (symbol_ref TARGET_THUMB2) (const_int 0))
  +(and (ge (minus (match_dup 0) (pc)) (const_int -250))
  + (le (minus (match_dup 0) (pc)) (const_int 256
  +   (const_int 2)
  +   (const_int 4)))]
)
  
  I don't think this and the other conditional branch instruction
  changes are correct. This could end up being the last instruction in an
  IT block and will automatically end up getting the 32 bit encoding and
  end up causing trouble with the length calculations. Remember the 16 bit
  encoding for the conditional instruction can't be used as the last
  instruction in an IT block.
  
  
  @@ -10256,7 +10288,26 @@
 
return \\;
  }
  -  [(set_attr type store4)]
  +  [(set_attr type store4)
  +   (set (attr length)
  +(if_then_else
  +   (and (ne (symbol_ref TARGET_THUMB2) (const_int 0))
  +(ne (symbol_ref {
  +int i, regno, hi_reg;
  +int num_saves = XVECLEN (operands[2], 0);
  +regno = REGNO (operands[1]);
  +hi_reg = (REGNO_REG_CLASS (regno) == HI_REGS)
  +  (regno != LR_REGNUM);
  +for (i = 1; i  num_saves; i++)
  
  
  (i  num_saves  (hi_reg == 0)) - what's the point of going through
  the register list when hi_reg != 0 in this case ? A comment to explain
  that the length calculation *must* be kept in sync with the template
  above might be useful.
 
 I suggest abstracting all of this Thumb-2 insn length knowledge into a
 arm.c function, say thumb2_16bit_insn_p() to test for 2-byte patterns.
  Adding more of these quite large pieces of logic into arm.md does not
 look very pretty to me, and centralizing it into a C function will also
 make it easier to manage the cases.
 
 Chung-Lin
 

I think Ramana's suggestion should avoid the need for any arithmetic at
all here.  If the constraints precisely match a 16-bit instruction (and
that alternative is disbled for ARM) then we can say unconditionally
that the length of the insn is 2.

R.

Re: [RFC][patch] If-conversion of COMPONENT_REFs

2011-03-30 Thread Ira Rosen

On 30 March 2011 12:59, Richard Guenther richard.guent...@gmail.com wrote:
 On Wed, Mar 30, 2011 at 11:13 AM, Ira Rosen ira.ro...@linaro.org wrote:
 Hi,

 With this patch a data-ref is marked as unconditionally read or
 written also if its adjacent field is read or written unconditionally
 in the loop.
 My concern is that this is not safe enough, even though the fields
 have to be non-pointers and non-aggregates, and this optimization is
 applied only with -ftree-loop-if-convert-stores flag.

 Bootstrapped on powerpc64-suse-linux and tested on x86_64-suse-linux.

 OK for trunk?

 The restrictions do not make too much sense to me.  For the C++
 memory model we can't do speculative stores at all, but for the
 rest I'd say just checking if the data-refs access the same object
 is enough, thus, instead of same_data_refs (a, b) simply check
 operand_equal_p (DR_BASE_ADDRESS (a), DR_BASE_ADDRESS (b), 0)
 or operand_equal_p (get_base_address (DR_REF (a)), get_base_address
 (DR_REF (b)), 0), whatever makes more sense (I always confuse what
 the contents of the various DR fields are).

But what about

int a[10], b[100], c[100];
for (i = 0; i  100; i++)
  {
 if (i  10)
   t = a[i];
else
   t = b[i];

c[i] = t + a[1];
   }

We can't transform this to

int a[10], b[100], c[100];
for (i = 0; i  100; i++)
  {
 t1 = a[i];
 t2 = b[i];
 t = (i  10) ? t1 : t2;
 c[i] = t + a[1];
   }

since we create accesses to a[10:100], right?

Thanks,
Ira


 Thanks,
 Richard.

 Thanks,
 Ira


 ChangeLog:

         * tree-if-conv.c (memrefs_read_or_written_unconditionally): Return 
 true
         if an adjacent field of the data-ref is accessed unconditionally.

 testsuite/ChangeLog:

         * gcc.dg/vect/if-cvt-stores-vect-ifcvt-18.c: New test.
         * gcc.dg/vect/vect.exp: Run if-cvt-stores-vect* tests with
         -ftree-loop-if-convert-stores.

Negative option alias facility

2011-03-30 Thread Joseph S. Myers

This patch adds a NegativeAlias .opt facility, for one option to be
marked as an alias for the negative form of another option.  There are
quite a lot of options like that, and this is of particular use where
it allows option handler code to be removed because the aliases can
now be handled purely in the .opt file (as opposed to the case where
both positive and negative options are fully described by their Var
settings).

In particular, this patch also converts the legacy rs6000 options
-mvrsave=, -misel=, -mspe= to aliases for the standard forms of the
options (via listing the =yes and =no forms of each option as separate
aliases), so simplifying the rs6000 option handling code.

Bootstrapped with no regressions on x86_64-unknown-linux-gnu, and
tested building cc1 and xgcc for cross to powerpc-eabispe.  Will
commit to trunk in the absence of target maintainer objections.

2011-03-30  Joseph Myers  jos...@codesourcery.com

* doc/options.texi (NegativeAlias): Document.
(Alias): Mention NegativeAlias.
* opt-functions.awk: Handle NegativeAlias.
* optc-gen.awk: Disallow NegativeAlias with multiple Alias
arguments.
* opts-common.c (decode_cmdline_option): Handle CL_NEGATIVE_ALIAS.
* opts.h (CL_NEGATIVE_ALIAS): Define.
* config/rs6000/rs6000.c (rs6000_parse_yes_no_option): Remove.
(rs6000_handle_option): Don't handle OPT_mvrsave_, OPT_misel_ and
OPT_mspe_.
* config/rs6000/rs6000.opt (mvrsave=, misel=, mspe=): Replace with
Alias entries.
* config/rs6000/t-spe (MULTILIB_OPTIONS, MULTILIB_EXCEPTIONS): Use
mno-spe and mno-isel instead of mspe=no and -misel=no.

Index: gcc/doc/options.texi
===
--- gcc/doc/options.texi(revision 171701)
+++ gcc/doc/options.texi(working copy)
@@ -364,7 +364,8 @@ for later processing.
 @item Alias(@var{opt})
 @itemx Alias(@var{opt}, @var{arg})
 @itemx Alias(@var{opt}, @var{posarg}, @var{negarg})
-The option is an alias for @option{-@var{opt}}.  In the first form,
+The option is an alias for @option{-@var{opt}} (or the negative form
+of that option, depending on @code{NegativeAlias}).  In the first form,
 any argument passed to the alias is considered to be passed to
 @option{-@var{opt}}, and @option{-@var{opt}} is considered to be
 negated if the alias is used in negated form.  In the second form, the
@@ -387,6 +388,13 @@ not need to handle it and no @samp{OPT_}
 for it; only the canonical form of the option will be seen in those
 places.
 
+@item NegativeAlias
+For an option marked with @code{Alias(@var{opt})}, the option is
+considered to be an alias for the positive form of @option{-@var{opt}}
+if negated and for the negative form of @option{-@var{opt}} if not
+negated.  @code{NegativeAlias} may not be used with the forms of
+@code{Alias} taking more than one argument.
+
 @item Ignore
 This option is ignored apart from printing any warning specified using
 @code{Warn}.  The option will not be seen by specs and no @samp{OPT_}
Index: gcc/opts-common.c
===
--- gcc/opts-common.c   (revision 171701)
+++ gcc/opts-common.c   (working copy)
@@ -507,6 +507,7 @@ decode_cmdline_option (const char **argv
{
  gcc_assert (option-alias_arg != NULL);
  gcc_assert (arg == NULL);
+ gcc_assert (!(option-flags  CL_NEGATIVE_ALIAS));
  if (value)
arg = option-alias_arg;
  else
@@ -517,9 +518,13 @@ decode_cmdline_option (const char **argv
{
  gcc_assert (value == 1);
  gcc_assert (arg == NULL);
+ gcc_assert (!(option-flags  CL_NEGATIVE_ALIAS));
  arg = option-alias_arg;
}
 
+ if (option-flags  CL_NEGATIVE_ALIAS)
+   value = !value;
+
  opt_index = new_opt_index;
  option = new_option;
 
Index: gcc/opts.h
===
--- gcc/opts.h  (revision 171701)
+++ gcc/opts.h  (working copy)
@@ -115,6 +115,7 @@ extern const unsigned int cl_lang_count;
 #define CL_MISSING_OK  (1U  28) /* Missing argument OK (joined).  */
 #define CL_UINTEGER(1U  29) /* Argument is an integer =0.  */
 #define CL_UNDOCUMENTED(1U  30) /* Do not output with 
--help.  */
+#define CL_NEGATIVE_ALIAS  (1U  31) /* Alias to negative form of option. 
 */
 
 /* Flags for an enumerated option argument.  */
 #define CL_ENUM_CANONICAL  (1  0) /* Canonical for this value.  */
Index: gcc/optc-gen.awk
===
--- gcc/optc-gen.awk(revision 171701)
+++ gcc/optc-gen.awk(working copy)
@@ -362,6 +362,11 @@ for (i = 0; i  n_opts; i++) {
print #error Alias with single argument  \

Re: [RFC][patch] If-conversion of COMPONENT_REFs

2011-03-30 Thread Ira Rosen

On 30 March 2011 14:41, Richard Guenther richard.guent...@gmail.com wrote:
 On Wed, Mar 30, 2011 at 2:22 PM, Ira Rosen ira.ro...@linaro.org wrote:
 On 30 March 2011 12:59, Richard Guenther richard.guent...@gmail.com wrote:
 On Wed, Mar 30, 2011 at 11:13 AM, Ira Rosen ira.ro...@linaro.org wrote:
 Hi,

 With this patch a data-ref is marked as unconditionally read or
 written also if its adjacent field is read or written unconditionally
 in the loop.
 My concern is that this is not safe enough, even though the fields
 have to be non-pointers and non-aggregates, and this optimization is
 applied only with -ftree-loop-if-convert-stores flag.

 Bootstrapped on powerpc64-suse-linux and tested on x86_64-suse-linux.

 OK for trunk?

 The restrictions do not make too much sense to me.  For the C++
 memory model we can't do speculative stores at all, but for the
 rest I'd say just checking if the data-refs access the same object
 is enough, thus, instead of same_data_refs (a, b) simply check
 operand_equal_p (DR_BASE_ADDRESS (a), DR_BASE_ADDRESS (b), 0)
 or operand_equal_p (get_base_address (DR_REF (a)), get_base_address
 (DR_REF (b)), 0), whatever makes more sense (I always confuse what
 the contents of the various DR fields are).

 But what about

 int a[10], b[100], c[100];
 for (i = 0; i  100; i++)
  {
     if (i  10)
       t = a[i];
    else
       t = b[i];

    c[i] = t + a[1];
   }

 We can't transform this to

 int a[10], b[100], c[100];
 for (i = 0; i  100; i++)
  {
     t1 = a[i];
     t2 = b[i];
     t = (i  10) ? t1 : t2;
     c[i] = t + a[1];
   }

 since we create accesses to a[10:100], right?

 That's correct - we may only ever create valid accesses, but any
 valid access to a is ok after we see an unconditional access to a.

 So we probably have to restrict ourselves to DECL_P (get_base_address ())
 objects and make sure we don't access past it.

 Thus, I see what you were trying to do - may I suggest sth like

  ref_base_a = DR_REF (a);
  while (TREE_CODE (ref_base_a) == COMPONENT_REF
           || TREE_CODE (ref_base_a) == IMAGPART_EXPR
           || TREE_CODE (ref_base_a) == REALPART_EXPR)
    ref_base_a = TREE_OPERAND (ref_base_a, 0);

 ... same for DR_REF (b) ...

  if (operand_equal_p (ref_base_a, ref_base_b, 0))
    ok

 that is, strip all non-variable offset handled_component_refs and compare
 the remaining base of the two accesses.  If they are equal we are ok.

 Any hole in that?

I don't see any :) I'll test your version.

Thanks,
Ira


 Thanks,
 Richard.

 Thanks,
 Ira


 Thanks,
 Richard.

 Thanks,
 Ira


 ChangeLog:

         * tree-if-conv.c (memrefs_read_or_written_unconditionally): Return 
 true
         if an adjacent field of the data-ref is accessed unconditionally.

 testsuite/ChangeLog:

         * gcc.dg/vect/if-cvt-stores-vect-ifcvt-18.c: New test.
         * gcc.dg/vect/vect.exp: Run if-cvt-stores-vect* tests with
         -ftree-loop-if-convert-stores.

RX: Sync with 4.6 branch

2011-03-30 Thread Nick Clifton

Hi Guys,

  I am applying the patch below to sync the RX backend with the 4.6
  branch.  In practice this means bringing in the peepholes to combine
  extending loads and simple arithmetic expressions, and adjusting the
  memory move cost cost for stores.  (The insn length attribute is
  needed for the rx_max_skip_for_label function).

Cheers
  Nick

gcc/ChangeLog
2011-03-30  Nick Clifton  ni...@redhat.com

* config/rx/rx.md: Add peepholes and patterns to combine
extending loads and simple arithmetic instructions.
* config/rx/rx.h (ADJUST_INSN_LENGTH): Define.
* config/rx/rx-protos.h (rx_adjust_insn_length): Prototype.
* config/rx/rx.c (rx_is_legitimate_address): Allow QI and HI
modes to use pre-decrement and post-increment addressing.
(rx_is_restricted_memory_address): Add range checking of REG+INT
addresses.
(rx_print_operand): Add support for %Q.
Fix handling of %Q.
(rx_memory_move_cost): Adjust cost of stores.
(rx_adjust_insn_length): New function.

Index: gcc/config/rx/rx.h
===
--- gcc/config/rx/rx.h  (revision 171716)
+++ gcc/config/rx/rx.h  (working copy)
@@ -630,3 +630,10 @@
 #define REGISTER_MOVE_COST(MODE,FROM,TO) 2
 
 #define SELECT_CC_MODE(OP,X,Y)  rx_select_cc_mode(OP, X, Y)
+
+#define ADJUST_INSN_LENGTH(INSN,LENGTH)\
+  do   \
+{  \
+  (LENGTH) = rx_adjust_insn_length ((INSN), (LENGTH)); \
+}  \
+  while (0)
Index: gcc/config/rx/rx-protos.h
===
--- gcc/config/rx/rx-protos.h   (revision 171716)
+++ gcc/config/rx/rx-protos.h   (working copy)
@@ -31,16 +31,17 @@
 extern int rx_initial_elimination_offset (int, int);
 
 #ifdef RTX_CODE
+extern int rx_adjust_insn_length (rtx, int);
 extern void rx_emit_stack_popm (rtx *, bool);
 extern void rx_emit_stack_pushm (rtx *);
 extern voidrx_expand_epilogue (bool);
 extern char *  rx_gen_move_template (rtx *, bool);
 extern boolrx_is_legitimate_constant (rtx);
 extern boolrx_is_restricted_memory_address (rtx, Mmode);
+extern boolrx_match_ccmode (rtx, Mmode);
 extern voidrx_notice_update_cc (rtx body, rtx insn);
 extern voidrx_split_cbranch (Mmode, Rcode, rtx, rtx, rtx);
 extern Mmode   rx_select_cc_mode (Rcode, rtx, rtx);
-extern boolrx_match_ccmode (rtx, Mmode);
 #endif
 
 #endif /* GCC_RX_PROTOS_H */
Index: gcc/config/rx/rx.md
===
--- gcc/config/rx/rx.md (revision 171716)
+++ gcc/config/rx/rx.md (working copy)
@@ -1545,6 +1545,139 @@
(set_attr length 3,4,5,6,7,6)]
 )
 
+;; A set of peepholes to catch extending loads followed by arithmetic 
operations.
+;; We use iterators where possible to reduce the amount of typing and hence the
+;; possibilities for typos.
+
+(define_code_iterator extend_types [(zero_extend ) (sign_extend )])
+(define_code_attr letter   [(zero_extend R) (sign_extend Q)])
+
+(define_code_iterator memex_commutative [(plus ) (and ) (ior ) (xor )])
+(define_code_iterator memex_noncomm [(div ) (udiv ) (minus )])
+(define_code_iterator memex_nocc[(smax ) (smin ) (mult )])
+
+(define_code_attr op[(plus add) (and and) (div div) 
(udiv divu) (smax max) (smin min) (mult mul) (ior or) (minus sub) 
(xor xor)])
+
+(define_peephole2
+  [(set (match_operand:SI   0 register_operand)
+   (extend_types:SI (match_operand:small_int_modes 1 
rx_restricted_mem_operand)))
+   (parallel [(set (match_operand:SI2 register_operand)
+  (memex_commutative:SI (match_dup 0)
+(match_dup 2)))
+ (clobber (reg:CC CC_REG))])]
+  peep2_regno_dead_p (2, REGNO (operands[0]))
+  [(parallel [(set:SI (match_dup 2)
+ (memex_commutative:SI (match_dup 2)
+   (extend_types:SI (match_dup 1
+ (clobber (reg:CC CC_REG))])]
+)
+
+(define_peephole2
+  [(set (match_operand:SI   0 register_operand)
+   (extend_types:SI (match_operand:small_int_modes 1 
rx_restricted_mem_operand)))
+   (parallel [(set (match_operand:SI2 register_operand)
+  (memex_commutative:SI (match_dup 2)
+(match_dup 0)))
+ (clobber (reg:CC CC_REG))])]
+  peep2_regno_dead_p (2, REGNO (operands[0]))
+  [(parallel [(set:SI (match_dup 2)
+ (memex_commutative:SI (match_dup 2)
+

Re: [cxx-mem-model] bitfield tests

2011-03-30 Thread Aldy Hernandez




The memory model is not implementable on strict-alignment targets
that do not have a byte store operation.  But we previously said that ;)


Yes.  I think we should issue an error when we have such a target and 
the user tries -fmemory-model=c++0x.  However, how many strict-alignment 
targets are not byte addressable nowadays?



Also consider global vars

char a;
char b;

accessing them on strict-align targets may access adjacent globals
(that's a problem anyway, also with alias analysis).


Good point.  I am adding a test to that effect (see attached patch).

BTW, I assume you mean strict-align targets WITHOUT byte-addressability 
as above.  I have spot-checked your scenario on a handful of important 
targets that have strict alignment, and all of them work without 
touching adjacent global vars:


arm-elf OK
sparc-linux OK
ia64-linux  OK
alpha-linux OK, but only with -mbwx (byte addressability)

rth tells me that we shouldn't worry about ancient non-byte addressable 
Alphas, so the last isn't an issue.


So... do you have any important targets in mind, because I don't see 
this being a problem for most targets?  As can be expected, I am only 
interested in x86*, powerpc*, and s390, especially since a cursory 
glance on other important targets didn't exhibit any problems.  However, 
given my target bias, I am willing to look into any important targets 
that are problematic (I'm hoping none :)).


Let me know if you see anything else, and please take a quick peek at 
the attached patch below, which I will be committing shortly.


As usual, thanks.
Aldy
Index: testsuite/gcc.dg/memmodel/strict-align-global.c
===
--- testsuite/gcc.dg/memmodel/strict-align-global.c (revision 0)
+++ testsuite/gcc.dg/memmodel/strict-align-global.c (revision 0)
@@ -0,0 +1,46 @@
+/* { dg-do link } */
+/* { dg-options -O2 --param allow-packed-store-data-races=0 } */
+/* { dg-final { memmodel-gdb-test } } */
+
+#include stdio.h
+#include memmodel.h
+
+/* This test verifies writes to globals do not write to adjacent
+   globals.  This mostly happens on strict-align targets that are not
+   byte addressable (old Alphas, etc).  */
+
+char a = 0;
+char b = 77;
+
+void memmodel_other_threads() 
+{
+}
+
+int memmodel_step_verify()
+{
+  if (b != 77)
+{
+  printf(FAIL: Unexpected value.  b is %d, should be 77\n, b);
+  return 1;
+}
+  return 0;
+}
+
+/* Verify that every variable has the correct value.  */
+int memmodel_final_verify()
+{
+  int ret = memmodel_step_verify ();
+  if (a != 66)
+{
+  printf(FAIL: Unexpected value.  a is %d, should be 66\n, a);
+  return 1;
+}
+  return ret;
+}
+
+int main ()
+{
+  a = 66;
+  memmodel_done();
+  return 0;
+}

Re: [cxx-mem-model] bitfield tests

2011-03-30 Thread Richard Guenther

On Wed, Mar 30, 2011 at 4:11 PM, Aldy Hernandez al...@redhat.com wrote:

 The memory model is not implementable on strict-alignment targets
 that do not have a byte store operation.  But we previously said that ;)

 Yes.  I think we should issue an error when we have such a target and the
 user tries -fmemory-model=c++0x.  However, how many strict-alignment targets
 are not byte addressable nowadays?

 Also consider global vars

 char a;
 char b;

 accessing them on strict-align targets may access adjacent globals
 (that's a problem anyway, also with alias analysis).

 Good point.  I am adding a test to that effect (see attached patch).

 BTW, I assume you mean strict-align targets WITHOUT byte-addressability as
 above.  I have spot-checked your scenario on a handful of important targets
 that have strict alignment, and all of them work without touching adjacent
 global vars:

        arm-elf         OK
        sparc-linux     OK
        ia64-linux      OK
        alpha-linux     OK, but only with -mbwx (byte addressability)

 rth tells me that we shouldn't worry about ancient non-byte addressable
 Alphas, so the last isn't an issue.

 So... do you have any important targets in mind, because I don't see this
 being a problem for most targets?  As can be expected, I am only interested
 in x86*, powerpc*, and s390, especially since a cursory glance on other
 important targets didn't exhibit any problems.  However, given my target
 bias, I am willing to look into any important targets that are problematic
 (I'm hoping none :)).

Well, I'm not sure that strict-align targets that provide byte access do
not simply hide the issue inside the CPU (thus, perform the read-modify-write
there and do not guarantee any atomicity unless you ask for it).  It might
be even worse - targets might not even guarantee this for shared cache-lines
(for non-ccNUMA architectures).  But I'm no expert here, but certainly
every possible weird CPU architecture has been implemented.

Richard.

Re: Continue toplevel cleanup (GCC library handling for unsupported targets etc.)

2011-03-30 Thread Richard Earnshaw


On Tue, 2011-03-29 at 20:55 +, Joseph S. Myers wrote:
 This patch continues cleaning up the toplevel configure.ac, in particular 
 as regards cases handling GCC libraries for targets where GCC is no longer 
 (or never was) supported.
 
 The principle there, as discussed for the original deprecated targets 
 removal patch, is that in the absence of GCC support for a target it 
 doesn't matter exactly how toplevel is configured to build GCC libraries 
 (or other GCC-related directories) for that target - builds will and 
 should fail if GCC is included in the source tree - so toplevel 
 configure.ac should be set up for maximal simplicity.  In most cases that 
 means no explicit mention of GCC libraries for such a target; in some 
 cases it means removing the target entry altogether so that another case 
 becomes active instead (possibly the *-*-* case that disables libgcj 
 only).
 
 In general cases just disabling libgcj are redundant with the *-*-* case 
 unless they prevent some other case being active for the target, and so 
 are removed.  There were also a couple of removed cases for targets with 
 no support in GCC or src at all (romp-*-*, vax-*-vms) - if there's no 
 support in GCC or src, it's right that a build should fail trying to build 
 whatever sources you have there, and silly for toplevel to try disabling 
 everything because nothing is supported.
 
 Other cleanups here: noconfigdirs=$noconfigdirs is completely useless 
 and was removed; c54x*-*-* is always mapped to tic54x-*-* by config.sub so 
 that case can be simplified; cris*-*-none acts just like cris*-*-elf in 
 config.gcc so it's appropriate to make the * subcase of cris*-*-* act 
 like the -elf case; mmix-*-* disabled libgloss, i.e. libgloss for the 
 host, which is never built anyway.
 
 OK to commit?
 
 2011-03-29  Joseph Myers  jos...@codesourcery.com
 

   (arm-*-coff): Don't disable libgcj.
   (arm*-*-linux-gnueabi): Remove useless assignment.
   (arm-*-riscix*): Don't disable libgcj.


RISC iX support was removed from GCC years ago.  Looks like a tiny
fragment left over that wasn't cleaned up.  The other bits are fine with
me.

R.

Re: [PATCH: ARM] Replace define_constants with define_c_enum for unspec/unspec_volatile in backend

2011-03-30 Thread Richard Earnshaw


On Tue, 2011-01-18 at 16:56 +, Yufeng Zhang wrote:
 This patch replaces define_constants in the ARM backend with
 define_c_enum for defining the available indexes for the
 unspecs/unspecvs expressions. This improves the readability of the
 intermediate dumps for machine-specific operations. For instance, given
 the following c code:
 
 int foo (char* s);
 extern char hello[];
 
 void test(int x)
 {
   int y = x + 4;
   foo(hello);
 }
 
 If compiled with -c -fpic, the diff of the -fdump-rtl-expand dump
 between pre-patch and post-patch is:
 
 @@ -24,10 +24,10 @@
  (const:SI (unspec:SI [
  (const:SI (plus:SI (unspec:SI [
  (const_int 0 [0])
 -] 21)
 +] UNSPEC_PIC_LABEL)
  (const_int 8 [0x8])))
 -] 24))
 -] 3)) test.c:43 -1
 +] UNSPEC_GOTSYM_OFF))
 +] UNSPEC_PIC_SYM)) test.c:43 -1
   (nil))
 
  (insn 10 9 11 2 (set (reg:SI 137)
 @@ -35,7 +35,7 @@
  (reg:SI 137)
  (const_int 8 [0x8])
  (const_int 0 [0])
 -] 4)) test.c:43 -1
 +] UNSPEC_PIC_BASE)) test.c:43 -1
   (nil))
 
  (insn 11 10 5 2 (use (reg:SI 137)) test.c:43 -1
 
 Having the indexes defined by define_c_enum, GCC will use the
 enumerators' names rather than some magic numbers when printing out
 unspec/unspec_volatile expressions.
 
 The patch does not change the code generation. The post-patch test
 result is the same as the pre-patch result (tested on qemu for armv7-a
 and arm-eabi).
 
 OK for the trunk?
 
 Cheers,
 Yufeng
 
 
 2011-01-18  Yufeng Zhang  yufeng.zh...@arm.com
 
 * config/arm/arm.md (define_constants for unspec): Replace with
 define_c_enum.
 (define_constants for unspecv): Replace with define_c_enum.
 * config/arm/neon.md (define_constants for unspec): Replace with
 define_c_enum.

OK.

R.

[Patch] bfin: fix profiling

2011-03-30 Thread Henderson, Stuart

The attached patch allows long jumps to __mcount, defines 
PROFILE_BEFORE_PROLOGUE and ensures ASM_OUTPUT_REG_PUSH pre-decrements the 
stack pointer.


2011-03-30  Stuart Henderson  stuart.hender...@analog.com

From Bernd Schmidt
* config/bfin/bfin.h (FUNCTION_PROFILER): Take TARGET_LONG_CALLS into
account and save/restore RETS.
(PROFILE_BEFORE_PROLOGUE): Define.
(ASM_OUTPUT_REG_PUSH, ASM_OUTPUT_REG_POP): Add tab character.  Correct
the push insn to use predecrement.


Thanks,
Stu


upstream.patch
Description: upstream.patch

[PATCH, Fortran] Correct declaration of frexp and friends

2011-03-30 Thread Duncan Sands


While working on the dragonegg plugin I noticed that the Fortran front-end
declares frexp with the parameters the wrong way round.  Instead of
  double frexp(double x, int *exp);
it is declared as
  double frexp(int *exp, double x);
This is fairly harmless but might as well be fixed, so here is a patch (as far
as I can see fntype[4] is only used in declaring the frexp family of functions).
Bootstraps and has no impact on the Fortran testsuite (tested on mainline).  OK
to apply on mainline and the 4.5 and 4.6 branches?

Proposed fortran/Changelog entry:
2011-03-30  Duncan Sands  baldr...@free.fr

* f95-lang.c (build_builtin_fntypes): Swap frexp parameter types.


Index: gcc/fortran/f95-lang.c
===
--- gcc/fortran/f95-lang.c  (revision 171716)
+++ gcc/fortran/f95-lang.c  (working copy)
@@ -695,10 +695,9 @@
 type, integer_type_node, NULL_TREE);
   /* type (*) (void) */
   fntype[3] = build_function_type_list (type, NULL_TREE);
-  /* type (*) (int, type) */
-  fntype[4] = build_function_type_list (type,
+  /* type (*) (type, int) */
+  fntype[4] = build_function_type_list (type, type,
 build_pointer_type (integer_type_node),
-type,
 NULL_TREE);
   /* type (*) (int, type) */
   fntype[5] = build_function_type_list (type,

Re: [ARM] Define unspecs using define_c_enum

2011-03-30 Thread Richard Earnshaw


On Tue, 2011-03-29 at 15:38 +0100, Richard Sandiford wrote:
 This ARM patch allows *UNSPEC_* constants to be printed in dump files.
 It's very much a target decision whether this is worth doing, but just
 in case...
 
 ...tested on arm-linux-gnueabi.  OK to install?
 
 Richard
 
 
 gcc/
   * config/arm/neon.md: Use define_c_enum to define UNSPEC* and
   VUNSPEC* constants.
   * config/arm/arm.md: Likewise.  Move the synchronization ones to...
   * config/arm/sync.md: ...here.
 


Hmm, seems we have two patches posted to do this.  I've just approved an
earlier post that came in while we were in pre-release lock-down.

R.

Re: fix for 48208 and 48260 on darwin

2011-03-30 Thread Mike Stump

On Mar 30, 2011, at 5:41 AM, gcchelp.5.ad...@0sg.net wrote:
 Tests that now fail, but worked before:
 
 g++.dg/warn/Wstrict-aliasing-float-ref-int-obj.C  (test for warnings, line 7)
 
 Tests that now work, but didn't before:
 
 g++.dg/warn/Wstrict-aliasing-float-ref-int-obj.C  (test for warnings, line 7)

Yeah, someone thought it would be funny to have two test cases with the same 
exact name, and then to have one of them fail and the other one work, so, 
technically, what it said is true.

I've now applied the patch for you.  In the future, by sure to attach the diffs 
as a file, as Mail eats white spacing in odd ways which corrupts patches.

Thanks.


2011-03-30  Christian SchÃ¼ler  cschue...@gmx.de

PR/driver 48208
* config/c.opt (F): Added 'Driver' to -F option.
PR/driver 48260
* config/darwin-driver.c (darwin_driver_init): Add '-arch' to handler 
function.
* config/darwin.opt: Added '-arch' option.

Index: c-family/c.opt
===
--- c-family/c.opt  (revision 171691)
+++ c-family/c.opt  (working copy)
@@ -201,7 +201,7 @@
 C ObjC C++ ObjC++ Undocumented Var(flag_preprocess_only)
 
 F
-C ObjC C++ ObjC++ Joined Separate MissingArgError(missing path after %qs)
+Driver C ObjC C++ ObjC++ Joined Separate MissingArgError(missing path after 
%qs)
 -F dir   Add dir to the end of the main framework include path
 
 H
Index: config/darwin.opt
===
--- config/darwin.opt   (revision 171691)
+++ config/darwin.opt   (working copy)
@@ -31,6 +31,9 @@
 allowable_client
 Driver Separate Alias(Zallowable_client)
 
+arch
+Driver RejectNegative Separate
+
 arch_errors_fatal
 Driver Alias(Zarch_errors_fatal)
 
Index: config/darwin-driver.c
===
--- config/darwin-driver.c  (revision 171691)
+++ config/darwin-driver.c  (working copy)
@@ -161,6 +161,15 @@
continue;
   switch ((*decoded_options)[i].opt_index)
{
+#if DARWIN_X86
+   case OPT_arch:
+ if (!strcmp ((*decoded_options)[i].arg, i386))
+   generate_option (OPT_m32, NULL, 1, CL_DRIVER, 
(*decoded_options)[i]);
+ else if (!strcmp ((*decoded_options)[i].arg, x86_64))
+   generate_option (OPT_m64, NULL, 1, CL_DRIVER, 
(*decoded_options)[i]);
+ break;
+#endif
+
case OPT_filelist:
case OPT_framework:
  ++*decoded_options_count;

Re: Move reg_equiv* arrays into a single VEC structure

2011-03-30 Thread Richard Guenther

On Wed, Mar 30, 2011 at 4:23 PM, Jeff Law l...@redhat.com wrote:
 -BEGIN PGP SIGNED MESSAGE-
 Hash: SHA1

 As originally discussed here:

 http://gcc.gnu.org/ml/gcc-patches/2010-05/msg00987.html

 Changes since the original submission:

 Per Richi's suggestion I changed the patch to use a GC'd VEC.  I defined
 accessor macros so that a ton of reformatting isn't needed and finally I
 made the appropriate tweaks to the few backends that peeked at the
 reg_equiv arrays.

 Given the changes and the length of time since the original submission,
 I think it's probably best to get a fresh approval rather than rely on
 the prior approval.

Looks good to me.

Thanks,
Richard.

 Bootstrapped and regression tested on x86_64-unknown-linux-gnu.

 -BEGIN PGP SIGNATURE-
 Version: GnuPG v1.4.11 (GNU/Linux)
 Comment: Using GnuPG with Fedora - http://enigmail.mozdev.org/

 iQEcBAEBAgAGBQJNkzznAAoJEBRtltQi2kC7VgYH/0yTkwWkDRyalReMk3whdIXO
 8qw5H6c9Gz+Yj+EnnPySKsFIvJKMBTQHIpdCzjTCVWD/Z7LSJwERzzlNCPrQu2au
 dpOoUYCTAwgSW0Us9B+2Bcf2DABinYLV+hgKAKFEVi98CheZe3hZZ14lm5mlYDec
 INYKyfqYHmyahT8fa6ABY2kp0X2xhQhJ0VAGPI34kytJpgLpIdtRwq6PsdsPM0PJ
 frLAY5xIEmJqBB30RaPqnD07u06xZHi+S9gfAJa4LJTUqVALNusYdZzZajKMtF3i
 BCXK4UFk+J2MlM9xZkVWqQiryLc6arVT2bMQcvz7tXTMZkY6ZdZ7sKFBgKea6vM=
 =iMqQ
 -END PGP SIGNATURE-

Re: [ARM] Define unspecs using define_c_enum

2011-03-30 Thread Richard Sandiford

Richard Earnshaw rearn...@arm.com writes:
 On Tue, 2011-03-29 at 15:38 +0100, Richard Sandiford wrote:
 This ARM patch allows *UNSPEC_* constants to be printed in dump files.
 It's very much a target decision whether this is worth doing, but just
 in case...
 
 ...tested on arm-linux-gnueabi.  OK to install?
 
 Richard
 
 
 gcc/
  * config/arm/neon.md: Use define_c_enum to define UNSPEC* and
  VUNSPEC* constants.
  * config/arm/arm.md: Likewise.  Move the synchronization ones to...
  * config/arm/sync.md: ...here.
 


 Hmm, seems we have two patches posted to do this.  I've just approved an
 earlier post that came in while we were in pre-release lock-down.

Oops, sorry, I missed Yufeng's patch.  FWIW, the differences in my patch
are that I've got rid of the (documented as) unused UNSPEC_SIN and
UNSPEC_COS and moved the sync stuff to sync.md.  I assumed the old
code did things that way in order to make the numbering more obvious,
but that isn't an issue any more.

Richard

Re: PING [patch, testsuite] Fix a short-enums testsuite error

2011-03-30 Thread Richard Earnshaw


On Fri, 2011-02-18 at 10:56 +, Ian Bolton wrote:
  Is the following patch OK for trunk?
  
  
  2011-01-26  Ian Bolton  ian.bol...@arm.com
  
  * testsuite/gcc.dg/20100906-1.c: Use -fno-short-enums
  option for target arm_eabi.
  
  
  Index: gcc/testsuite/gcc.dg/20100906-1.c
  ===
  --- gcc/testsuite/gcc.dg/20100906-1.c   (revision 168543)
  +++ gcc/testsuite/gcc.dg/20100906-1.c   (working copy)
  @@ -1,5 +1,6 @@
   /* { dg-do run } */
   /* { dg-options -O2 } */
  +/* { dg-options -O2 -fno-short-enums -Wl,--no-enum-size-warning
  {target
  arm_eabi} } */
  
   /* This testcase got misoptimized by combine due to a wrong setting of
  subst_low_luid in try_combine.  */
 
 PING.
 
 

OK.

R.

Re: [cxx-mem-model] bitfield tests

2011-03-30 Thread Mike Stump

On Mar 30, 2011, at 7:40 AM, Richard Guenther wrote:
 Is forcing word-alignment too big of a hammer, or will the users for these
 architectures be content with having no support for the C++0x memory model?
 
 I think a memory model that cannot be reasonably (read: also fast) implemented
 on all HW is screwed from the start and we should simply ditch it.  Which
 is because nobody will use it as you cannot rely on it when writing
 portable programs or it will be hell slow.

I agree 100%.  If the standards people can't write a decent standard, they 
ought not write it.  I torpedoed someone refining volatile, which would have 
been nice to have, because people were laying tracks down the wrong way.  Nuke 
em from orbit I say.  Now, I'm sure we have it all wrong and the standard is 
entirely reasonable...  right?

Re: Cleaning up expand optabs code

2011-03-30 Thread Richard Henderson

On 03/30/2011 01:53 AM, Richard Sandiford wrote:
 This comes back to the point that we really should know up-front what
 modes op0 and op1 actually have.  (The thing I left as a future clean-up.)
 Until then, the process implemented by yesterday's patch was supposed to be:
 
- work out what mode opN actually has at this point in time
- see if the target has specifically asked for a different mode
- if so, convert the operand

Ok, I get it.  The patch is ok as-is.


r~

Re: [PATCH][ARM] Tweak arm_class_likely_spilled_p, MODE_BASE_REG_CLASS for Thumb-2

2011-03-30 Thread Richard Earnshaw


On Mon, 2011-02-14 at 14:20 +, Andrew Stubbs wrote:
 This patch is a rework of an old one:
 
 http://gcc.gnu.org/ml/gcc-patches/2010-06/msg01080.html
 
 The ARM parts of that patch were approved, but the target independent 
 parts were never reviewed (AFAICT), and the patch no longer applies.
 
 I've updated the target-specific parts. As far as I can tell, the target 
 independent parts are no longer required, so I've dropped them.
 
 Tested with no regressions for ARM mode and Thumb2 mode.
 
 OK?
 
 Andrew

2011-02-14  Andrew Stubbs  a...@codesourcery.com
Julian Brown  jul...@codesourcery.com
Mark Shinwell  shinw...@codesourcery.com

gcc/
* config/arm/arm.h (arm_class_likely_spilled_p): Check against
LO_REGS only for Thumb-1.
(MODE_BASE_REG_CLASS): Restrict base registers to those which can
be used in short instructions when optimising for size on Thumb-2.

OK.

R.

Re: [PATCH][ARM] Limit thumb2 test to thumb2 targets

2011-03-30 Thread Richard Earnshaw


On Thu, 2011-02-10 at 15:29 +, Andrew Stubbs wrote:
 I'm posting this patch on behalf of Jie.
 
 The patch simply prevents a test failure on non-thumb2 targets.
 
 OK?
 
 Andrew


2010-09-30  Jie Zhang  j...@codesourcery.com

gcc/testsuite/
* gcc.target/arm/neon-thumb2-move.c: Add
dg-require-effective-target arm_thumb2_ok.

Just in case this never got checked in, OK.

R.

Re: Continue toplevel cleanup (GCC library handling for unsupported targets etc.)

2011-03-30 Thread Joseph S. Myers

On Wed, 30 Mar 2011, Richard Earnshaw wrote:

  2011-03-29  Joseph Myers  jos...@codesourcery.com
  
 
  (arm-*-coff): Don't disable libgcj.
  (arm*-*-linux-gnueabi): Remove useless assignment.
  (arm-*-riscix*): Don't disable libgcj.
 
 
 RISC iX support was removed from GCC years ago.  Looks like a tiny
 fragment left over that wasn't cleaned up.  The other bits are fine with
 me.

The RISC iX case is much like the Solaris/PowerPC one I commented on in 
http://gcc.gnu.org/ml/gcc-patches/2011-03/msg01744.html: it appears 
there is still assembler support.  What I'm doing here is simply cleaning 
things up by removing mentions of *GCC* directories for targets without 
GCC support (often ones where GCC support was removed long ago).

It's quite possible that some obsolete targets should have their binutils 
(and newlib) support removed, leading to further toplevel simplification 
(gdb has been more active in pruning unmaintained targets - but it's also 
the case that a proliferation of targets is more likely to cause problems 
there than in binutils and newlib).  But I think it's really for the 
binutils maintainers to work out what they think is worth removing.

-- 
Joseph S. Myers
jos...@codesourcery.com

[PATCH] Make renumber_gimple_stmt_uids and renumber_gimple_stmt_uids_in_bbs consistent

2011-03-30 Thread Richard Guenther


As people noted one computes UIDs for PHIs and one doesn't.  Now I'm
working on sth that needs it for PHIs, thus the following fix.
We have to exclude LTO as it re-builds PHIs and gets confused with
inconsistent numbering.

Bootstrapped and tested on x86_64-unknown-linux-gnu, applied.

Richard.

2011-03-30  Richard Guenther  rguent...@suse.de

* tree-dfa.c (renumber_gimple_stmt_uids): Also number PHIs.
* lto-streamer-out.c (output_function): Do not use
renumber_gimple_stmt_uids.
* lto-streamer-in.c (input_function): Likewise.

Index: gcc/tree-dfa.c
===
--- gcc/tree-dfa.c  (revision 171716)
+++ gcc/tree-dfa.c  (working copy)
@@ -151,6 +151,11 @@ renumber_gimple_stmt_uids (void)
   FOR_ALL_BB (bb)
 {
   gimple_stmt_iterator bsi;
+  for (bsi = gsi_start_phis (bb); !gsi_end_p (bsi); gsi_next (bsi))
+   {
+ gimple stmt = gsi_stmt (bsi);
+ gimple_set_uid (stmt, inc_gimple_stmt_max_uid (cfun));
+   }
   for (bsi = gsi_start_bb (bb); !gsi_end_p (bsi); gsi_next (bsi))
{
  gimple stmt = gsi_stmt (bsi);
Index: gcc/lto-streamer-out.c
===
--- gcc/lto-streamer-out.c  (revision 171716)
+++ gcc/lto-streamer-out.c  (working copy)
@@ -1981,8 +2031,19 @@ output_function (struct cgraph_node *nod
   /* We will renumber the statements.  The code that does this uses
  the same ordering that we use for serializing them so we can use
  the same code on the other end and not have to write out the
- statement numbers.  */
-  renumber_gimple_stmt_uids ();
+ statement numbers.  We do not assign UIDs to PHIs here because
+ virtual PHIs get re-computed on-the-fly which would make numbers
+ inconsistent.  */
+  set_gimple_stmt_max_uid (cfun, 0);
+  FOR_ALL_BB (bb)
+{
+  gimple_stmt_iterator gsi;
+  for (gsi = gsi_start_bb (bb); !gsi_end_p (gsi); gsi_next (gsi))
+   {
+ gimple stmt = gsi_stmt (gsi);
+ gimple_set_uid (stmt, inc_gimple_stmt_max_uid (cfun));
+   }
+}
 
   /* Output the code for the function.  */
   FOR_ALL_BB_FN (bb, fn)
Index: gcc/lto-streamer-in.c
===
--- gcc/lto-streamer-in.c   (revision 171716)
+++ gcc/lto-streamer-in.c   (working copy)
@@ -1287,7 +1287,16 @@ input_function (tree fn_decl, struct dat
 
   /* Fix up the call statements that are mentioned in the callgraph
  edges.  */
-  renumber_gimple_stmt_uids ();
+  set_gimple_stmt_max_uid (cfun, 0);
+  FOR_ALL_BB (bb)
+{
+  gimple_stmt_iterator gsi;
+  for (gsi = gsi_start_bb (bb); !gsi_end_p (gsi); gsi_next (gsi))
+   {
+ gimple stmt = gsi_stmt (gsi);
+ gimple_set_uid (stmt, inc_gimple_stmt_max_uid (cfun));
+   }
+}
   stmts = (gimple *) xcalloc (gimple_stmt_max_uid (fn), sizeof (gimple));
   FOR_ALL_BB (bb)
 {

[PATCH] Fix expansion issues on type changing MEM_REFs on LHS (PR middle-end/48335)

2011-03-30 Thread Jakub Jelinek

Hi!

MEM_REFs which can represent type punning on lhs don't force
non-gimple types to be addressable.  This causes various problems
in the expander, which wasn't prepared to handle that.

This patch tries to fix what I've found and adds a bunch of
testcases.  The original report was with just -O2 on some large testcase
from Eigen, most of the testcases have -fno-tree-sra just because
I've given up on delta when it stopped reducing at 32KB.

The first problem (the one from Eigen) is _mm_store_pd into
a std::complexdouble var, which is a single field and thus
has DCmode TYPE_MODE.  As starting with 4.6 that var is not
TREE_ADDRESSABLE, its DECL_RTL is a CONCAT, and for assignments
to concat expand_assignment was expecting either that
from has COMPLEX_TYPE (and matching mode to the store), or
that it is a real or imaginary subpart store, thus when
trying to store a V2DF mode value it expected it to be
real part store (bitpos == 0) and tried to set a DFmode pseudo
from V2DFmode rtx.
Further testing revealed that it is possible to hit many other
cases with CONCAT destination, it can be a store of just a few bits,
or can overlap parts of both real and imaginary, or be partially
out of bounds.
The patch handles the case from Eigen - bitpos == 0 bitsize ==
GET_MODE_BITSIZE (GET_MODE (to_rtx)) non-COMPLEX_TYPE by setting
each half separately, if it is a store which is not touching
one of the parts by just adjusting bitpos for the imaginary
case and storing just to one of the parts (this is
the bitpos + bitsize  half_bitsize resp. bitpos = half_bitsize
case) and finally adds a generic slow one for the very unusual
cases with partial overlap of both.

After testing it with the testcases I wrote, I found a bunch of
other ICEs though, and reproduced them even without CONCAT
on the LHS (the testcases below which don't contain any _Complex
keyword).

Bootstrapped/regtested on x86_64-linux and i686-linux,
regtested with a cross compiler on these new testcases also
for powerpc{,64}-linux and s390{,x}-linux.

Ok for trunk and after a while for 4.6?

2011-03-30  Jakub Jelinek  ja...@redhat.com

PR middle-end/48335
* expr.c (expand_assignment): Handle all possibilities
if TO_RTX is CONCAT.
* expmed.c (store_bit_field_1): Avoid trying to create
invalid SUBREGs.
(store_split_bit_field): If SUBREG_REG (op0) or
op0 itself has smaller mode than word, return it
for offset 0 and const0_rtx for out of bounds stores.
If word is const0_rtx, skip it.

* gcc.c-torture/compile/pr48335-1.c: New test.
* gcc.dg/pr48335-1.c: New test.
* gcc.dg/pr48335-2.c: New test.
* gcc.dg/pr48335-3.c: New test.
* gcc.dg/pr48335-4.c: New test.
* gcc.dg/pr48335-5.c: New test.
* gcc.dg/pr48335-6.c: New test.
* gcc.dg/pr48335-7.c: New test.
* gcc.dg/pr48335-8.c: New test.
* gcc.target/i386/pr48335-1.c: New test.

--- gcc/expr.c.jj   2011-03-23 17:15:55.0 +0100
+++ gcc/expr.c  2011-03-30 11:38:15.0 +0200
@@ -4278,16 +4278,47 @@ expand_assignment (tree to, tree from, b
   /* Handle expand_expr of a complex value returning a CONCAT.  */
   else if (GET_CODE (to_rtx) == CONCAT)
{
- if (COMPLEX_MODE_P (TYPE_MODE (TREE_TYPE (from
+ unsigned short mode_bitsize = GET_MODE_BITSIZE (GET_MODE (to_rtx));
+ if (COMPLEX_MODE_P (TYPE_MODE (TREE_TYPE (from)))
+  bitpos == 0
+  bitsize == mode_bitsize)
+   result = store_expr (from, to_rtx, false, nontemporal);
+ else if (bitsize == mode_bitsize / 2
+   (bitpos == 0 || bitpos == GET_MODE_BITSIZE (mode1)))
+   result = store_expr (from, XEXP (to_rtx, bitpos != 0), false,
+nontemporal);
+ else if (bitpos + bitsize = mode_bitsize / 2)
+   result = store_field (XEXP (to_rtx, 0), bitsize, bitpos,
+ mode1, from, TREE_TYPE (tem),
+ get_alias_set (to), nontemporal);
+ else if (bitpos = mode_bitsize / 2)
+   result = store_field (XEXP (to_rtx, 1), bitsize,
+ bitpos - mode_bitsize / 2, mode1, from,
+ TREE_TYPE (tem), get_alias_set (to),
+ nontemporal);
+ else if (bitpos == 0  bitsize == mode_bitsize)
{
- gcc_assert (bitpos == 0);
- result = store_expr (from, to_rtx, false, nontemporal);
+ rtx from_rtx = expand_normal (from);
+ from_rtx = simplify_gen_subreg (GET_MODE (to_rtx), from_rtx,
+ TYPE_MODE (TREE_TYPE (from)), 0);
+ emit_move_insn (XEXP (to_rtx, 0),
+ read_complex_part (from_rtx, false));
+ emit_move_insn (XEXP (to_rtx, 1),
+ read_complex_part

A small patch to fix an arm bootstrap failure

2011-03-30 Thread Vladimir Makarov


The following patch is to fix an arm bootstrap failure described in

http://gcc.gnu.org/ml/gcc/2011-03/msg00499.html

when variable mode is set but not used because it is used when macro 
HONOR_REG_ALLOC_ORDER is defined.


I found that 2 variables mode in different scopes is defined in function 
assign_hard_reg.  Using only one variable will guarantee usage of the 
variable.


The patch has been committed as obvious after successful bootstrap on 
x86_64.



2011-03-30  Vladimir Makarov vmaka...@redhat.com

* ira-color.c (ira_assign_hard_reg): Use only one variable 'mode'.

Index: ira-color.c
===
--- ira-color.c (revision 171713)
+++ ira-color.c (working copy)
@@ -1622,10 +1622,11 @@ assign_hard_reg (ira_allocno_t a, bool r
  if (hard_regno = 0
   ira_class_hard_reg_index[aclass][hard_regno] = 0)
{
- enum machine_mode mode = ALLOCNO_MODE (conflict_a);
- int conflict_nregs = hard_regno_nregs[hard_regno][mode];
  int n_objects = ALLOCNO_NUM_OBJECTS (conflict_a);
+ int conflict_nregs;
 
+ mode = ALLOCNO_MODE (conflict_a);
+ conflict_nregs = hard_regno_nregs[hard_regno][mode];
  if (conflict_nregs == n_objects  conflict_nregs  1)
{
  int num = OBJECT_SUBWORD (conflict_obj);

Re: ira-improv patch has been committed

2011-03-30 Thread H.J. Lu

On Mon, Mar 28, 2011 at 6:11 PM, Vladimir Makarov vmaka...@redhat.com wrote:
 On 03/27/2011 07:25 PM, Vladimir Makarov wrote:

  I submitted the following patch.  The patch contains original
 patches from ira-improv branch, changes addressing all Keneth Zadeck's
 comments in http://gcc.gnu.org/ml/gcc-patches/2010-12/msg00457.html,
 and a minor change fixing few regression on gcc testsuite for x86/x86-64
 and
 ppc.

  The patch was successfully tested and bootstrapped on x86/x86-64,
 ppc64, and ia64.

 Here is another try to commit the patch.  The previous bootstrap failure was
 because of wrong assertion which fails in extremely rare case (multi-object
 allocnos which got hard reg set nodes different from the allocno profitable
 hard regs and the corresponding node hard regs sets happened not be
 intersected).  The fix is very small and described by the first ChangeLog
 entry.

 The patch was bootstrapped on x86/x86-64, ppc64, ia64, and i686 with options
 of H.J.'s automatic tester.

 Committed as revision 171649.

 2011-03-28  Vladimir Makarov vmaka...@redhat.com

    * ira-color.c (update_left_conflict_sizes_p): Don't assume that
    conflict object hard regset nodes have intersecting hard reg sets.

    * regmove.c (regmove_optimize): Move ira_set_pseudo_classes call
    after regstat_init_n_sets_and_refs.

    * ira.c: Add more comments at the top.
    (setup_stack_reg_pressure_class, setup_pressure_classes):
    Add comments how we compute the register pressure classes.
    (setup_allocno_and_important_classes): Add more comments.
    (setup_class_translate_array, reorder_important_classes)
    (setup_reg_class_relations): Add comments.

    * ira-emit.c: Add 2011 to the Copyright line.  Add comments at the
    start of the file.

    * ira-color.c: Add 2011 to the Copyright line.
    (assign_hard_reg):  Add more comments.
    (improve_allocation): Ditto.

    * ira-costs.c: Add 2011 to the Copyright line.
    (setup_cost_classes, setup_regno_cost_classes_by_aclass): Add more
    comments.
    (setup_regno_cost_classes_by_mode): Ditto.

    Initial patches from ira-improv branch:

    2010-08-13  Vladimir Makarov vmaka...@redhat.com

    * ira-build.c: (ira_create_object): Remove initialization of
    OBJECT_PROFITABLE_HARD_REGS.  Initialize OBJECT_ADD_DATA.
    (ira_create_allocno): Remove initialization of
    ALLOCNO_MEM_OPTIMIZED_DEST, ALLOCNO_MEM_OPTIMIZED_DEST_P,
    ALLOCNO_SOMEWHERE_RENAMED_P, ALLOCNO_CHILD_RENAMED_P,
    ALLOCNO_IN_GRAPH_P, ALLOCNO_MAY_BE_SPILLED_P, ALLOCNO_COLORABLE_P,
    ALLOCNO_NEXT_BUCKET_ALLOCNO, ALLOCNO_PREV_BUCKET_ALLOCNO,
    ALLOCNO_FIRST_COALESCED_ALLOCNO, ALLOCNO_NEXT_COALESCED_ALLOCNO.
    Initialize ALLOCNO_ADD_DATA.
    (copy_info_to_removed_store_destinations): Use ALLOCNO_EMIT_DATA
    and allocno_emit_reg instead of ALLOCNO_MEM_OPTIMIZED_DEST_P and
    ALLOCNO_REG.
    (ira_flattening): Ditto.  Use ALLOCNO_EMIT_DATA instead of
    ALLOCNO_MEM_OPTIMIZED_DEST and ALLOCNO_SOMEWHERE_RENAMED_P.

    * ira.c (ira_reallocate): Remove.
    (setup_pressure_classes): Call
    ira_init_register_move_cost_if_necessary.  Use
    ira_register_move_cost instead of ira_get_register_move_cost.
    (setup_allocno_assignment_flags): Use ALLOCNO_EMIT_DATA.
    (ira): Call ira_initiate_emit_data and ira_finish_emit_data.

    * ira-color.c: Use ALLOCNO_COLOR_DATA instead of
    ALLOCNO_IN_GRAPH_P, ALLOCNO_MAY_BE_SPILLED_P, ALLOCNO_COLORABLE_P,
    ALLOCNO_AVAILABLE_REGS_NUM, ALLOCNO_NEXT_BUCKET_ALLOCNO,
    ALLOCNO_PREV_BUCKET_ALLOCNO. ALLOCNO_TEMP. Use OBJECT_COLOR_DATA
    instead of OBJECT_PROFITABLE_HARD_REGS, OBJECT_HARD_REGS_NODE,
    OBJECT_HARD_REGS_SUBNODES_START, OBJECT_HARD_REGS_SUBNODES_NUM.
    Fix formatting.
    (object_hard_regs_t, object_hard_regs_node_t): Move from
    ira-int.h.
    (struct object_hard_regs, struct object_hard_regs_node): Ditto.
    (struct allocno_color_data): New.
    (allocno_color_data_t): New typedef.
    (allocno_color_data): New definition.
    (ALLOCNO_COLOR_DATA): New macro.
    (struct object_color_data): New.
    (object_color_data_t): New typedef.
    (object_color_data): New definition.
    (OBJECT_COLOR_DATA): New macro.
    (update_copy_costs, calculate_allocno_spill_cost): Call
    ira_init_register_move_cost_if_necessary.  Use
    ira_register_move_cost instead of ira_get_register_move_cost.
    (move_spill_restore, update_curr_costs): Ditto.
    (allocno_spill_priority): Make it inline.
    (color_pass): Allocate and free allocno_color_dat and
    object_color_data.
    (struct coalesce_data, coalesce_data_t): New.
    (allocno_coalesce_data): New definition.
    (ALLOCNO_COALESCE_DATA): New macro.
    (merge_allocnos, coalesced_allocno_conflict_p): Use
    ALLOCNO_COALESCED_DATA instead of ALLOCNO_FIRST_COALESCED_ALLOCNO,
    ALLOCNO_NEXT_COALESCED_ALLOCNO, ALLOCNO_TEMP.
    (coalesce_allocnos): Ditto.
    (setup_coalesced_allocno_costs_and_nums): Ditto.
    (collect_spilled_coalesced_allocnos): Ditto.

Re: RFA: patch to solve IRA PR48336, PR48342, PR48345

2011-03-30 Thread Vladimir Makarov


On 03/30/2011 06:19 AM, Richard Sandiford wrote:

Hi Vlad,

First, I want to echo H-P's thanks for tackling this area.  I just wondered:

Vladimir Makarovvmaka...@redhat.com  writes:

The following patch is to solve PR48336, PR48342, PR48345.  The
profitable hard regs exclude hard regs which are prohibited for the
corresponding allocno mode. It is true for primary allocation and it is
important for better colorability criteria.  Function assign_hard_reg is
also based on this assumption.  Unfortunately, it is not true for
secondary allocation (after IRA IR flattening or during reload).  The
following patch solves this problem.

The patch should be very safe but I am still testing it on x86/x86-64
bootstrap.

[...]

Index: ira-color.c
===
--- ira-color.c (revision 171699)
+++ ira-color.c (working copy)
@@ -1447,7 +1447,9 @@ update_conflict_hard_regno_costs (int *c
  }

  /* Set up conflicting and profitable regs (through CONFLICT_REGS and
-   PROFITABLE_REGS) for each object of allocno A.  */
+   PROFITABLE_REGS) for each object of allocno A.  Remember that the
+   profitable regs exclude hard regs which can not hold value of mode
+   of allocno A.  */
  static inline void
  setup_conflict_profitable_regs (ira_allocno_t a, bool retry_p,
HARD_REG_SET *conflict_regs,
@@ -1463,8 +1465,13 @@ setup_conflict_profitable_regs (ira_allo
COPY_HARD_REG_SET (conflict_regs[i],
 OBJECT_TOTAL_CONFLICT_HARD_REGS (obj));
if (retry_p)
-   COPY_HARD_REG_SET (profitable_regs[i],
-  reg_class_contents[ALLOCNO_CLASS (a)]);
+   {
+ COPY_HARD_REG_SET (profitable_regs[i],
+reg_class_contents[ALLOCNO_CLASS (a)]);
+ AND_COMPL_HARD_REG_SET (profitable_regs[i],
+ ira_prohibited_class_mode_regs
+ [ALLOCNO_CLASS (a)][ALLOCNO_MODE (a)]);
+   }
else
COPY_HARD_REG_SET (profitable_regs[i],
   OBJECT_COLOR_DATA (obj)-profitable_hard_regs);

ira_prohibited_class_mode_regs is partly based on HARD_REGNO_MODE_OK,
which is really a property of the first register in a multi-register
group (rather than of every register in that group).  So is
ira_prohibited_class_mode_regs defined in the same way?
At the moment, check_hard_reg_p and setup_allocno_available_regs_num
test profitability for every register in the group:

/* Checking only profitable hard regs.  */
if (TEST_HARD_REG_BIT (conflict_regs[k], hard_regno + j)
|| ! TEST_HARD_REG_BIT (profitable_regs[k], hard_regno + j))
  break;
[...]
   for (j = 0; j  nregs; j++)
{
[...]
  /* Checking only profitable hard regs.  */
  if (TEST_HARD_REG_BIT (OBJECT_TOTAL_CONFLICT_HARD_REGS (obj),
 hard_regno + j)
  || ! TEST_HARD_REG_BIT (obj_data-profitable_hard_regs,
  hard_regno + j))
break;

So if you have a target in which double-word values have to start in
even registers, I think every odd-numbered bit of profitable_hard_regs
will be clear, and no register will seem profitable.  (I'm seeing this
on ARM with some VFP tests.)

Restricting the test to the first register fixes things for me,
but do we need to check something else for the j != 0 case?


Richard, thanks very much for pointing this out.  I am agree with you.  
I think your patch is ok.  Although I need some time to check it.  Right 
now I am overwhelmed by # of other bug reports.


Another thing bothering me is a new colorability test for pseudos which 
should start on even or odd hard register.  I have suspicion that it is 
not ok for now.  I need some time to check and think about it.


I hope that I finish my urgent work on bug reports this week and start 
to work on your patch and the another issue on next week.



gcc/
* ira-color.c (check_hard_reg_p): Restrict the profitability check
to the first register.
(setup_allocno_available_regs_num): Likewise.

Index: gcc/ira-color.c
===
--- gcc/ira-color.c (revision 171653)
+++ gcc/ira-color.c (working copy)
@@ -1497,7 +1504,8 @@ check_hard_reg_p (ira_allocno_t a, int h
for (k = set_to_test_start; k  set_to_test_end; k++)
/* Checking only profitable hard regs.  */
if (TEST_HARD_REG_BIT (conflict_regs[k], hard_regno + j)
-   || ! TEST_HARD_REG_BIT (profitable_regs[k], hard_regno + j))
+   || (j == 0
+ ! TEST_HARD_REG_BIT (profitable_regs[k], hard_regno)))
  break;
if (k != set_to_test_end)
break;
@@ -2226,8 +2234,9 @@ setup_allocno_available_regs_num (ira_al
  /* Checking only profitable hard regs.  */

Re: Fix realloc_on_assign_2.f03, random segfaults/ICEs

2011-03-30 Thread Tobias Burnus


On 03/30/2011 06:21 PM, Michael Matz wrote:

Okay for trunk?  (regstrapping on x86_64-linux in progress)


OK. Thanks for tracing the bug. I think the issue could be the reason 
for the elusive PR 47516 - thus, you might consider adding that PR to 
the ChangeLog and close the PR afterwards.


Tobias

Re: [Patch] Bfin: Ensure rotrsi and rotlsi don't accept non-const INTVALS

2011-03-30 Thread Richard Henderson

On 03/29/2011 08:49 AM, Henderson, Stuart wrote:
(match_operand:SI 2 immediate_operand )))]

  {
 -  if (INTVAL (operands[2]) != 16)
 +  if (GET_CODE (operands[2]) != CONST_INT || INTVAL (operands[2]) != 16)
  FAIL;

Perhaps use const_int_operand instead of immediate_operand.


r~

Re: [Patch] improve bfin conditional move support

2011-03-30 Thread Richard Henderson

On 03/29/2011 05:57 AM, Henderson, Stuart wrote:
 -  operands[1] = bfin_gen_compare (operands[1], SImode);
 +  operands[1] = bfin_gen_compare (operands[1], GET_MODE (operands[0]));

FWIW, you can use MODEmode to get the proper value
without having to read it from the operand.


r~

Re: Fix realloc_on_assign_2.f03, random segfaults/ICEs

2011-03-30 Thread Michael Matz

Hi,

On Wed, 30 Mar 2011, Tobias Burnus wrote:

 On 03/30/2011 06:21 PM, Michael Matz wrote:
  Okay for trunk?  (regstrapping on x86_64-linux in progress)
 
 OK. Thanks for tracing the bug. I think the issue could be the reason for the
 elusive PR 47516 - thus, you might consider adding that PR to the ChangeLog
 and close the PR afterwards.

(r171736) Hmm, it seems to also be a problem for 4.6.0.  Should I merge it 
to the 4.6 branch too?


Ciao,
Michael.

Re: Fix realloc_on_assign_2.f03, random segfaults/ICEs

2011-03-30 Thread Tobias Burnus


On 03/30/2011 07:33 PM, Michael Matz wrote:

Hi,

On Wed, 30 Mar 2011, Tobias Burnus wrote:


On 03/30/2011 06:21 PM, Michael Matz wrote:

Okay for trunk?  (regstrapping on x86_64-linux in progress)

OK. Thanks for tracing the bug. I think the issue could be the reason for the
elusive PR 47516 - thus, you might consider adding that PR to the ChangeLog
and close the PR afterwards.

(r171736) Hmm, it seems to also be a problem for 4.6.0.  Should I merge it
to the 4.6 branch too?


That's fine with me - the fix is really simple (after one found the 
cause). Fortunately, the bug only rarely surfaces.


Tobias

libgo patch committed: Add missing Makefile dependencies

2011-03-30 Thread Ian Lance Taylor

I committed this patch to libgo to add some missing Makefile
dependencies.  Bootstrapped on x86_64-unknown-linux-gnu.  Committed to
mainline.

Ian

diff -r 58ba48318471 libgo/Makefile.am
--- a/libgo/Makefile.am	Wed Mar 30 08:17:36 2011 -0700
+++ b/libgo/Makefile.am	Wed Mar 30 10:35:55 2011 -0700
@@ -1705,8 +1705,9 @@
 	$(CHECK)
 .PHONY: mime/check
 
-net/net.lo: $(go_net_files) bytes.gox fmt.gox io.gox os.gox reflect.gox \
-		strconv.gox strings.gox sync.gox syscall.gox
+net/net.lo: $(go_net_files) bytes.gox fmt.gox io.gox os.gox rand.gox \
+		reflect.gox strconv.gox strings.gox sync.gox syscall.gox \
+		time.gox
 	$(BUILDPACKAGE)
 net/check: $(CHECK_DEPS)
 	$(CHECK)

C++ PATCH for c++/48281 (ICE with nested initializer_list)

2011-03-30 Thread Jason Merrill

The testcase in 48281 started breaking in 4.6.0 because of my change to 
finish_compound_literal to stop putting constant compound literals into 
static variables, because it was interfering with constexpr.


First I noticed that the crash was due to non-constant CONSTRUCTORs with 
TREE_CONSTANT set.  So I fixed that, in constructor-const.patch.


That change fixed the crash, but then we were just optimizing away the 
entire contents of the lists due to a bug I just filed as PR 48370.


Fixing the bug in general will mean fixing 48370, but we can fix the 
regression by just reverting the change above; I dealt with a similar 
issue elsewhere by setting DECL_DECLARED_CONSTEXPR_P on the temporary 
variable so that constexpr evaluation can look at its initializer, so we 
can use the same approach here.  I'm only doing this for arrays now so 
that we can continue to elide copies of class temporaries.


While I was looking at this stuff, I decided to go ahead and initialize 
the initializer_list objects directly rather than mess with calling the 
constructor.


Tested x86_64-pc-linux-gnu, applying to trunk (all three patches) and 
4.6 (only the first patch).
commit 15ce1df0c9f6259babc885e8947719a586a47103
Author: Jason Merrill ja...@redhat.com
Date:   Wed Mar 30 11:57:50 2011 -0400

PR c++/48281
* semantics.c (finish_compound_literal): Do put static/constant
arrays in static variables.

diff --git a/gcc/cp/semantics.c b/gcc/cp/semantics.c
index 9926d26..b88e190 100644
--- a/gcc/cp/semantics.c
+++ b/gcc/cp/semantics.c
@@ -2329,7 +2329,34 @@ finish_compound_literal (tree type, tree 
compound_literal)
   if (TREE_CODE (type) == ARRAY_TYPE)
 cp_complete_array_type (type, compound_literal, false);
   compound_literal = digest_init (type, compound_literal);
-  return get_target_expr (compound_literal);
+  /* Put static/constant array temporaries in static variables, but always
+ represent class temporaries with TARGET_EXPR so we elide copies.  */
+  if ((!at_function_scope_p () || CP_TYPE_CONST_P (type))
+   TREE_CODE (type) == ARRAY_TYPE
+   initializer_constant_valid_p (compound_literal, type))
+{
+  tree decl = create_temporary_var (type);
+  DECL_INITIAL (decl) = compound_literal;
+  TREE_STATIC (decl) = 1;
+  if (literal_type_p (type)  CP_TYPE_CONST_NON_VOLATILE_P (type))
+   {
+ /* 5.19 says that a constant expression can include an
+lvalue-rvalue conversion applied to a glvalue of literal type
+that refers to a non-volatile temporary object initialized
+with a constant expression.  Rather than try to communicate
+that this VAR_DECL is a temporary, just mark it constexpr.  */
+ DECL_DECLARED_CONSTEXPR_P (decl) = true;
+ DECL_INITIALIZED_BY_CONSTANT_EXPRESSION_P (decl) = true;
+ TREE_CONSTANT (decl) = true;
+   }
+  cp_apply_type_quals_to_decl (cp_type_quals (type), decl);
+  decl = pushdecl_top_level (decl);
+  DECL_NAME (decl) = make_anon_name ();
+  SET_DECL_ASSEMBLER_NAME (decl, DECL_NAME (decl));
+  return decl;
+}
+  else
+return get_target_expr (compound_literal);
 }
 
 /* Return the declaration for the function-name variable indicated by
diff --git a/gcc/testsuite/g++.dg/cpp0x/initlist46.C 
b/gcc/testsuite/g++.dg/cpp0x/initlist46.C
new file mode 100644
index 000..2b9f07d
--- /dev/null
+++ b/gcc/testsuite/g++.dg/cpp0x/initlist46.C
@@ -0,0 +1,14 @@
+// PR c++/48281
+// { dg-options -std=c++0x -O2 }
+// { dg-do run }
+
+#include initializer_list
+
+typedef std::initializer_listint  int1;
+typedef std::initializer_listint1 int2;
+static int2 ib = {{42,2,3,4,5},{2,3,4,5,1},{3,4,5,2,1}};
+
+int main()
+{
+  return *(ib.begin()-begin()) != 42;
+}
commit 4069e2ded787088dcf1d4caf1aeccd26e00524c0
Author: Jason Merrill ja...@redhat.com
Date:   Wed Mar 30 10:34:09 2011 -0400

* call.c (convert_like_real) [ck_list]: Build up the
initializer_list object directly.
* decl.c (build_init_list_var_init): Adjust.

diff --git a/gcc/cp/call.c b/gcc/cp/call.c
index f7d108f..ad2de43 100644
--- a/gcc/cp/call.c
+++ b/gcc/cp/call.c
@@ -5467,8 +5467,8 @@ convert_like_real (conversion *convs, tree expr, tree fn, 
int argnum,
tree elttype = TREE_VEC_ELT (CLASSTYPE_TI_ARGS (totype), 0);
tree new_ctor = build_constructor (init_list_type_node, NULL);
unsigned len = CONSTRUCTOR_NELTS (expr);
-   tree array, val;
-   VEC(tree,gc) *parms;
+   tree array, val, field;
+   VEC(constructor_elt,gc) *vec = NULL;
unsigned ix;
 
/* Convert all the elements.  */
@@ -5490,16 +5490,14 @@ convert_like_real (conversion *convs, tree expr, tree 
fn, int argnum,
array = build_array_of_n_type (elttype, len);
array = finish_compound_literal (array, new_ctor);
 
-   parms = make_tree_vector ();
-   VEC_safe_push (tree, gc, parms, decay_conversion (array));
-   VEC_safe_push

RFC: PATCH: Remove 26 element limit in vector

2011-03-30 Thread H.J. Lu

On Wed, Mar 30, 2011 at 08:02:38AM -0700, H.J. Lu wrote:
 Hi,
 
 Currently, we limit XVECEXP to 26 elements in machine description
 since we use letters 'a' to 'z' to encode them.  I don't see any
 reason why we can't go beyond 'z'.  This patch removes this restriction.
 Any comments?
 

That was wrong.  The problem is in vector elements.  This patch passes
bootstrap.  Any comments?

Thanks.


H.J.
---
2011-03-30  H.J. Lu  hongjiu...@intel.com

* genrecog.c (VECTOR_ELEMENT_BASE): New.
(add_to_sequence): Add assert.  Use VECTOR_ELEMENT_BASE to
encode vector elements.
(change_state): Check and support VECTOR_ELEMENT_BASE.
(make_insn_sequence): Add assert.

diff --git a/gcc/genrecog.c b/gcc/genrecog.c
index 74dd0a7..40e9c4d 100644
--- a/gcc/genrecog.c
+++ b/gcc/genrecog.c
@@ -465,6 +465,9 @@ extern void debug_decision
   (struct decision *);
 extern void debug_decision_list
   (struct decision *);
+
+/* The base of vector element.  */
+#define VECTOR_ELEMENT_BASE 0x80
 
 /* Create a new node in sequence after LAST.  */
 
@@ -912,6 +915,7 @@ add_to_sequence (rtx pattern, struct decision_head *last, 
const char *position,
  /* Which insn we're looking at is represented by A-Z. We don't
 ever use 'A', however; it is always implied.  */
 
+ gcc_assert (i  26);
  subpos[depth] = (i  0 ? 'A' + i : 0);
  sub = add_to_sequence (XVECEXP (pattern, 0, i),
 last, subpos, insn_type, 0);
@@ -1002,6 +1006,9 @@ add_to_sequence (rtx pattern, struct decision_head *last, 
const char *position,
char base = (was_code == MATCH_OPERATOR ? '0' : 'a');
for (i = 0; i  (size_t) XVECLEN (pattern, 2); i++)
  {
+   gcc_assert (was_code == MATCH_OPERATOR
+   ? ISDIGIT (i + base)
+   : ISLOWER (i + base));
subpos[depth] = i + base;
sub = add_to_sequence (XVECEXP (pattern, 2, i),
   sub-success, subpos, insn_type, 0);
@@ -1102,7 +1109,9 @@ add_to_sequence (rtx pattern, struct decision_head *last, 
const char *position,
int j;
for (j = 0; j  XVECLEN (pattern, i); j++)
  {
-   subpos[depth] = 'a' + j;
+   int val = j + VECTOR_ELEMENT_BASE;
+   gcc_assert (val = UCHAR_MAX);
+   subpos[depth] = val;
sub = add_to_sequence (XVECEXP (pattern, i, j),
   sub-success, subpos, insn_type, 0);
  }
@@ -1779,6 +1788,10 @@ change_state (const char *oldpos, const char *newpos, 
const char *indent)
   else if (ISLOWER (newpos[depth]))
printf (%sx%d = XVECEXP (x%d, 0, %d);\n,
indent, depth + 1, depth, newpos[depth] - 'a');
+  else if (((unsigned char) newpos[depth]) = VECTOR_ELEMENT_BASE)
+   printf (%sx%d = XVECEXP (x%d, 0, %d);\n,
+   indent, depth + 1, depth,
+   ((unsigned char) newpos[depth]) - VECTOR_ELEMENT_BASE);
   else
printf (%sx%d = XEXP (x%d, %c);\n,
indent, depth + 1, depth, newpos[depth]);
@@ -2528,6 +2541,7 @@ make_insn_sequence (rtx insn, enum routine_type type)
}
   XVECLEN (x, 0) = j;
 
+  gcc_assert ((j - 1)  26);
   c_test_pos[0] = 'A' + j - 1;
   c_test_pos[1] = '\0';
 }

[PATCH] PR debug/47471 (set prologue_end in .debug_line)

2011-03-30 Thread Dodji Seketeli

Hello,

This is about the line program emitted by the compiler into the
.debug_line section, without optimization.  In the example
accompanying the patch below, at the beginning of the function f, the
compiler emits two .loc asm directives that are identical.  The first
one is right before the .cfi_startproc that starts the prologue.  The
second one is before the instructions that copy the variable length
parameters into f's frame.  Both directives do locate instructions
that are in the prologue.

Unfortunately, GDB uses an heuristic that basically considers that the
first opcode of the line program that increments the line register
(even with an increment of zero) marks the end of the prologue.

Effectively, setting a breakpoint to the beginning of f (e.g, break
f) won't work anymore when we emit this because GDB would then try to
set the breakpoint inside the prologue.

This patch does two things.

First, it avoids emitting two consecutive .loc that are identical.
Strictly speaking that should fix this issue in this particular case.

Second, it emits a '.loc filenum linenum 0 prologue_end' directive
on the first instruction that doesn't belong to the prologue (i.e,
when the end_prologue debug hook is called).  This sets the
prologue_end register in the .debug_line state machine to true.  After
discussing this with Jan (in CC), it appears that setting the
prologue_end register to true will help GDB to drop the somewhat
non-reliable heuristic it uses to detect the end of the prologue.

I have noticed that the end_prologue debug hook was being called, but
the dwarf back-end hasn't implemented it (on non-vms platforms).  Is
there a particular reason for not implementing this?  Also, I have
noticed that the patch causes the prologue_end directive to be emitted
even when we compile with optimizations.  I am not sure how right that
would be.

Tested on x86_64-unknown-linux-gnu against trunk.

-- 
Dodji

From dc3dea429f1471540867a0b7e694a60494062ac0 Mon Sep 17 00:00:00 2001
From: Dodji Seketeli do...@redhat.com
Date: Tue, 29 Mar 2011 16:56:20 +0200
Subject: [PATCH] PR debug/47471 (set prologue_end in .debug_line)

gcc/
* dwarf2out.c (output_source_line_asm_info):  Split out of
dwarf2out_source_line.  Add a new is_prologue_end parameter.
Avoid emitting redundant consecutive .loc asm directives.
(dwarf2out_source_line):  Use output_source_line_asm.
(dwarf2out_end_prologue): New function.
(dwarf2_debug_hooks-end_prologue): Set to dwarf2out_end_prologue.

gcc/testsuite/
* gcc.dg/debug/dwarf2/line-prog-1.c: New test.
---
 gcc/dwarf2out.c |   95 +--
 1 files changed, 78 insertions(+), 17 deletions(-)

diff --git a/gcc/dwarf2out.c b/gcc/dwarf2out.c
index efd30ea..6f8285c 100644
--- a/gcc/dwarf2out.c
+++ b/gcc/dwarf2out.c
@@ -94,6 +94,8 @@ along with GCC; see the file COPYING3.  If not see
 #include tree-flow.h
 #include cfglayout.h
 
+static void output_source_line_asm_info (unsigned int, const char *,
+int, bool, bool);
 static void dwarf2out_source_line (unsigned int, const char *, int, bool);
 static rtx last_var_location_insn;
 
@@ -465,6 +467,7 @@ static void output_call_frame_info (int);
 static void dwarf2out_note_section_used (void);
 static bool clobbers_queued_reg_save (const_rtx);
 static void dwarf2out_frame_debug_expr (rtx, const char *);
+static void dwarf2out_end_prologue (unsigned int, const char*);
 
 /* Support for complex CFA locations.  */
 static void output_cfa_loc (dw_cfi_ref, int);
@@ -4125,6 +4128,21 @@ dwarf2out_begin_prologue (unsigned int line 
ATTRIBUTE_UNUSED,
 }
 }
 
+/* Implementation of the gcc_debug_hooks::end_prologue hook.
+
+   If the underlying assembler supports it, emit a .loc asm directive
+   with a 'end_prologue' argument, effectively marking the point where
+   debugger should set a breakpoint when requested to set one on a
+   function.  */
+
+static void
+dwarf2out_end_prologue (unsigned int line, const char *filename)
+{
+  output_source_line_asm_info (line, filename, 0,
+  /*is_stmt=*/true,
+  /*is_prologue_end*/true);
+}
+
 /* Output a marker (i.e. a label) for the end of the generated code
for a function prologue.  This gets called *after* the prologue code has
been generated.  */
@@ -5767,7 +5785,7 @@ const struct gcc_debug_hooks dwarf2_debug_hooks =
   dwarf2out_vms_end_prologue,
   dwarf2out_vms_begin_epilogue,
 #else
-  debug_nothing_int_charstar,
+  dwarf2out_end_prologue,
   debug_nothing_int_charstar,
 #endif
   dwarf2out_end_epilogue,
@@ -22129,14 +22147,22 @@ dwarf2out_begin_function (tree fun)
 
 /* Output a label to mark the beginning of a source code line entry
and record information relating to this source line, in
-   'line_info_table' for later output of the .debug_line section.  */
+   'line_info_table' for later output

Merge mainline to gccgo branch again

2011-03-30 Thread Ian Lance Taylor

I merged mainline revision 171737 onto the gccgo branch.

Ian

C++ PATCH for c++/48369 (ICE with isnan)

2011-03-30 Thread Jason Merrill


A couple of missed cases.

Tested x86_64-pc-linux-gnu, applying to trunk and 4.6.
commit cea17025eb232f3931ce34e16dedff7fd42d2478
Author: Jason Merrill ja...@redhat.com
Date:   Wed Mar 30 15:17:27 2011 -0400

PR c++/48369
* semantics.c (potential_constant_expression_1): Handle
UNORDERED_EXPR and ORDERED_EXPR.

diff --git a/gcc/cp/semantics.c b/gcc/cp/semantics.c
index b88e190..48dd4ee 100644
--- a/gcc/cp/semantics.c
+++ b/gcc/cp/semantics.c
@@ -7741,6 +7741,8 @@ potential_constant_expression_1 (tree t, bool want_rval, 
tsubst_flags_t flags)
 case BIT_XOR_EXPR:
 case BIT_AND_EXPR:
 case TRUTH_XOR_EXPR:
+case UNORDERED_EXPR:
+case ORDERED_EXPR:
 case UNLT_EXPR:
 case UNLE_EXPR:
 case UNGT_EXPR:
diff --git a/gcc/testsuite/g++.dg/cpp0x/regress/isnan.C 
b/gcc/testsuite/g++.dg/cpp0x/regress/isnan.C
new file mode 100644
index 000..40d07e5
--- /dev/null
+++ b/gcc/testsuite/g++.dg/cpp0x/regress/isnan.C
@@ -0,0 +1,9 @@
+// PR c++/48369
+// { dg-options -std=gnu++0x }
+
+extern C int isnan (double);
+
+void f(double d)
+{
+bool b = isnan(d);
+}

[PATCH] Fix VTA updating in the combiner (PR debug/48343)

2011-03-30 Thread Jakub Jelinek

Hi!

We ICE on the attached testcase, because combiner changes mode of
a pseudo (which was set just once and used once plus in debug insns)
from SImode to QImode, but the uses in debug insns aren't adjusted.

Combiner has code to adjust this too (propagate_for_debug), but only updates
debug insns between i2 and i3 (resp. i3 and undobuf.other_insn).
The problem on the testcase is that this is a retry, so first
try_combine with a later i3 calls propagate_for_debug and changes debug
insns before that later i3, then returns an earlier insn that should be
retried and we stop adjusting debug insns at that earlier i3.  Unfortunately
as later debug insns have been already updated earlier, they need to be
adjusted too.

The following patch fixes that by always stopping on the latest i3 that has
been successfully combined into in the current bb, bootstrapped/regtested
on x86_64-linux and i686-linux, ok for trunk/4.6?

2011-03-30  Jakub Jelinek  ja...@redhat.com

PR debug/48343
* combine.c (last_combined_insn): New variable.
(combine_instructions): Clear it before processing each bb
and after last bb.
(try_combine): Set it to i3 if i3 is after it.
Use it as the last argument to propagate_for_debug instead
of i3.
(distribute_notes): Assert SET_INSN_DELETE is not called
on last_combined_insn.

* gcc.dg/torture/pr48343.c: New test.

--- gcc/combine.c.jj2011-03-23 09:34:40.0 +0100
+++ gcc/combine.c   2011-03-30 17:39:09.0 +0200
@@ -337,6 +337,9 @@ static enum machine_mode nonzero_bits_mo
 
 static int nonzero_sign_valid;
 
+/* Last insn on which try_combine has been called in the current BB.  */
+
+static rtx last_combined_insn;
 
 /* Record one modification to rtl structure
to be undone by storing old_contents into *where.  */
@@ -1168,6 +1171,7 @@ combine_instructions (rtx f, unsigned in
  || single_pred (this_basic_block) != last_bb)
label_tick_ebb_start = label_tick;
   last_bb = this_basic_block;
+  last_combined_insn = NULL_RTX;
 
   rtl_profile_for_bb (this_basic_block);
   for (insn = BB_HEAD (this_basic_block);
@@ -1379,6 +1383,7 @@ combine_instructions (rtx f, unsigned in
}
 }
 
+  last_combined_insn = NULL_RTX;
   default_rtl_profile ();
   clear_log_links ();
   clear_bb_flags ();
@@ -3830,6 +3835,10 @@ try_combine (rtx i3, rtx i2, rtx i1, rtx
   return 0;
 }
 
+  if (last_combined_insn == NULL_RTX
+  || DF_INSN_LUID (last_combined_insn)  DF_INSN_LUID (i3))
+last_combined_insn = i3;
+
   if (MAY_HAVE_DEBUG_INSNS)
 {
   struct undo *undo;
@@ -3851,7 +3860,7 @@ try_combine (rtx i3, rtx i2, rtx i1, rtx
   i2src while its original mode is temporarily
   restored, and then clear i2scratch so that we don't
   do it again later.  */
-   propagate_for_debug (i2, i3, reg, i2src);
+   propagate_for_debug (i2, last_combined_insn, reg, i2src);
i2scratch = false;
/* Put back the new mode.  */
adjust_reg_mode (reg, new_mode);
@@ -3859,18 +3868,17 @@ try_combine (rtx i3, rtx i2, rtx i1, rtx
else
  {
rtx tempreg = gen_raw_REG (old_mode, REGNO (reg));
-   rtx first, last;
+   rtx first;
 
if (reg == i2dest)
- {
-   first = i2;
-   last = i3;
- }
+ first = i2;
else
  {
first = i3;
-   last = undobuf.other_insn;
-   gcc_assert (last);
+   gcc_assert (undobuf.other_insn);
+   if (DF_INSN_LUID (undobuf.other_insn)
+DF_INSN_LUID (last_combined_insn))
+ last_combined_insn = undobuf.other_insn;
  }
 
/* We're dealing with a reg that changed mode but not
@@ -3882,9 +3890,9 @@ try_combine (rtx i3, rtx i2, rtx i1, rtx
   with this copy we have created; then, replace the
   copy with the SUBREG of the original shared reg,
   once again changed to the new mode.  */
-   propagate_for_debug (first, last, reg, tempreg);
+   propagate_for_debug (first, last_combined_insn, reg, tempreg);
adjust_reg_mode (reg, new_mode);
-   propagate_for_debug (first, last, tempreg,
+   propagate_for_debug (first, last_combined_insn, tempreg,
 lowpart_subreg (old_mode, reg, new_mode));
  }
  }
@@ -4099,14 +4107,14 @@ try_combine (rtx i3, rtx i2, rtx i1, rtx
 if (newi2pat)
   {
if (MAY_HAVE_DEBUG_INSNS  i2scratch)
- propagate_for_debug (i2, i3, i2dest, i2src);
+ propagate_for_debug (i2, last_combined_insn, i2dest, i2src);

[RFC PATCH, go]: Port to ALPHA arch

2011-03-30 Thread Uros Bizjak

Hello!

Attached ports go to ALPHA architecture.

There are however several problems with the build:

a) Bootstrap compare failure in the gcc/go directory due to binutils
bug [1], fixed in latest binutils SVN, use --disable-bootstrap

b) alpha doesn't define struct user_regs_struct from which type
PtraceRegs is derived. I have manually created PtraceRegs from
pt_regs structure and patched generated libgo/sysinfo.go in build
directory after the build broke. However - the comment from sys/user.h
says that this file is for GDB and GDB only...

c) some weird issue with double definition of const _SOCK_NONBLOCK
in gen-sysinfo.go and consequently sysinfo.go.

The test results from make -k check in libgo directory are quite encouraging:

PASS: asn1
PASS: big
PASS: bufio
PASS: bytes
FAIL: cmath
panic: runtime error: integer divide by zero or floating point error
../../../gcc-svn/trunk/libgo/testsuite/gotest: line 356:  1637 Aborted
   ./a.out $@
PASS: ebnf
PASS: exec
PASS: expvar
PASS: flag
FAIL: fmt
mallocs per Sprintf(): 1
mallocs per Sprintf(xxx): 1
mallocs per Sprintf(%x): 3
mallocs per Sprintf(%x %x): 4
panic: runtime error: integer divide by zero or floating point error [recovered]
panic: runtime error: integer divide by zero or floating point error
../../../gcc-svn/trunk/libgo/testsuite/gotest: line 356:  2309 Aborted
   ./a.out $@
PASS: gob
PASS: html
FAIL: http
../../../gcc-svn/trunk/libgo/testsuite/gotest: line 356:  3133 Segmentation faul
t  ./a.out $@
PASS: io
PASS: json
PASS: log
FAIL: math
panic: runtime error: integer divide by zero or floating point error
../../../gcc-svn/trunk/libgo/testsuite/gotest: line 356:  4321 Aborted
   ./a.out $@
PASS: mime
PASS: netchan
PASS: os
PASS: patch
PASS: path
PASS: rand
PASS: reflect
PASS: regexp
FAIL: rpc
2011/03/30 21:31:20 Test RPC server listening on 127.0.0.1:47781
2011/03/30 21:31:20 Test HTTP RPC server listening on 127.0.0.1:59606
2011/03/30 21:31:20 rpc.Serve: accept:accept tcp 127.0.0.1:47781: Resource tempo
rarily unavailable
FAIL: runtime
panic: runtime error: integer divide by zero or floating point error
../../../gcc-svn/trunk/libgo/testsuite/gotest: line 356:  5980 Aborted
   ./a.out $@
PASS: scanner
PASS: smtp
PASS: sort
FAIL: strconv
panic: runtime error: integer divide by zero or floating point error
../../../gcc-svn/trunk/libgo/testsuite/gotest: line 356:  6561 Aborted
   ./a.out $@
PASS: strings
PASS: sync
PASS: tabwriter
PASS: template
PASS: time
PASS: try
PASS: unicode
PASS: utf16
PASS: utf8
FAIL: websocket
2011/03/30 21:32:19 Test WebSocket server listening on 127.0.0.1:33252
--- FAIL: websocket.TestEcho (0.0 seconds)
dialing dial tcp 127.0.0.1:33252: Connection refused
--- FAIL: websocket.TestEchoDraft75 (0.0 seconds)
dialing dial tcp 127.0.0.1:33252: Connection refused
--- FAIL: websocket.TestWithQuery (0.0 seconds)
dialing dial tcp 127.0.0.1:33252: Connection refused
--- FAIL: websocket.TestWithProtocol (0.0 seconds)
dialing dial tcp 127.0.0.1:33252: Connection refused
--- FAIL: websocket.TestHTTP (0.0 seconds)
Get: error http.URLError{Op:Get, URL:http://127.0.0.1:33252/echo;, E
rror:(*net.OpError)(0xf840026d00)}
--- FAIL: websocket.TestHTTPDraft75 (0.0 seconds)
Get: error http.URLError{Op:Get, URL:http://127.0.0.1:33252/echoDraf
t75, Error:(*net.OpError)(0xf840026c40)}
panic: Dial failed: websocket.Dial ws://127.0.0.1:33252/echo: dial tcp 127.0.0.1
:33252: Connection refused
../../../gcc-svn/trunk/libgo/testsuite/gotest: line 356:  8493 Aborted
   ./a.out $@
PASS: xml
PASS: archive/tar
PASS: archive/zip
PASS: compress/bzip2
FAIL: compress/flate
../../../gcc-svn/trunk/libgo/testsuite/gotest: line 356:  9194 Segmentation faul
t  ./a.out $@
PASS: compress/gzip
PASS: compress/lzw
PASS: compress/zlib
PASS: container/heap
PASS: container/list
PASS: container/ring
PASS: container/vector
PASS: crypto/aes
PASS: crypto/block
PASS: crypto/blowfish
PASS: crypto/cipher
PASS: crypto/dsa
PASS: crypto/ecdsa
PASS: crypto/elliptic
PASS: crypto/hmac
PASS: crypto/md4
PASS: crypto/md5
PASS: crypto/ocsp
PASS: crypto/openpgp
PASS: crypto/rand
PASS: crypto/rc4
PASS: crypto/ripemd160
PASS: crypto/rsa
PASS: crypto/sha1
PASS: crypto/sha256
PASS: crypto/sha512
PASS: crypto/subtle
PASS: crypto/tls
PASS: crypto/twofish
PASS: crypto/x509
PASS: crypto/xtea
PASS: crypto/openpgp/armor
PASS: crypto/openpgp/packet
PASS: crypto/openpgp/s2k
PASS: debug/dwarf
PASS: debug/elf
PASS: debug/macho
PASS: debug/pe
PASS: encoding/ascii85
PASS: encoding/base32
PASS: encoding/base64
PASS: encoding/binary
PASS: encoding/git85
PASS: encoding/hex
PASS: encoding/line
PASS: encoding/pem
PASS: exp/datafmt
PASS: exp/draw
PASS: exp/eval
PASS: go/parser
PASS: go/printer
PASS: go/scanner
PASS: go/token
PASS: go/typechecker
PASS: hash/adler32
PASS: hash/crc32
PASS: hash/crc64
PASS: hash/fnv
PASS: http/cgi
PASS: image/png
PASS: index/suffixarray
PASS: io/ioutil
PASS: mime/multipart
PASS: net/textproto
PASS:

Re: [4.7] Make ARM -mhard-float and -msoft-float into proper -mfloat-abi= aliases

2011-03-30 Thread Joseph S. Myers

On Wed, 2 Mar 2011, Richard Earnshaw wrote:

 Could you remove the documentation entries for the hard/soft-float
 aliases please?  They're really only there for legacy reasons.

Is this the change you want here?

2011-03-30  Joseph Myers  jos...@codesourcery.com

* config/arm/arm.opt (mhard-float, msoft-float): Mark
Undocumented.  Remove help text.
* doc/invoke.texi (ARM Options): Don't document -msoft-float and
-mhard-float.

Index: doc/invoke.texi
===
--- doc/invoke.texi (revision 171745)
+++ doc/invoke.texi (working copy)
@@ -454,7 +454,7 @@
 -mapcs-reentrant  -mno-apcs-reentrant @gol
 -msched-prolog  -mno-sched-prolog @gol
 -mlittle-endian  -mbig-endian  -mwords-little-endian @gol
--mfloat-abi=@var{name}  -msoft-float  -mhard-float  -mfpe @gol
+-mfloat-abi=@var{name}  -mfpe @gol
 -mfp16-format=@var{name}
 -mthumb-interwork  -mno-thumb-interwork @gol
 -mcpu=@var{name}  -march=@var{name}  -mfpu=@var{name}  @gol
@@ -10079,14 +10079,6 @@
 compile your entire program with the same ABI, and link with a
 compatible set of libraries.
 
-@item -mhard-float
-@opindex mhard-float
-Equivalent to @option{-mfloat-abi=hard}.
-
-@item -msoft-float
-@opindex msoft-float
-Equivalent to @option{-mfloat-abi=soft}.
-
 @item -mlittle-endian
 @opindex mlittle-endian
 Generate code for a processor running in little-endian mode.  This is
Index: config/arm/arm.opt
===
--- config/arm/arm.opt  (revision 171745)
+++ config/arm/arm.opt  (working copy)
@@ -94,8 +94,7 @@
 Specify the name of the target floating point hardware/format
 
 mhard-float
-Target RejectNegative Alias(mfloat-abi=, hard)
-Alias for -mfloat-abi=hard
+Target RejectNegative Alias(mfloat-abi=, hard) Undocumented
 
 mlittle-endian
 Target Report RejectNegative InverseMask(BIG_END)
@@ -122,8 +121,7 @@
 Do not load the PIC register in function prologues
 
 msoft-float
-Target RejectNegative Alias(mfloat-abi=, soft)
-Alias for -mfloat-abi=soft
+Target RejectNegative Alias(mfloat-abi=, soft) Undocumented
 
 mstructure-size-boundary=
 Target RejectNegative Joined Var(structure_size_string)

-- 
Joseph S. Myers
jos...@codesourcery.com

Re: [RFC PATCH, go]: Port to ALPHA arch - sysinfo.go fixup

2011-03-30 Thread Uros Bizjak

On Wed, Mar 30, 2011 at 9:58 PM, Uros Bizjak ubiz...@gmail.com wrote:
 Hello!

 Attached ports go to ALPHA architecture.

 There are however several problems with the build:

 a) Bootstrap compare failure in the gcc/go directory due to binutils
 bug [1], fixed in latest binutils SVN, use --disable-bootstrap

 b) alpha doesn't define struct user_regs_struct from which type
 PtraceRegs is derived. I have manually created PtraceRegs from
 pt_regs structure and patched generated libgo/sysinfo.go in build
 directory after the build broke. However - the comment from sys/user.h
 says that this file is for GDB and GDB only...

 c) some weird issue with double definition of const _SOCK_NONBLOCK
 in gen-sysinfo.go and consequently sysinfo.go.

d) The definition of type Stat_t struct also includes additional
struct and this confuses compilation.

Attached is a fixup of sysinfo.go that is neded for successful build.
Apply this fixup after build breaks and restart build.

Uros.


sysinfo.go.diff
Description: Binary data

Re: [4.7] Make ARM -mhard-float and -msoft-float into proper -mfloat-abi= aliases

2011-03-30 Thread Richard Earnshaw


On 30 Mar 2011, at 21:11, Joseph S. Myers jos...@codesourcery.com wrote:

 On Wed, 2 Mar 2011, Richard Earnshaw wrote:
 
 Could you remove the documentation entries for the hard/soft-float
 aliases please?  They're really only there for legacy reasons.
 
 Is this the change you want here?
 
 2011-03-30  Joseph Myers  jos...@codesourcery.com
 
* config/arm/arm.opt (mhard-float, msoft-float): Mark
Undocumented.  Remove help text.
* doc/invoke.texi (ARM Options): Don't document -msoft-float and
-mhard-float.
 

Yes, that's great.  Thanks.

R.

Re: [libgo] Improve Solaris 2/SPARC support

2011-03-30 Thread Ian Lance Taylor

Rainer Orth r...@cebitec.uni-bielefeld.de writes:

 2011-03-24  Rainer Orth  r...@cebitec.uni-bielefeld.de

   go:
   * go.test/go-test.exp (go-set-goarch): Use sparc64 for 64-bit SPARC.

 diff -r de1b3baf021b gcc/testsuite/go.test/go-test.exp
 --- a/gcc/testsuite/go.test/go-test.exp   Thu Mar 24 13:19:30 2011 +0100
 +++ b/gcc/testsuite/go.test/go-test.exp   Thu Mar 24 13:22:43 2011 +0100
 @@ -129,7 +129,7 @@
   if [check_effective_target_ilp32] {
   set goarch sparc
   } else {
 - set goarch sparcv9
 + set goarch sparc64
   }
   }
   default {

Thanks, I just committed this patch to mainline.

I'm working on the other ones.

Ian

Go testsuite patch committed: Fix env.go for other architectures

2011-03-30 Thread Ian Lance Taylor

I brought over this patch to the Go testsuite from the master
repository.  This avoids an explicit list of supported architectures, so
that we don't have to add each gcc architecture to the list.  Ran Go
testsuite on x86_64-unknown-linux-gnu.  Committed to mainline.

Ian

Index: gcc/testsuite/go.test/test/env.go
===
--- gcc/testsuite/go.test/test/env.go	(revision 171697)
+++ gcc/testsuite/go.test/test/env.go	(working copy)
@@ -6,7 +6,10 @@
 
 package main
 
-import os os
+import (
+	os
+	runtime
+)
 
 func main() {
 	ga, e0 := os.Getenverror(GOARCH)
@@ -14,8 +17,8 @@ func main() {
 		print($GOARCH: , e0.String(), \n)
 		os.Exit(1)
 	}
-	if ga != amd64  ga != 386  ga != arm {
-		print($GOARCH=, ga, \n)
+	if ga != runtime.GOARCH {
+		print($GOARCH=, ga, != runtime.GOARCH=, runtime.GOARCH, \n)
 		os.Exit(1)
 	}
 	xxx, e1 := os.Getenverror(DOES_NOT_EXIST)

[pph] Write language specific data (issue4331046)

2011-03-30 Thread Diego Novillo


This patch implements the writing of DECL_LANG_SPECIFIC fields.  It's
needed to write global_namespace.  The reading part is still
incomplete, but I wanted to flush this out before it got too big.

The main changes:

- We can now write out references to TEMPLATE_DECLs.  They are stored
  in the same index table as all other DECLs.  To implement this, I
  added a streamer hook that's called from lto_output_tree_ref.  When
  it does not recognize the decl, it calls the hook which tells it
  whether the decl can be added to the index table.

- We stream out everything in DECL_LANG_SPECIFIC, which in C++ is
  quite a bit.  This will cover global_namespace since it goes 
  through all the data stored for NAMESPACE_DECLs.

- Since DECL_LANG_SPECIFIC can contains tokens, we need to save token
  caches.  Since we are inside a streamer callback, we do not have a
  pointer to a PPH stream, so I've added a pretty hacky way of getting
  back to it.  When we initially open the PPH stream and associate
  with it an output_block object, I take an unused field from
  output_block and use it as my pointer back to the PPH stream.  This
  is awful, but it works for now.  I'll clean this up in a future
  patch.

The next patch will add the reader side, which will allow me to
reconstruct global_namespace.  This will uncover what other tree types
we need to handle.

With this I'm down to two failures in pph.exp.  No changes for
pth.exp.


ChangeLog.pph
2011-03-30  Diego Novillo  dnovi...@google.com

* lto-streamer-in.c (lto_input_chain): Make extern.
* lto-streamer-out.c (lto_output_tree_ref): Call
streamer_hook-indexable_with_decls_p, if it exists.  If it
returns true, emit a reference to EXPR in the VAR_DECL index.
(lto_output_chain): Make extern.
(lto_output_tree_pointers): Move checks for invalid gimple trees ...
* lto-streamer.c (lto_is_streamable): ... here.
* lto-streamer.h (lto_streamer_hooks): Add
indexable_with_decls_p.
(lto_output_chain): Declare.
(lto_input_chain): Declare.


c-family/ChangeLog.pph
2011-03-30  Diego Novillo  dnovi...@google.com

* c-family/c.opt (fpph-fmt): Remove.  Update all users.


cp/ChangeLog.pph
2011-03-30  Diego Novillo  dnovi...@google.com

* cp-tree.h (struct language_function): Add prefix 'x_' to
fields returns_value, returns_null, returns_abnormally,
in_function_try_handler, in_base_initializer.  Update all
users.
* pph-streamer.c (pph_stream_write_ld_base): New.
(pph_stream_write_ld_min): New.
(pph_stream_write_tree_vec): New.
(pph_stream_write_cxx_binding_1): New.
(pph_stream_write_cxx_binding): New.
(pph_stream_write_class_binding): New.
(pph_stream_write_label_binding): New.
(pph_stream_write_binding_level): New.
(pph_stream_write_c_language_function): New.
(pph_stream_write_language_function): New.
(pph_stream_write_ld_fn): New.
(pph_stream_write_ld_ns): New.
(pph_stream_write_ld_parm): New.
(pph_stream_write_lang_specific_data): New.
(pph_indexable_with_decls_p): New.
(pph_stream_hooks_init): Initialize h-indexable_with_decls_p
with pph_indexable_with_decls_p.
(pph_stream_begin_section): Do not free BLOCK.
* pph-streamer.h (pth_save_token_cache): Declare.
(pph_get_ob_stream): New.
(pph_set_ob_stream): New.
* pph.c (pth_save_token_cache): New.
(pph_print_macro_defs_before): Remove.
(pph_print_macro_defs_after): Remove.
(pph_write_namespace): Remove.
(pph_write_format): Remove.
(pph_write_print): Remove.
(pph_write_dump): Remove.
(pph_write_symbol): Remove.
(declvisitor): Remove.
(pph_write_namespace_1): Remove.
(pph_write_namespace): Remove.
(pph_write_file_contents): Rename from pph_write_file_object.
Output global_namespace.
(pph_write_file): Call it.
(pph_write_file_summary): Remove.
(pph_read_file_contents): Rename from pph_file_read_object.
(pph_read_file): Rename from pph_file_read.

diff --git a/gcc/c-family/c.opt b/gcc/c-family/c.opt
index 54e7461..12d4f06 100644
--- a/gcc/c-family/c.opt
+++ b/gcc/c-family/c.opt
@@ -933,10 +933,6 @@ fpph-decls=
 C++ Joined RejectNegative UInteger Var(flag_pph_decls)
 -fpph-decls=N   Enable declaration identifier output at level N from PPH 
support
 
-fpph-fmt=
-C++ Joined RejectNegative UInteger Var(flag_pph_fmt)
--fpph-fmt=N   Output format is (0) normal (1) pretty summary (2) dump
-
 fpph-hdr=
 C++ ObjC++ Joined MissingArgError(missing filename after %qs)
 -fpph-hdr=base-name   A mapping from base-name.h to base-name.pph
diff --git a/gcc/cp/cp-tree.h b/gcc/cp/cp-tree.h
index 6ef6e6e..aedcab2 100644
--- a/gcc/cp/cp-tree.h
+++ b/gcc/cp/cp-tree.h
@@ -1012,11 +1012,11 @@ struct GTY(()) language_function {
   tree

[google] Disable guality tests (issue4328047)

2011-03-30 Thread Diego Novillo


The guality tests are failing/passing intermittently.  This confuses
our builders which expect 0 failures and 0 xpasses.  When they work,
it's not that they are working but dejagnu failed to launch gdb:

-
spawn [open ...]
gdb: took too long to attach
testcase [...]/gcc/testsuite/gcc.dg/guality/guality.exp
completed in 17 seconds
-

And when they do execute, they show a bunch of XPASS results:

-
XPASS: gcc.dg/guality/example.c  -O0  execution test
XPASS: gcc.dg/guality/example.c  -O1  execution test
XPASS: gcc.dg/guality/example.c  -O2  execution test
XPASS: gcc.dg/guality/example.c  -Os  execution test
XPASS: gcc.dg/guality/example.c  -O2 -flto -flto-partition=none execution test
XPASS: gcc.dg/guality/example.c  -O2 -flto  execution test
XPASS: gcc.dg/guality/guality.c  -O0  execution test
XPASS: gcc.dg/guality/guality.c  -O1  execution test
XPASS: gcc.dg/guality/guality.c  -O2  execution test
XPASS: gcc.dg/guality/guality.c  -O3 -fomit-frame-pointer  execution test
XPASS: gcc.dg/guality/guality.c  -O3 -g  execution test
XPASS: gcc.dg/guality/guality.c  -Os  execution test
XPASS: gcc.dg/guality/guality.c  -O2 -flto -flto-partition=none execution test
XPASS: gcc.dg/guality/guality.c  -O2 -flto  execution test
XPASS: gcc.dg/guality/inline-params.c  -O2  execution test
XPASS: gcc.dg/guality/inline-params.c  -O3 -fomit-frame-pointer execution test
XPASS: gcc.dg/guality/inline-params.c  -O3 -g  execution test
XPASS: gcc.dg/guality/inline-params.c  -Os  execution test
XPASS: gcc.dg/guality/pr41353-1.c  -O0  line 28 j == 28 + 37
XPASS: gcc.dg/guality/pr41353-1.c  -O1  line 28 j == 28 + 37
XPASS: gcc.dg/guality/pr41353-1.c  -O2  line 28 j == 28 + 37
XPASS: gcc.dg/guality/pr41353-1.c  -O3 -fomit-frame-pointer  line 28 j == 28 + 
37
XPASS: gcc.dg/guality/pr41353-1.c  -O3 -g  line 28 j == 28 + 37
XPASS: gcc.dg/guality/pr41353-1.c  -Os  line 28 j == 28 + 37
XPASS: gcc.dg/guality/pr41353-1.c  -O2 -flto -flto-partition=none line 28 j == 
28 + 37
XPASS: gcc.dg/guality/pr41353-1.c  -O2 -flto  line 28 j == 28 + 37
XPASS: gcc.dg/guality/pr41447-1.c  -O0  execution test
XPASS: gcc.dg/guality/pr41447-1.c  -O1  execution test
XPASS: gcc.dg/guality/pr41447-1.c  -O2  execution test
XPASS: gcc.dg/guality/pr41447-1.c  -O3 -fomit-frame-pointer  execution test
XPASS: gcc.dg/guality/pr41447-1.c  -O3 -g  execution test
XPASS: gcc.dg/guality/pr41447-1.c  -Os  execution test
XPASS: gcc.dg/guality/pr41616-1.c  -O0  execution test
XPASS: gcc.dg/guality/pr41616-1.c  -O1  execution test
XPASS: gcc.dg/guality/pr41616-1.c  -O2  execution test
XPASS: gcc.dg/guality/pr41616-1.c  -O3 -fomit-frame-pointer  execution test
XPASS: gcc.dg/guality/pr41616-1.c  -O3 -g  execution test
XPASS: gcc.dg/guality/pr41616-1.c  -Os  execution test
XPASS: gcc.dg/guality/sra-1.c  -O0  line 43 a.j == 14
XPASS: gcc.dg/guality/sra-1.c  -O1  line 43 a.j == 14
XPASS: gcc.dg/guality/sra-1.c  -O2  line 43 a.j == 14
XPASS: gcc.dg/guality/sra-1.c  -O3 -fomit-frame-pointer  line 43 a.j == 14
XPASS: gcc.dg/guality/sra-1.c  -O3 -g  line 43 a.j == 14
XPASS: gcc.dg/guality/sra-1.c  -Os  line 43 a.j == 14
XPASS: gcc.dg/guality/sra-1.c  -O2 -flto -flto-partition=none  line 43 a.j == 14
XPASS: gcc.dg/guality/sra-1.c  -O2 -flto  line 43 a.j == 14
-

Perhaps it's just a matter of removing the XFAILs, but I have seen
spurious FAILs in the past and I'm not sure why they are passing if
they were meant to fail.

Alex, are the above tests supposed to fail or not?  This is on x86_64
with dejagnu combinations unix/-m32 and unix/-m64.

For now, I am disabling these tests in the google branches.

Tested on x86_64.  Committed to google/integration.  Will merge into
the other google branches as well.


2011-03-30  Diego Novillo  dnovi...@google.com

* g++.dg/guality/guality.exp: Disable.
* gcc.dg/guality/guality.exp: Disable.
* gfortran.dg/guality/guality.exp: Disable.

diff --git a/gcc/testsuite/g++.dg/guality/guality.exp 
b/gcc/testsuite/g++.dg/guality/guality.exp
index 9a17850..a07a628 100644
--- a/gcc/testsuite/g++.dg/guality/guality.exp
+++ b/gcc/testsuite/g++.dg/guality/guality.exp
@@ -1,5 +1,8 @@
 # This harness is for tests that should be run at all optimisation levels.
 
+# Disable everywhere.  These tests are very flaky.
+return
+
 load_lib g++-dg.exp
 load_lib gcc-gdb-test.exp
 
diff --git a/gcc/testsuite/gcc.dg/guality/guality.exp 
b/gcc/testsuite/gcc.dg/guality/guality.exp
index 49e2ac5..39bf4e3 100644
--- a/gcc/testsuite/gcc.dg/guality/guality.exp
+++ b/gcc/testsuite/gcc.dg/guality/guality.exp
@@ -1,5 +1,8 @@
 # This harness is for tests that should be run at all optimisation levels.
 
+# Disable everywhere.  These tests are very flaky.
+return
+

Re: [libgo] Improve Solaris 2/SPARC support

2011-03-30 Thread Ian Lance Taylor

Rainer Orth r...@cebitec.uni-bielefeld.de writes:

 * go-test.exp wasn't updated for the change from sparcv9 to sparc64.
   While I still don't agree with the new name, at least the two should
   be consistent.

I committed this earlier today.

 * env.go needs to accept sparc and sparc64.

I fixed this in a different way.

 * Just like 32-bit Solaris 2/x86, 32-bit Solaris 2/SPARC needs to use
   the largefile variants of several functions.  I've not introduced a
   new LIBGO_IS_SOLARIS32 conditional for that, but perhaps one should?

I just committed this one, as appended.

Thanks for sending them.

Ian

diff -r 50a941f17e57 libgo/Makefile.am
--- a/libgo/Makefile.am	Wed Mar 30 10:36:43 2011 -0700
+++ b/libgo/Makefile.am	Wed Mar 30 14:42:46 2011 -0700
@@ -689,8 +689,12 @@
 if LIBGO_IS_386
 go_os_dir_file = go/os/dir_largefile.go
 else
+if LIBGO_IS_SPARC
+go_os_dir_file = go/os/dir_largefile.go
+else
 go_os_dir_file = go/os/dir_regfile.go
 endif
+endif
 else
 if LIBGO_IS_LINUX
 go_os_dir_file = go/os/dir_largefile.go
@@ -1219,16 +1223,21 @@
 syscall_stat_file = syscalls/sysfile_stat_largefile.go
 else # !LIBGO_IS_LINUX
 if LIBGO_IS_SOLARIS
-# FIXME: Same for sparc vs. sparc64.  Introduce new/additional conditional?
 if LIBGO_IS_386
-# Use lseek64 on 386 Solaris.
+# Use lseek64 on 32-bit Solaris/x86.
 syscall_filesize_file = syscalls/sysfile_largefile.go
 syscall_stat_file = syscalls/sysfile_stat_largefile.go
-else # !LIBGO_IS_LINUX  LIBGO_IS_SOLARIS  !LIBGO_IS_386
-# Use lseek on amd64 Solaris.
+else # !LIBGO_IS_386
+if LIBGO_IS_SPARC
+# Use lseek64 on 32-bit Solaris/SPARC.
+syscall_filesize_file = syscalls/sysfile_largefile.go
+syscall_stat_file = syscalls/sysfile_stat_largefile.go
+else # !LIBGO_IS_386  !LIBGO_IS_SPARC
+# Use lseek on 64-bit Solaris.
 syscall_filesize_file = syscalls/sysfile_regfile.go
 syscall_stat_file = syscalls/sysfile_stat_regfile.go
-endif # !LIBGO_IS_386
+endif # !LIBGO_IS_386  !LIBGO_IS_SPARC
+endif # !LIBGO_IS_SOLARIS
 else # !LIBGO_IS_LINUX  !LIBGO_IS_SOLARIS
 # Use lseek by default.
 syscall_filesize_file = syscalls/sysfile_regfile.go

Re: [libgo] Provide strerror_r replacement (PR go/47515)

2011-03-30 Thread Ian Lance Taylor

Rainer Orth r...@cebitec.uni-bielefeld.de writes:

 Apart from the lack of wait4, libgo on IRIX 6.5 contained an undefined
 reference to strerror_r.  This patch provides a replacement, based on
 gnulib/lib/strerror_r.c, but massively simplified.

I addressed this in a different way, with a new variant of
syscall.Errstr written in Go.  Bootstrapped and ran Go testsuite on
x86_64-unknown-linux-gnu.  Committed to mainline.

Ian

diff -r ad3d14acbce1 libgo/Makefile.am
--- a/libgo/Makefile.am	Wed Mar 30 14:45:29 2011 -0700
+++ b/libgo/Makefile.am	Wed Mar 30 15:31:01 2011 -0700
@@ -1264,7 +1264,11 @@
 if LIBGO_IS_RTEMS
 syscall_errstr_file = syscalls/errstr_rtems.go
 else
+if HAVE_STRERROR_R
 syscall_errstr_file = syscalls/errstr.go
+else
+syscall_errstr_file = syscalls/errstr_nor.go
+endif
 endif
 
 # Declare libc_strerror_r which is the Go name for strerror_r.
@@ -1273,7 +1277,7 @@
 syscall_errstr_decl_file = syscalls/errstr_decl_rtems.go
 else
 if LIBGO_IS_LINUX
-# In Linux the POSIX strerror_r is called __xpg_strerror_r.
+# On GNU/Linux the POSIX strerror_r is called __xpg_strerror_r.
 syscall_errstr_decl_file = syscalls/errstr_decl_linux.go
 else
 # On other systems we hope strerror_r is just strerror_r.
diff -r ad3d14acbce1 libgo/configure.ac
--- a/libgo/configure.ac	Wed Mar 30 14:45:29 2011 -0700
+++ b/libgo/configure.ac	Wed Mar 30 15:31:01 2011 -0700
@@ -380,7 +380,9 @@
 
 AC_CHECK_HEADERS(sys/mman.h syscall.h sys/epoll.h sys/ptrace.h sys/syscall.h sys/user.h sys/utsname.h)
 AM_CONDITIONAL(HAVE_SYS_MMAN_H, test $ac_cv_header_sys_mman_h = yes)
-AC_CHECK_FUNCS(srandom random strsignal)
+
+AC_CHECK_FUNCS(srandom random strerror_r strsignal)
+AM_CONDITIONAL(HAVE_STRERROR_R, test $ac_cv_func_strerror_r = yes)
 
 AC_CACHE_CHECK([for __sync_bool_compare_and_swap_4],
 [libgo_cv_func___sync_bool_compare_and_swap_4],
diff -r ad3d14acbce1 libgo/syscalls/errstr_nor.go
--- /dev/null	Thu Jan 01 00:00:00 1970 +
+++ b/libgo/syscalls/errstr_nor.go	Wed Mar 30 15:31:01 2011 -0700
@@ -0,0 +1,32 @@
+// errstr.go -- Error strings when there is no strerror_r.
+
+// Copyright 2011 The Go Authors. All rights reserved.
+// Use of this source code is governed by a BSD-style
+// license that can be found in the LICENSE file.
+
+package syscall
+
+import (
+	sync
+	unsafe
+)
+
+func libc_strerror(int) *byte __asm__ (strerror)
+
+var errstr_lock sync.Mutex
+
+func Errstr(errno int) string {
+	errstr_lock.Lock()
+
+	bp := libc_strerror(errno)
+	b := (*[1000]byte)(unsafe.Pointer(bp))
+	i := 0
+	for b[i] != 0 {
+		i++
+	}
+	s := string(b[:i])
+
+	errstr_lock.Unlock()
+
+	return s
+}

[patch][cprop.c] Clean up hash table building

2011-03-30 Thread Steven Bosscher

Hi,

This is the first cleanup for cprop.c. These cleanups are only
possible now, thanks to splitting the CPROP code out from gcse.c

The first change is to not treat unfolded conditions as constant in
gcse_constant_p(). This never happens because:
- during hash table building any such condition should have been
simplified by any prior pass
- during cprop the expression table is not updated because the
patterns are unshared

This causes no changes in code generation for all cc1-i files on
x86_64-unknown-linux-gnu. I could add an assert to make sure this kind
of condition really never happens, but IMHO it's not worth it and the
bug would be elsewhere anyway.

The second change is to remove oprs_available_p(). After the
gcse_constant_p() cleanup, we do not have to traverse the whole
pattern of a SET candidate for CPROP, because we know that SET_DEST is
a REG and SET_SRC is a REG or a shareable constant. Therefore a simple
check to see if the registers involved were set in the block is
sufficient, and a regset can be used instead of the last_set table.
See reg_available_p().

The third change is to compute_hash_table_work(), which now traverses
the insns in each basic block just once (instead of twice), in reverse
order to record all registers set between BB_END and the current insn.
This change allows further simplify hash_scan_set() which now doesn't
have to look at INSN+1 anymore (saving another half insns stream
traversal by avoiding next_nonnote_nondebug_insn()).

Bootstrapped and tested on x86_64-unknown-linux-gnu. OK?

Ciao!
Steven
* cprop.c: Clean up hash table building.
(reg_avail_info): Remove.
(oprs_available_p): Remove.
(record_last_reg_set_info): Remove.
(record_last_set_info): Remove.
(reg_available_p): New function.
(gcse_constant_p): Do not treat unfolded conditions as constants.
(make_set_regs_unavailable): New function.
(hash_scan_set): Simplify with new reg_available_p.
(compute_hash_table_work): Traverse insns stream only once.
Do not compute reg_avail_info. Traverse insns in reverse order.
Record implicit sets after recording explicit sets from the block.

Index: cprop.c
===
*** cprop.c (revision 171627)
--- cprop.c (working copy)
*** static rtx *implicit_sets;
*** 118,124 
  
  /* Bitmap containing one bit for each register in the program.
 Used when performing GCSE to track which registers have been set since
!the start of the basic block.  */
  static regset reg_set_bitmap;
  
  /* Various variables for statistics gathering.  */
--- 118,124 
  
  /* Bitmap containing one bit for each register in the program.
 Used when performing GCSE to track which registers have been set since
!the start or end of the basic block while traversing that block.  */
  static regset reg_set_bitmap;
  
  /* Various variables for statistics gathering.  */
*** free_gcse_mem (void)
*** 183,261 
FREE_REG_SET (reg_set_bitmap);
  }
  
! struct reg_avail_info
! {
!   basic_block last_bb;
!   int last_set;
! };
! 
! static struct reg_avail_info *reg_avail_info;
! static basic_block current_bb;
! 
! /* Return nonzero if the operands of expression X are unchanged from
!INSN to the end of INSN's basic block.  */
  
  static int
! oprs_available_p (const_rtx x, const_rtx insn)
  {
!   int i, j;
!   enum rtx_code code;
!   const char *fmt;
! 
!   if (x == 0)
! return 1;
! 
!   code = GET_CODE (x);
!   switch (code)
! {
! case REG:
!   {
!   struct reg_avail_info *info = reg_avail_info[REGNO (x)];
! 
!   if (info-last_bb != current_bb)
! return 1;
!   return info-last_set  DF_INSN_LUID (insn);
!   }
! 
! case PRE_DEC:
! case PRE_INC:
! case POST_DEC:
! case POST_INC:
! case PRE_MODIFY:
! case POST_MODIFY:
!   return 0;
! 
! case PC:
! case CC0: /*FIXME*/
! case CONST:
! case CONST_INT:
! case CONST_DOUBLE:
! case CONST_FIXED:
! case CONST_VECTOR:
! case SYMBOL_REF:
! case LABEL_REF:
! case ADDR_VEC:
! case ADDR_DIFF_VEC:
!   return 1;
! 
! default:
!   break;
! }
! 
!   for (i = GET_RTX_LENGTH (code) - 1, fmt = GET_RTX_FORMAT (code); i = 0; 
i--)
! {
!   if (fmt[i] == 'e')
!   {
! if (! oprs_available_p (XEXP (x, i), insn))
!   return 0;
!   }
!   else if (fmt[i] == 'E')
!   for (j = 0; j  XVECLEN (x, i); j++)
! if (! oprs_available_p (XVECEXP (x, i, j), insn))
!   return 0;
! }
! 
!   return 1;
  }
  
  /* Hash a set of register REGNO.
--- 183,195 
FREE_REG_SET (reg_set_bitmap);
  }
  
! /* Return nonzero if register X is unchanged from INSN to the end
!of INSN's basic block.  */
  
  static int
! reg_available_p (const_rtx x, const_rtx insn ATTRIBUTE_UNUSED)
  {
!   return ! REGNO_REG_SET_P

Re: [libgo] Replace wait4 by waitpid (PR go/47515)

2011-03-30 Thread Ian Lance Taylor

Rainer Orth r...@cebitec.uni-bielefeld.de writes:

 Currently, libgo uses wait4 unconditionally, which is missing on IRIX
 6.5.  Fortunately, the rusage * arg is used nowhere, so I've decide to
 replace wait4 by waitpid, which seems to be considerably more portable.

Thanks for the patch, but I wasn't entirely happy with the approach
because it removes the os.Waitmsg.Rusage field.  That field is
inherently system dependent but I would rather not get rid of it.  I
wrote this patch instead, which I hope will also solve the problem.
Bootstrapped and ran Go testsuite on x86_64-unknown-linux-gnu.
Committed to mainline.

Ian

diff -r a602eae61cf4 libgo/Makefile.am
--- a/libgo/Makefile.am	Wed Mar 30 15:33:32 2011 -0700
+++ b/libgo/Makefile.am	Wed Mar 30 16:01:55 2011 -0700
@@ -1246,13 +1246,20 @@
 endif # !LIBGO_IS_LINUX
 
 
-# Define ForkExec, PtraceForkExec, Exec, and Wait4.
+# Define ForkExec, PtraceForkExec, and Exec.
 if LIBGO_IS_RTEMS
 syscall_exec_os_file = syscalls/exec_stubs.go
 else
 syscall_exec_os_file = syscalls/exec.go
 endif
 
+# Define Wait4.
+if HAVE_WAIT4
+syscall_wait_file = syscalls/wait4.go
+else
+syscall_wait_file = syscalls/waitpid.go
+endif
+
 # Define Sleep.
 if LIBGO_IS_RTEMS
 syscall_sleep_file = syscalls/sleep_rtems.go
@@ -1329,6 +1336,7 @@
 	$(syscall_errstr_decl_file) \
 	syscalls/exec_helpers.go \
 	$(syscall_exec_os_file) \
+	$(syscall_wait_file) \
 	$(syscall_filesize_file) \
 	$(syscall_stat_file) \
 	$(syscall_sleep_file) \
diff -r a602eae61cf4 libgo/configure.ac
--- a/libgo/configure.ac	Wed Mar 30 15:33:32 2011 -0700
+++ b/libgo/configure.ac	Wed Mar 30 16:01:55 2011 -0700
@@ -381,8 +381,9 @@
 AC_CHECK_HEADERS(sys/mman.h syscall.h sys/epoll.h sys/ptrace.h sys/syscall.h sys/user.h sys/utsname.h)
 AM_CONDITIONAL(HAVE_SYS_MMAN_H, test $ac_cv_header_sys_mman_h = yes)
 
-AC_CHECK_FUNCS(srandom random strerror_r strsignal)
+AC_CHECK_FUNCS(srandom random strerror_r strsignal wait4)
 AM_CONDITIONAL(HAVE_STRERROR_R, test $ac_cv_func_strerror_r = yes)
+AM_CONDITIONAL(HAVE_WAIT4, test $ac_cv_func_wait4 = yes)
 
 AC_CACHE_CHECK([for __sync_bool_compare_and_swap_4],
 [libgo_cv_func___sync_bool_compare_and_swap_4],
diff -r a602eae61cf4 libgo/mksysinfo.sh
--- a/libgo/mksysinfo.sh	Wed Mar 30 15:33:32 2011 -0700
+++ b/libgo/mksysinfo.sh	Wed Mar 30 16:01:55 2011 -0700
@@ -377,6 +377,8 @@
 nrusage=$nrusage $field;
   done
   echo type Rusage struct {$nrusage }  ${OUT}
+else
+  echo type Rusage struct {}  ${OUT}
 fi
 
 # The utsname struct.
diff -r a602eae61cf4 libgo/syscalls/exec.go
--- a/libgo/syscalls/exec.go	Wed Mar 30 15:33:32 2011 -0700
+++ b/libgo/syscalls/exec.go	Wed Mar 30 16:01:55 2011 -0700
@@ -17,7 +17,6 @@
 func libc_dup2(int, int) int __asm__ (dup2)
 func libc_execve(*byte, **byte, **byte) int __asm__ (execve)
 func libc_sysexit(int) __asm__ (_exit)
-func libc_wait4(Pid_t, *int, int, *Rusage) Pid_t __asm__ (wait4)
 
 // Fork, dup fd onto 0..len(fd), and exec(argv0, argvv, envv) in child.
 // If a dup or exec fails, write the errno int to pipe.
@@ -263,16 +262,3 @@
 	libc_execve(StringBytePtr(argv0), argv_arg[0], envv_arg[0])
 	return GetErrno()
 }
-
-func Wait4(pid int, wstatus *WaitStatus, options int, rusage *Rusage) (wpid int, errno int) {
-	var status int
-	r := libc_wait4(Pid_t(pid), status, options, rusage)
-	wpid = int(r)
-	if r  0 {
-		errno = GetErrno()
-	}
-	if wstatus != nil {
-		*wstatus = WaitStatus(status)
-	}
-	return
-}
diff -r a602eae61cf4 libgo/syscalls/wait4.go
--- /dev/null	Thu Jan 01 00:00:00 1970 +
+++ b/libgo/syscalls/wait4.go	Wed Mar 30 16:01:55 2011 -0700
@@ -0,0 +1,22 @@
+// wait4.go -- Wait4 for systems with wait4.
+
+// Copyright 2011 The Go Authors. All rights reserved.
+// Use of this source code is governed by a BSD-style
+// license that can be found in the LICENSE file.
+
+package syscall
+
+func libc_wait4(Pid_t, *int, int, *Rusage) Pid_t __asm__ (wait4)
+
+func Wait4(pid int, wstatus *WaitStatus, options int, rusage *Rusage) (wpid int, errno int) {
+	var status int
+	r := libc_wait4(Pid_t(pid), status, options, rusage)
+	wpid = int(r)
+	if r  0 {
+		errno = GetErrno()
+	}
+	if wstatus != nil {
+		*wstatus = WaitStatus(status)
+	}
+	return
+}
diff -r a602eae61cf4 libgo/syscalls/waitpid.go
--- /dev/null	Thu Jan 01 00:00:00 1970 +
+++ b/libgo/syscalls/waitpid.go	Wed Mar 30 16:01:55 2011 -0700
@@ -0,0 +1,22 @@
+// waitpid.go -- Wait4 for systems without wait4, but with waitpid.
+
+// Copyright 2011 The Go Authors. All rights reserved.
+// Use of this source code is governed by a BSD-style
+// license that can be found in the LICENSE file.
+
+package syscall
+
+func libc_waitpid(Pid_t, *int, int) Pid_t __asm__ (waitpid)
+
+func Wait4(pid int, wstatus *WaitStatus, options int, rusage *Rusage) (wpid int, errno int) {
+	var status int
+	r := libc_waitpid(Pid_t(pid), status, options)
+	wpid = int(r)
+	if r  0 {
+		errno = GetErrno()
+	}
+	if wstatus != nil {
+		*wstatus = WaitStatus(status)
+	}
+	return
+}

[x86] Fix PR target/48142

2011-03-30 Thread Eric Botcazou

Hi,

this is a regression present for x86-64 on mainline and 4.6 branch with the 
options -Os -mpreferred-stack-boundary=5 -fstack-check -fno-omit-frame-pointer.
This improbable combination of options is necessary because you need to have 
stack checking + stack realignment + !ACCUMULATE_OUTGOING_ARGS.  In this case, 
the DW_CFA_GNU_args_size CFIs must be correct in spite of the frame pointer.

Tested on {i586,x86_64}-suse-linux, OK for mainline and 4.6 branch?


2011-03-30  Eric Botcazou  ebotca...@adacore.com

PR target/48142
* config/i386/i386.c (ix86_adjust_stack_and_probe): Differentiate
frame-related from frame-unrelated adjustments to the stack pointer.


2011-03-30  Eric Botcazou  ebotca...@adacore.com

* g++.dg/other/pr48142.C: New test.


-- 
Eric Botcazou
// PR target/48142
// Testcase by Zdenek Sojka zso...@seznam.cz

// { dg-do run { target i?86-*-* x86_64-*-* } }
// { dg-options -Os -mpreferred-stack-boundary=5 -fstack-check -fno-omit-frame-pointer }

int main()
{
  try { throw 0; }
  catch (...) {}
  return 0;
}
Index: config/i386/i386.c
===
--- config/i386/i386.c	(revision 171716)
+++ config/i386/i386.c	(working copy)
@@ -10006,7 +10006,7 @@ ix86_adjust_stack_and_probe (const HOST_
  probe that many bytes past the specified size to maintain a protection
  area at the botton of the stack.  */
   const int dope = 4 * UNITS_PER_WORD;
-  rtx size_rtx = GEN_INT (size);
+  rtx size_rtx = GEN_INT (size), last;
 
   /* See if we have a constant small number of probes to generate.  If so,
  that's the easy case.  The run-time loop is made up of 11 insns in the
@@ -10046,9 +10046,9 @@ ix86_adjust_stack_and_probe (const HOST_
   emit_stack_probe (stack_pointer_rtx);
 
   /* Adjust back to account for the additional first interval.  */
-  emit_insn (gen_rtx_SET (VOIDmode, stack_pointer_rtx,
-			  plus_constant (stack_pointer_rtx,
-	 PROBE_INTERVAL + dope)));
+  last = emit_insn (gen_rtx_SET (VOIDmode, stack_pointer_rtx,
+ plus_constant (stack_pointer_rtx,
+		PROBE_INTERVAL + dope)));
 }
 
   /* Otherwise, do the same as above, but in a loop.  Note that we must be
@@ -10109,15 +10109,33 @@ ix86_adjust_stack_and_probe (const HOST_
 	}
 
   /* Adjust back to account for the additional first interval.  */
-  emit_insn (gen_rtx_SET (VOIDmode, stack_pointer_rtx,
-			  plus_constant (stack_pointer_rtx,
-	 PROBE_INTERVAL + dope)));
+  last = emit_insn (gen_rtx_SET (VOIDmode, stack_pointer_rtx,
+ plus_constant (stack_pointer_rtx,
+		PROBE_INTERVAL + dope)));
 
   release_scratch_register_on_entry (sr);
 }
 
   gcc_assert (cfun-machine-fs.cfa_reg != stack_pointer_rtx);
-  cfun-machine-fs.sp_offset += size;
+
+  /* Even if the stack pointer isn't the CFA register, we need to correctly
+ describe the adjustments made to it, in particular differentiate the
+ frame-related ones from the frame-unrelated ones.  */
+  if (size  0)
+{
+  rtx expr = gen_rtx_SEQUENCE (VOIDmode, rtvec_alloc (2));
+  XVECEXP (expr, 0, 0)
+	= gen_rtx_SET (VOIDmode, stack_pointer_rtx,
+		   plus_constant (stack_pointer_rtx, -size));
+  XVECEXP (expr, 0, 1)
+	= gen_rtx_SET (VOIDmode, stack_pointer_rtx,
+		   plus_constant (stack_pointer_rtx,
+  PROBE_INTERVAL + dope + size));
+  add_reg_note (last, REG_FRAME_RELATED_EXPR, expr);
+  RTX_FRAME_RELATED_P (last) = 1;
+
+  cfun-machine-fs.sp_offset += size;
+}
 
   /* Make sure nothing is scheduled before we are done.  */
   emit_insn (gen_blockage ());

PR bootstrap/48371

2011-03-30 Thread Jeff Law

-BEGIN PGP SIGNED MESSAGE-
Hash: SHA1


Sigh.  I changed a reg_equiv_address to reg_equiv_mem when I changed
from explicit VEC_index references to macros.  That caused all kinds of
interesting problems for ia32 :-)

I'm also fixing a comment typo that was introduced around the same time.

Installed as obvious.  Fixes 48371.

x86_64-unknown-linux-gnu bootstrapped and regression tested.  i686
bootstrap in progress (and probably won't finish until tomorrow...)


-BEGIN PGP SIGNATURE-
Version: GnuPG v1.4.11 (GNU/Linux)
Comment: Using GnuPG with Fedora - http://enigmail.mozdev.org/

iQEcBAEBAgAGBQJNk8UuAAoJEBRtltQi2kC77goIAISRCD1qP9OA8KS1WPYtocFo
ZdxfOX0RA1tHZclEVPQbqkWaGxXW7jJbf1Wz8rFsMgQpeXv328NNDgfgmLTMhw6o
jxhuOjvNX7rSDYeKGjPBuU+K37co+oVaHi9x8X0ZhOkYUHG9u+hKLlLXUV9h5AGL
0RNY71nfZD4y7y4F6xAqFPGlnh8HCzXTaUbzTXXyEcTo3WH1rWO2HcZhL/5MJ2SR
aaoF0lbIC4GAm2kUXUYeSV1JYlzc7ZUZrnjZwKwFiK4UVcF90nJr6Ji7bGEbuRAN
3D0U7Lmda8tWZ0gzkhWlcwx4Ys0gBBoHJowDFXY9hvkff7FABa+raUsEkyO2rTc=
=dWZc
-END PGP SIGNATURE-
* reload1.c (reload): Fix botch in last change.
* reload.h (struct reload): Fix typo introduced in last change.

Index: reload1.c
===
*** reload1.c   (revision 171731)
--- reload1.c   (working copy)
*** reload (rtx first, int global)
*** 895,901 
  (REGNO (XEXP (XEXP (x, 0), 0))
  FIRST_PSEUDO_REGISTER)
  CONSTANT_P (XEXP (XEXP (x, 0), 1
! reg_equiv_mem (i) = XEXP (x, 0), reg_equiv_mem (i) = 0;
else
  {
/* Make a new stack slot.  Then indicate that something
--- 895,901 
  (REGNO (XEXP (XEXP (x, 0), 0))
  FIRST_PSEUDO_REGISTER)
  CONSTANT_P (XEXP (XEXP (x, 0), 1
! reg_equiv_address (i) = XEXP (x, 0), reg_equiv_mem (i) = 0;
else
  {
/* Make a new stack slot.  Then indicate that something
Index: reload.h
===
*** reload.h(revision 171731)
--- reload.h(working copy)
*** struct reload
*** 100,106 
int inc;
/* A reg for which reload_in is the equivalent.
   If reload_in is a symbol_ref which came from
!  reg_equiv_consant, then this is the pseudo
   which has that symbol_ref as equivalent.  */
rtx in_reg;
rtx out_reg;
--- 100,106 
int inc;
/* A reg for which reload_in is the equivalent.
   If reload_in is a symbol_ref which came from
!  reg_equiv_constant, then this is the pseudo
   which has that symbol_ref as equivalent.  */
rtx in_reg;
rtx out_reg;

Re: Move reg_equiv* arrays into a single VEC structure

2011-03-30 Thread Jeff Law

-BEGIN PGP SIGNED MESSAGE-
Hash: SHA1

On 03/30/11 10:51, Richard Sandiford wrote:
 Nice cleanup thanks.  Just noticed a couple of things:
 
 Jeff Law l...@redhat.com writes:
 *** struct reload
 *** 100,106 
 int inc;
 /* A reg for which reload_in is the equivalent.
If reload_in is a symbol_ref which came from
 !  reg_equiv_constant, then this is the pseudo
which has that symbol_ref as equivalent.  */
 rtx in_reg;
 rtx out_reg;
 --- 100,106 
 int inc;
 /* A reg for which reload_in is the equivalent.
If reload_in is a symbol_ref which came from
 !  reg_equiv_consant, then this is the pseudo
which has that symbol_ref as equivalent.  */
 rtx in_reg;
 rtx out_reg;
 
 Adds typo.
Yea.  I had changed the comment when I was using VEC_blah directly, then
introduced the typo when I went to using accessor macros and wanted to
change the comment back to its original form :(

I just checked in a fix for the typo.

 
 *** elimination_effects (rtx x, enum machine
 *** 3002,3011 
}
   
  }
 !   else if (reg_renumber[regno]  0  reg_equiv_constant
 !reg_equiv_constant[regno]
 !! function_invariant_p (reg_equiv_constant[regno]))
 !elimination_effects (reg_equiv_constant[regno], mem_mode);
 return;
   
   case PRE_INC:
 --- 2996,3006 
}
   
  }
 !   else if (reg_renumber[regno]  0
 !reg_equiv_constant (0)
 !reg_equiv_constant (regno)
 !! function_invariant_p (reg_equiv_constant (regno)))
 !elimination_effects (reg_equiv_constant (regno), mem_mode);
 return;
   
   case PRE_INC:
 
 Looks like this should be s/reg_equiv_constant (0)/reg_equivs != 0/.
I thought I'd fixed these.  I certainly remember looking at them and
thinking that can't be right at some point.  I'll look at it again more
closely tomorrow and take appropriate corrective action.

Thanks for the feedback,
Jeff
-BEGIN PGP SIGNATURE-
Version: GnuPG v1.4.11 (GNU/Linux)
Comment: Using GnuPG with Fedora - http://enigmail.mozdev.org/

iQEcBAEBAgAGBQJNk8ZSAAoJEBRtltQi2kC7+qYH/R0i1/YC3efnLuQjZ0uieuCU
0b/sMDvP+0xngPbKKn9YviBuNhUv2/poPG1OOOQtolHeZ5c8rftecZKMRgjtmX9W
jYxhuY2OLBBTLLPo4eFDdBbEP/m90RGEBtumeIx1isPTOjQ3LVuBF+9GB+wbnr/W
OLox1MSfPT7GV3pbBeSMiHiKkw5VOeFKd4vBIbefWAPgjO0G8LFexBYWkw04j1F5
tXbuLn/cnSb4PLoRgrCxDv4XS8Wzx1YJcFtcnjn+a2t1HPUARkSYCSn8o94Yytub
k263YKwgRsqwxnrXmGght4+K2kuVjitANa6iX24DWTa+eJBruQ9ObA/f3Xs1P5E=
=SgX4
-END PGP SIGNATURE-

Re: Continue toplevel cleanup (GCC library handling for unsupported targets etc.)

2011-03-30 Thread Hans-Peter Nilsson

On Tue, 29 Mar 2011, Joseph S. Myers wrote:
 Other cleanups here:

 cris*-*-none acts just like cris*-*-elf in
 config.gcc so it's appropriate to make the * subcase of cris*-*-* act
 like the -elf case;

'k.

 mmix-*-* disabled libgloss, i.e. libgloss for the
 host, which is never built anyway.

Whoops, my bad.  Odd that it worked anyway, but I'm not certain
I added that due to actual breakage.

All bits fine by me.

brgds, H-P

Random cleanups [2/4]: canonicalize ctor values

2011-03-30 Thread Michael Matz

Hi,

this came up when looking into why the static ctors contain useless trees 
(like casts).  We can simply canonicalize them while varpool analyzes 
pending decls.  It'll look at initialzers once, where we can gimplify 
them.  This requires making canonicalize_constructor_val be able to be 
called outside of functions.  And it requires the java frontend not 
leaving a dangling function decl as current (for the static ctor function 
it generates).

Regstrapped on x86_64-linux with the other three cleanups.  Okay for 
trunk?


Ciao,
Michael.
-- 
* cgraphbuild.c (record_reference): Canonicalize constructor
values.
* gimple-fold.c (canonicalize_constructor_val): Accept being called
without function context.

java/
* jcf-parse.c (java_emit_static_constructor): Clear cfun and
current_function_decl.

Index: cgraphbuild.c
===
--- cgraphbuild.c   (revision 171537)
+++ cgraphbuild.c   (working copy)
@@ -53,6 +53,8 @@ record_reference (tree *tp, int *walk_su
   tree decl;
   struct record_reference_ctx *ctx = (struct record_reference_ctx *)data;
 
+restart:
+
   switch (TREE_CODE (t))
 {
 case VAR_DECL:
@@ -98,6 +100,15 @@ record_reference (tree *tp, int *walk_su
  break;
}
 
+  t = canonicalize_constructor_val (t);
+  if (t  t != *tp)
+   {
+ *tp = t;
+ goto restart;
+   }
+  else
+   t = *tp;
+
   if ((unsigned int) TREE_CODE (t) = LAST_AND_UNUSED_TREE_CODE)
return lang_hooks.callgraph.analyze_expr (tp, walk_subtrees);
   break;
Index: gimple-fold.c
===
--- gimple-fold.c   (revision 171537)
+++ gimple-fold.c   (working copy)
@@ -106,7 +106,7 @@ can_refer_decl_in_current_unit_p (tree d
   return true;
 }
 
-/* CVAL is value taken from DECL_INITIAL of variable.  Try to transorm it into
+/* CVAL is value taken from DECL_INITIAL of variable.  Try to transform it into
acceptable form for is_gimple_min_invariant.   */
 
 tree
@@ -131,7 +131,7 @@ canonicalize_constructor_val (tree cval)
  || TREE_CODE (base) == FUNCTION_DECL)
   !can_refer_decl_in_current_unit_p (base))
return NULL_TREE;
-  if (base  TREE_CODE (base) == VAR_DECL)
+  if (cfun  base  TREE_CODE (base) == VAR_DECL)
add_referenced_var (base);
   /* We never have the chance to fixup types in global initializers
  during gimplification.  Do so here.  */
Index: java/jcf-parse.c
===
--- java/jcf-parse.c.orig   2011-03-26 02:19:03.0 +0100
+++ java/jcf-parse.c2011-03-28 06:16:43.0 +0200
@@ -1725,6 +1725,8 @@ java_emit_static_constructor (void)
   DECL_STATIC_CONSTRUCTOR (decl) = 1;
   java_genericize (decl);
   cgraph_finalize_function (decl, false);
+  current_function_decl = NULL;
+  set_cfun (NULL);
 }
 }

Re: Random cleanups [1/4]

2011-03-30 Thread Diego Novillo

On Wed, Mar 30, 2011 at 21:16, Michael Matz m...@suse.de wrote:

        * tree.c (decl_init_priority_insert): Don't create entry for
        default priority.
        (decl_fini_priority_insert): Ditto.
        (fields_compatible_p, find_compatible_field): Remove.
        * tree.h (fields_compatible_p, find_compatible_field): Remove.
        * gimple.c (gimple_compare_field_offset): Adjust block comment.

OK.


Diego.

Random cleanups [3/4]: zero out DECL_VINDEX field

2011-03-30 Thread Michael Matz

Hi,

I noticed this while working on early-merging LTO.  The DECL_VINDEX slot 
of FUNCTION_DECLs is supposed to hold the numeric index of the vtable 
slot if it's a virtual function.  During parsing the C++ frontend uses it 
to hold a reference to itself, which then later is supposed to be 
transformed into the real index.  Sometimes the C++ frontend copies 
function_decls, leaving the old one reachable via various means (but not 
as a reachable function, merely in scopes and the like), but fixes up 
only the copy.

Instead of diddling with the frontend we can just as well zero out 
meaningless DECL_VINDEX slots (i.e. when they're not an integer).  It's 
used for debug info generation (when it's an integer) and as a flag ala 
this function is a virtual method by the middle-end.

Regstrapped on x86_64-linux with the other three cleanups.  Okay for 
trunk?


Ciao,
Michael.
-- 
* tree.c (free_lang_data_in_decl): Zero DECL_VINDEX if it's not
an integer.
* tree.h (tree_decl_non_common.vindex): Adjust comment.

Index: tree.c
===
--- tree.c  (revision 171537)
+++ tree.c  (working copy)
@@ -4647,6 +4647,13 @@ free_lang_data_in_decl (tree decl)
   RECORD_OR_UNION_TYPE_P
   (DECL_CONTEXT (DECL_ABSTRACT_ORIGIN (decl
DECL_ABSTRACT_ORIGIN (decl) = NULL_TREE;
+
+  /* Sometimes the C++ frontend doesn't manage to transform a temporary
+ DECL_VINDEX referring to itself into a vtable slot number as it
+should.  Happens with functions that are copied and then forgotten
+about.  Just clear it, it won't matter anymore.  */
+  if (DECL_VINDEX (decl)  !host_integerp (DECL_VINDEX (decl), 0))
+   DECL_VINDEX (decl) = NULL_TREE;
 }
   else if (TREE_CODE (decl) == VAR_DECL)
 {
Index: tree.h
===
--- tree.h  (revision 171537)
+++ tree.h  (working copy)
@@ -3228,7 +3233,7 @@ struct GTY(())
   tree arguments;
   /* Almost all FE's use this.  */
   tree result;
-  /* C++ uses this in namespaces.  */
+  /* C++ uses this in namespaces and function_decls.  */
   tree vindex;
 };

Re: Random cleanups [3/4]: zero out DECL_VINDEX field

2011-03-30 Thread Diego Novillo

On Wed, Mar 30, 2011 at 21:32, Michael Matz m...@suse.de wrote:

        * tree.c (free_lang_data_in_decl): Zero DECL_VINDEX if it's not
        an integer.
        * tree.h (tree_decl_non_common.vindex): Adjust comment.

OK.


Diego.

[pph] Split streamer into pph-sreamer-out.c and pph-streamer-in.c (issue4333047)

2011-03-30 Thread Diego Novillo


No functional changes.  Just some refactoring to stop mixing up
reading and writing routines in the same file.

Tested on x86_64.

cp/ChangeLog.pph
2011-03-30  Diego Novillo  dnovi...@google.com

* Make-lang.in (CXX_AND_OBJCXX_OBJS): Add cp/pph-streamer-out.o
and cp/pph-streamer-in.o.
(cp/pph-streamer-out.o): New.
(cp/pph-streamer-in.o): New.
* pph-streamer.c (current_pph_file): Move to cp/pph-streamer-out.c.
(pph_stream_init_write): Likewise.
(pph_stream_write_ld_base): Likewise.
(pph_stream_write_ld_min): Likewise.
(pph_stream_write_tree_vec): Likewise.
(pph_stream_write_cxx_binding_1): Likewise.
(pph_stream_write_cxx_binding): Likewise.
(pph_stream_write_class_binding): Likewise.
(pph_stream_write_label_binding): Likewise.
(pph_stream_write_binding_level): Likewise.
(pph_stream_write_c_language_function): Likewise.
(pph_stream_write_language_function): Likewise.
(pph_stream_write_ld_fn): Likewise.
(pph_stream_write_ld_ns): Likewise.
(pph_stream_write_ld_parm): Likewise.
(pph_stream_write_lang_specific_data): Likewise.
(pph_stream_write_tree): Likewise.
(pph_stream_pack_value_fields): Likewise.
(pph_stream_begin_section): Likewise.
(pph_stream_write): Likewise.
(pph_stream_end_section): Likewise.
(pph_stream_write_header): Likewise.
(pph_stream_write_body): Likewise.
(pp_stream_flush_buffers): New.  Factor out of pph_stream_close.
(pph_get_section_data): Move to cp/pph-streamer-in.c.
(pph_free_section_data): Likewise.
(pph_stream_init_read): Likewise.
(pph_stream_read_tree): Likewise.
(pph_stream_unpack_value_fields): Likewise.
* pph-streamer.h (pph_stream_flush_buffers): Declare.
(pph_stream_init_write): Declare.
(pph_stream_write_tree): Declare.
(pph_stream_pack_value_fields): Declare.
(pph_stream_init_read): Declare.
(pph_stream_read_tree): Declare.
(pph_stream_unpack_value_fields): Declare.
* pph-streamer-out.c: New file.
* pph-streamer-in.c: New file.

diff --git a/gcc/cp/Make-lang.in b/gcc/cp/Make-lang.in
index 70548c6..143df46 100644
--- a/gcc/cp/Make-lang.in
+++ b/gcc/cp/Make-lang.in
@@ -81,7 +81,8 @@ CXX_AND_OBJCXX_OBJS = cp/call.o cp/decl.o cp/expr.o cp/pt.o 
cp/typeck2.o \
  cp/typeck.o cp/cvt.o cp/except.o cp/friend.o cp/init.o cp/method.o \
  cp/search.o cp/semantics.o cp/tree.o cp/repo.o cp/dump.o cp/optimize.o \
  cp/mangle.o cp/cp-objcp-common.o cp/name-lookup.o cp/cxx-pretty-print.o \
- cp/cp-gimplify.o cp/pph.o cp/pph-streamer.o tree-mudflap.o $(CXX_C_OBJS)
+ cp/cp-gimplify.o cp/pph.o cp/pph-streamer.o cp/pph-streamer-out.o \
+ cp/pph-streamer-in.o tree-mudflap.o $(CXX_C_OBJS)
 
 # Language-specific object files for C++.
 CXX_OBJS = cp/cp-lang.o c-family/stub-objc.o $(CXX_AND_OBJCXX_OBJS)
@@ -337,3 +338,11 @@ cp/pph.o: cp/pph.c $(CONFIG_H) $(SYSTEM_H) coretypes.h 
$(CPPLIB_H) \
 cp/pph-streamer.o: cp/pph-streamer.c $(CONFIG_H) $(SYSTEM_H) coretypes.h \
$(TREE_H) tree-pretty-print.h $(LTO_STREAMER_H) $(CXX_PPH_STREAMER_H) \
$(CXX_PPH_H) $(TREE_PASS_H) version.h cppbuiltin.h tree-iterator.h
+cp/pph-streamer-out.o: cp/pph-streamer-out.c $(CONFIG_H) $(SYSTEM_H) \
+   coretypes.h $(TREE_H) tree-pretty-print.h $(LTO_STREAMER_H) \
+   $(CXX_PPH_STREAMER_H) $(CXX_PPH_H) $(TREE_PASS_H) version.h \
+   cppbuiltin.h tree-iterator.h
+cp/pph-streamer-in.o: cp/pph-streamer-in.c $(CONFIG_H) $(SYSTEM_H) \
+   coretypes.h $(TREE_H) tree-pretty-print.h $(LTO_STREAMER_H) \
+   $(CXX_PPH_STREAMER_H) $(CXX_PPH_H) $(TREE_PASS_H) version.h \
+   cppbuiltin.h tree-iterator.h
diff --git a/gcc/cp/pph-streamer-in.c b/gcc/cp/pph-streamer-in.c
new file mode 100644
index 000..aaa0a6c
--- /dev/null
+++ b/gcc/cp/pph-streamer-in.c
@@ -0,0 +1,155 @@
+/* Routines for reading PPH data.
+   Copyright (C) 2011 Free Software Foundation, Inc.
+   Contributed by Diego Novillo dnovi...@google.com.
+
+   This file is part of GCC.
+
+   GCC is free software; you can redistribute it and/or modify it
+   under the terms of the GNU General Public License as published by
+   the Free Software Foundation; either version 3, or (at your option)
+   any later version.
+
+   GCC is distributed in the hope that it will be useful, but
+   WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+   General Public License for more details.
+
+You should have received a copy of the GNU General Public License
+along with GCC; see the file COPYING3.  If not see
+http://www.gnu.org/licenses/.  */
+
+#include config.h
+#include system.h
+#include coretypes.h
+#include tree.h
+#include langhooks.h
+#include tree-iterator.h
+#include tree-pretty-print.h
+#include lto-streamer.h
+#include pph-streamer.h

Random cleanups [4/4]: Streamlining streamer

2011-03-30 Thread Michael Matz

Hi,

I fear I wasn't as thorough in also splitting this one into several 
patches, but the different cleanups are at least mostly in different 
files.  They are:

* lto-lang remembers all builtin decls in a local list, to be returned
  by the getdecls langhook.  But as we have our own write_globals langhook
  this isn't actually called (except by dbxout.c), so there's no point in 
  remembering.

* lto.c:lto_materialize_function has code to read in the function body 
  sections, do something with them in non-wpa mode, and discard them then.
  There's no point in even reading them in in non-wpa mode (except for a 
  dubious error message that rather is worth an assert).

* gimple.c:gimple_type_leader_entry is a fixed length cache for speeding 
  up our type merging machinery.  It can hold references to many meanwhile 
  merged trees, interferring with the wish of free up much memory with a 
  ggc_collect with early-merging LTO.  We can simply make it deletable.

* ipa-inline.c: some tidying in not calling a macro with function call 
  arguments, and calling a costly function only after early-outs.

* lto-streamer-out.c : it writes out and compares strings character by 
  character.  memcmp and output_data_stream work just as well

* lto-streamer: output_unreferenced_globals writes out all global varpool 
  decls.  The reading side simply reads over all of them, and ignores 
  them.  This was supposed to help symbol resolution, and it probably once 
  did.  But since some time we properly emit varpool and cgraph nodes, and 
  references between them, and a proper symtab.  There's no need for 
  emitting these trees again.

* lto-streamer: the following changes the bytecode:
  1: all indices  into the cache are unsigned, hence we should say 
 so, instead of casting casts back and forth
  2: trees are only appended to the cache, when writing out.  When reading
 in we read in all trees in the stream one after the other, also 
 appending to the cache.  References to existing trees _always_ are to 
 - well - existing trees, hence to those already emitted earlier in 
 the stream, i.e. with a smaller offset, and more importantly with a 
 known index even at reader side.

 So, the offset never is used, so remove that and all associated 
 tracking and params.
  3: for the same reason we also don't need to stream the index that new
 trees get in the cache.  They will get exactly the ones they also had 
 when writing out.  We could use it as consistency check, but we 
 stream the expected tree-node for back-references for that already.

 Obviously we do need to stream the index in back references (aka 
 pickled references).

 (the index could change if there's a different set of nodes preloaded 
 into the cache between writing out and reading in.  But that would 
 have much worse problems already, silently overwriting slots with 
 trees from the stream; we should do away with the preloaded nodes,
 and instead rely on type merging to get canonical versions of the 
 builtin trees)

Not streaming offset and index for most trees obviously shortens the 
bytecode somewhat but I don't have statistics on how much.  Not much would 
be my guess.

Regstrapped on x86_64-linux with the other three cleanups.  Okay for 
trunk?


Ciao,
Michael.
-- 
* lto-streamer.h (struct lto_streamer_cache_d): Remove offsets
and next_slot members.
(lto_streamer_cache_insert, lto_streamer_cache_insert_at,
lto_streamer_cache_lookup, lto_streamer_cache_get): Adjust prototypes.
(lto_streamer_cache_append): Declare.
* lto-streamer.c (lto_streamer_cache_add_to_node_array): Use
unsigned index, remove offset parameter, ensure that we append
or update existing entries.
(lto_streamer_cache_insert_1): Use unsigned index, remove offset_p
parameter, update next_slot for append.
(lto_streamer_cache_insert): Use unsigned index, remove offset_p
parameter.
(lto_streamer_cache_insert_at): Likewise.
(lto_streamer_cache_append): New function.
(lto_streamer_cache_lookup): Use unsigned index.
(lto_streamer_cache_get): Likewise.
(lto_record_common_node): Don't test tree_node_can_be_shared.
(preload_common_node): Adjust call to lto_streamer_cache_insert.
(lto_streamer_cache_delete): Don't free offsets member.
* lto-streamer-out.c (eq_string_slot_node): Use memcmp.
(lto_output_string_with_length): Use lto_output_data_stream.
(lto_output_tree_header): Remove ix parameter, don't write it.
(lto_output_builtin_tree): Likewise.
(lto_write_tree): Adjust callers to above, don't track and write
offset, write unsigned index.
(output_unreferenced_globals): Don't emit all global vars.
(write_global_references): Use unsigned indices.
(lto_output_decl_state_refs): Likewise.

Re: [PATCH, ARM] PR47855 Compute attr length for some thumb2 insns

2011-03-30 Thread Carrot Wei

Hi Ramana

On Wed, Mar 30, 2011 at 6:35 AM, Ramana Radhakrishnan
ramana.radhakrish...@linaro.org wrote:
 Hi Carrot,


        How about adding an alternative only enabled for T2 that uses the `l'
 constraint and inventing new constraints for some of the constant values
 that are valid for 16 bit instructions since the `I' and `L' constraints
 have different meanings depending on whether TARGET_32BIT is valid or not ?
 We could then set the value of the length attribute accordingly. I don't
 think we can change the meaning of the I and L constraints very easily given
 the amount of churn that might be needed

You are right. Now the logic is much clearer by splitting the constraints.

        I don't think this and the other conditional branch instruction
 changes are correct. This could end up being the last instruction in an IT
 block and will automatically end up getting the 32 bit encoding and end up
 causing trouble with the length calculations. Remember the 16 bit encoding
 for the conditional instruction can't be used as the last instruction in an
 IT block.

According to arm architecture reference manual, chapter A8.6.16, neither 16 bit
nor 32 bit conditional branch can be used in IT block. Only unconditional branch
can be used as the last instruction in IT block, and it isn't related
to instruction
length. So we don't need to care about branch instruction encoding in IT block.


        (i  num_saves  (hi_reg == 0)) - what's the point of going through
 the register list when hi_reg != 0 in this case ? A comment to explain that
 the length calculation *must* be kept in sync with the template above might
 be useful.
done.

The following patch has been tested on arm qemu.


ChangeLog:
2011-03-31  Wei Guozhi  car...@google.com

PR target/47855
* config/arm/arm.md (arm_cmpsi_insn): Compute attr length.
(arm_cond_branch): Likewise.
(arm_cond_branch_reversed): Likewise.
(arm_jump): Likewise.
(push_multi): Likewise.
* config/arm/constraints.md (Py): New constraint.


Index: constraints.md
===
--- constraints.md  (revision 171337)
+++ constraints.md  (working copy)
@@ -31,7 +31,7 @@
 ;; The following multi-letter normal constraints have been used:
 ;; in ARM/Thumb-2 state: Da, Db, Dc, Dn, Dl, DL, Dv, Dy, Di, Dz
 ;; in Thumb-1 state: Pa, Pb, Pc, Pd
-;; in Thumb-2 state: Ps, Pt, Pu, Pv, Pw, Px
+;; in Thumb-2 state: Ps, Pt, Pu, Pv, Pw, Px, Py

 ;; The following memory constraints have been used:
 ;; in ARM/Thumb-2 state: Q, Ut, Uv, Uy, Un, Um, Us
@@ -189,6 +189,11 @@
   (and (match_code const_int)
(match_test TARGET_THUMB2  ival = -7  ival = -1)))

+(define_constraint Py
+  @internal In Thumb-2 state a constant in the range 0 to 255
+  (and (match_code const_int)
+   (match_test TARGET_THUMB2  ival = 0  ival = 255)))
+
 (define_constraint G
  In ARM/Thumb-2 state a valid FPA immediate constant.
  (and (match_code const_double)
Index: arm.md
===
--- arm.md  (revision 171337)
+++ arm.md  (working copy)
@@ -7109,13 +7109,21 @@

 (define_insn *arm_cmpsi_insn
   [(set (reg:CC CC_REGNUM)
-   (compare:CC (match_operand:SI 0 s_register_operand r,r)
-   (match_operand:SI 1 arm_add_operandrI,L)))]
+   (compare:CC (match_operand:SI 0 s_register_operand l,r,r,r)
+   (match_operand:SI 1 arm_add_operandPy,r,I,L)))]
   TARGET_32BIT
   @
cmp%?\\t%0, %1
+   cmp%?\\t%0, %1
+   cmp%?\\t%0, %1
cmn%?\\t%0, #%n1
-  [(set_attr conds set)]
+  [(set_attr conds set)
+   (set (attr length)
+ (if_then_else
+   (and (ne (symbol_ref TARGET_THUMB2) (const_int 0))
+   (lt (symbol_ref which_alternative) (const_int 2)))
+   (const_int 2)
+   (const_int 4)))]
 )

 (define_insn *cmpsi_shiftsi
@@ -7286,7 +7294,14 @@
   return \b%d1\\t%l0\;
   
   [(set_attr conds use)
-   (set_attr type branch)]
+   (set_attr type branch)
+   (set (attr length)
+   (if_then_else
+  (and (ne (symbol_ref TARGET_THUMB2) (const_int 0))
+   (and (ge (minus (match_dup 0) (pc)) (const_int -250))
+(le (minus (match_dup 0) (pc)) (const_int 256
+  (const_int 2)
+  (const_int 4)))]
 )

 (define_insn *arm_cond_branch_reversed
@@ -7305,7 +7320,14 @@
   return \b%D1\\t%l0\;
   
   [(set_attr conds use)
-   (set_attr type branch)]
+   (set_attr type branch)
+   (set (attr length)
+   (if_then_else
+  (and (ne (symbol_ref TARGET_THUMB2) (const_int 0))
+   (and (ge (minus (match_dup 0) (pc)) (const_int -250))
+(le (minus (match_dup 0) (pc)) (const_int 256
+  (const_int 2)
+  (const_int 4)))]
 )



@@ -7757,7 +7779,14 @@
 return \b%?\\t%l0\;
   }
   
-  [(set_attr predicable yes)]
+  [(set_attr predicable yes)
+   (set (attr length)
+   (if_then_else
+

Re: [patch, ARM] Fix PR48250, adjust DImode reload address legitimizing

2011-03-30 Thread Chung-Lin Tang

On 2011/3/30 05:28 PM, Richard Earnshaw wrote:
 
 On Wed, 2011-03-30 at 15:35 +0800, Chung-Lin Tang wrote:
 On 2011/3/30 上午 12:23, Richard Earnshaw wrote:

 On Tue, 2011-03-29 at 22:53 +0800, Chung-Lin Tang wrote:
 On 2011/3/29 下午 10:26, Richard Earnshaw wrote:
 On Tue, 2011-03-29 at 18:25 +0800, Chung-Lin Tang wrote:
 On 2011/3/24 06:51 PM, Richard Earnshaw wrote:

 On Thu, 2011-03-24 at 12:56 +0900, Chung-Lin Tang wrote:
 Hi,
 PR48250 happens under TARGET_NEON, where DImode is included within the
 valid NEON modes. This turns the range of legitimate constant indexes 
 to
 step-4 (coproc load/store), thus arm_legitimize_reload_address() when
 trying to decompose the [reg+index] reload address into
 [(reg+index_high)+index_low], can cause an ICE later when 'index_low'
 part is not aligned to 4.

 I'm not sure why the current DImode index is computed as:
 low = ((val  0xf) ^ 0x8) - 0x8;  the sign-extending into negative
 values, then subtracting back, actually creates further off indexes.
 e.g. in the supplied testcase, [sp+13] was turned into [(sp+16)-3].


 Hysterical Raisins... the code there was clearly written for the days
 before we had LDRD in the architecture.  At that time the most efficient
 way to load a 64-bit object was to use the LDM{ia,ib,da,db}
 instructions.  The computation here was (I think), intended to try and
 make the most efficient use of an add/sub instruction followed by
 LDM/STM offsetting.  At that time the architecture had no unaligned
 access either, so dealing with 64-bit that were less than 32-bit aligned
 (in those days 32-bit was the maximum alignment) probably wasn't
 considered, or couldn't even get through to reload.


 I see it now. The code in output_move_double() returning assembly for
 ldm/stm(db/da/ib) for offsets -8/-4/+4 seems to confirm this.

 I have changed the patch to let the new code handle the TARGET_LDRD case
 only.  The pre-LDRD case is still handled by the original code, with an
 additional  ~0x3 for aligning the offset to 4.

 I've also added a comment for the pre-TARGET_LDRD case. Please see if
 the description is accurate enough.

 My patch changes the index decomposing to a more straightforward way; 
 it
 also sort of outlines the way the other reload address indexes are
 broken by using and-masks, is not the most effective.  The address is
 computed by addition, subtracting away the parts to obtain low+high
 should be the optimal way of giving the largest computable index range.

 I have included a few Thumb-2 bits in the patch; I know currently
 arm_legitimize_reload_address() is only used under TARGET_ARM, but I
 guess it might eventually be turned into TARGET_32BIT.


 I think this needs to be looked at carefully on ARMv4/ARMv4T to check
 that it doesn't cause regressions there when we don't have LDRD in the
 instruction set.

 I'll be testing the modified patch under an ARMv4/ARMv4T configuration.
 Okay for trunk if no regressions?

 Thanks,
 Chung-Lin

  PR target/48250
  * config/arm/arm.c (arm_legitimize_reload_address): Adjust
  DImode constant index decomposing under TARGET_LDRD. Clear
  lower two bits for NEON, Thumb-2, and !TARGET_LDRD. Add
  comment for !TARGET_LDRD case.

 This looks technically correct, but I can't help feeling that trying to
 deal with the bottom two bits being non-zero is not really worthwhile.
 This hook is an optimization that's intended to generate better code
 than the default mechanisms that reload provides.  It is allowed to
 fail, but it must say so if it does (by returning false).

 What this hook is trying to do for DImode is to take an address of the
 form (reg + TOO_BIG_CONST) and turn it into two instructions that are
 legitimate:

   tmp = (REG + LEGAL_BIG_CONST)
   some_use_of (mem (tmp + SMALL_LEGAL_CONST))

 The idea behind the optimization is that LEGAL_BIG_CONST will need fewer
 instructions to generate than TOO_BIG_CONST.  It's unlikely (probably
 impossible in ARM state) that pushing the bottom two bits of the address
 into LEGAL_BIG_CONST part of the offset is going to lead to a better
 code sequence.  

 So I think it would be more sensible to just return false if we have a
 DImode address with the bottom 2 bits non-zero and then let the normal
 reload recovery mechanism take over.

 It is supposed to provide better utilization of the full range of
 LEGAL_BIG_CONST+SMALL_LEGAL_CONST.  I am not sure, but I guess reload
 will resolve it with the reg+LEGAL_BIG_CONST part only, using only (mem
 (reg)) for the load/store (correct me if I'm wrong).

 Also, the new code slighty improves the reloading, for example currently
 [reg+64] is broken into [(reg+72)-8], creating an additional unneeded
 reload, which is certainly not good when we have ldrd/strd available.


 Sorry, didn't mean to suggest that we don't want to do something better
 for addresses that are a multiple of 4, just that for addresses that
 aren't (at least) word-aligned that we should just bail as the code in
 that case won't benefit

71 matches

Mail list logo