Re: PING: PATCH [4/n]: Prepare x32: Permute the conversion and addition if one operand is a constant

2011-07-28 Thread Paolo Bonzini

Ok, you removed ignore_address_wrap_around, so we're almost there.

On 07/28/2011 07:59 PM, H.J. Lu wrote:

@@ -712,7 +715,16 @@ convert_modes (enum machine_mode mode, enum machine_mode 
oldmode, rtx x, int uns
if (GET_CODE (x) == SUBREG&&  SUBREG_PROMOTED_VAR_P (x)
&&  GET_MODE_SIZE (GET_MODE (SUBREG_REG (x)))>= GET_MODE_SIZE (mode)
&&  SUBREG_PROMOTED_UNSIGNED_P (x) == unsignedp)
-x = gen_lowpart (mode, x);
+{
+  temp = rtl_hooks.gen_lowpart_no_emit (mode, x);
+  if (temp)
+   x = temp;
+  else
+   {
+ gcc_assert (!no_emit);
+ x = gen_lowpart (mode, x);
+   }
+}


+{
+   /* gen_lowpart_no_emit should always succeed here.  */
+   x = rtl_hooks.gen_lowpart_no_emit (mode, x);
+}



if (GET_MODE (x) != VOIDmode)
  oldmode = GET_MODE (x);
@@ -776,6 +788,10 @@ convert_modes (enum machine_mode mode, enum machine_mode 
oldmode, rtx x, int uns
  return gen_int_mode (val, mode);
}

+  temp = rtl_hooks.gen_lowpart_no_emit (mode, x);
+  if (temp)
+   return temp;
+  gcc_assert (!no_emit);
return gen_lowpart (mode, x);


Right now, gen_lowpart_no_emit will never return NULL, so these tests in 
convert_modes are dead.  Instead, please include in your patch mine at 
http://permalink.gmane.org/gmane.comp.gcc.patches/242085 and adjust as 
follows.


+  temp = rtl_hooks.gen_lowpart_no_emit (mode, x);
+  if (no_emit)
+return rtl_hooks.gen_lowpart_no_emit (mode, x);
+  else
+return gen_lowpart (mode, x);


  }


If it does not work, PLEASE say why instead of posting another "updated 
patch".


Paolo


[google] Backport r175347 from trunk to google/gcc-4_6 (issue4835041)

2011-07-28 Thread Guozhi Wei
Hi

This trunk patch fixed a test failure in target arm. So I want to backport
it to google/gcc-4_6.

Tested on both x86 and arm with
make check-gcc RUNTESTFLAGS="--target_board=arm-sim/arch=armv7-a 
tree-ssa.exp=asm-1.c"
make check-gcc RUNTESTFLAGS="tree-ssa.exp=asm-1.c"

OK for google/gcc-4_6?

thanks
Carrot


2011-07-29  Guozhi Wei  

Backport r175347 from trunk.

2011-06-23  Jakub Jelinek  

PR testsuite/49512
* gcc.dg/tree-ssa/asm-1.c: Use -fdump-tree-optimized-nouid
instead of -fdump-tree-optimized.


Index: gcc.dg/tree-ssa/asm-1.c
===
--- gcc.dg/tree-ssa/asm-1.c (revision 176911)
+++ gcc.dg/tree-ssa/asm-1.c (working copy)
@@ -2,7 +2,7 @@
as a def.  */
 
 /* { dg-do compile } */
-/* { dg-options "-O -fdump-tree-optimized" } */
+/* { dg-options "-O -fdump-tree-optimized-nouid" } */
 
 void f()
 {

--
This patch is available for review at http://codereview.appspot.com/4835041


[PATCH] Use HOST_WIDE_INTs in gcd and least_common_multiple.

2011-07-28 Thread Sebastian Pop
Hi Richi,

On Mon, 31 Jan 2011, Richard Guenther wrote:

> On Fri, Jan 28, 2011 at 9:31 PM, Sebastian Pop  wrote:
> > Hi Joseph,
> >
> > Thanks for your careful review. =A0This patch fixes the remaining
> > problems. =A0The patch passed regstrap on amd64-linux. =A0Ok for trunk?
> I don't think adding asserts at this stage (nor including diagnostic-core
> here in general) is sensible.  Can't we do without them?

Would this patch be ok now that we are again in Stage1?
I am regstrapping this patch again on amd64-linux.

Thanks,
Sebastian

2011-01-28  Sebastian Pop  
Joseph Myers  

* Makefile.in (hwint.o): Depend on DIAGNOSTIC_CORE_H.
* hwint.c: Include diagnostic-core.h.
(abs_hwi): New.
(gcd): Moved here...
(pos_mul_hwi): New.
(mul_hwi): New.
(least_common_multiple): Moved here...
* hwint.h (gcd): ... from here.
(least_common_multiple): ... from here.
(HOST_WIDE_INT_MIN): New.
(HOST_WIDE_INT_MAX): New.
(abs_hwi): Declared.
(gcd): Declared.
(pos_mul_hwi): Declared.
(mul_hwi): Declared.
(least_common_multiple): Declared.
* omega.c (check_pos_mul): Removed.
(check_mul): Removed.
(omega_solve_geq): Use pos_mul_hwi instead of check_pos_mul and
mul_hwi instead of check_mul.
---
 gcc/ChangeLog   |   24 ++
 gcc/Makefile.in |2 +-
 gcc/hwint.c |   74 +++
 gcc/hwint.h |   37 ++
 gcc/omega.c |   47 ++-
 5 files changed, 116 insertions(+), 68 deletions(-)

diff --git a/gcc/ChangeLog b/gcc/ChangeLog
index 0f72153..33bad2c 100644
--- a/gcc/ChangeLog
+++ b/gcc/ChangeLog
@@ -1,3 +1,27 @@
+2011-01-28  Sebastian Pop  
+   Joseph Myers  
+
+   * Makefile.in (hwint.o): Depend on DIAGNOSTIC_CORE_H.
+   * hwint.c: Include diagnostic-core.h.
+   (abs_hwi): New.
+   (gcd): Moved here...
+   (pos_mul_hwi): New.
+   (mul_hwi): New.
+   (least_common_multiple): Moved here...
+   * hwint.h (gcd): ... from here.
+   (least_common_multiple): ... from here.
+   (HOST_WIDE_INT_MIN): New.
+   (HOST_WIDE_INT_MAX): New.
+   (abs_hwi): Declared.
+   (gcd): Declared.
+   (pos_mul_hwi): Declared.
+   (mul_hwi): Declared.
+   (least_common_multiple): Declared.
+   * omega.c (check_pos_mul): Removed.
+   (check_mul): Removed.
+   (omega_solve_geq): Use pos_mul_hwi instead of check_pos_mul and
+   mul_hwi instead of check_mul.
+
 2010-12-02  Sebastian Pop  
 
PR middle-end/45948
diff --git a/gcc/Makefile.in b/gcc/Makefile.in
index 9a8262a..b2f4bad 100644
--- a/gcc/Makefile.in
+++ b/gcc/Makefile.in
@@ -2820,7 +2820,7 @@ toplev.o : toplev.c $(CONFIG_H) $(SYSTEM_H) coretypes.h 
$(TM_H) $(TREE_H) \
  -DTARGET_NAME=\"$(target_noncanonical)\" \
  -c $(srcdir)/toplev.c $(OUTPUT_OPTION)
 
-hwint.o : hwint.c $(CONFIG_H) $(SYSTEM_H)
+hwint.o : hwint.c $(CONFIG_H) $(SYSTEM_H) $(DIAGNOSTIC_CORE_H)
 
 passes.o : passes.c $(CONFIG_H) $(SYSTEM_H) coretypes.h $(TM_H) $(TREE_H) \
$(RTL_H) $(FUNCTION_H) $(FLAGS_H) xcoffout.h $(INPUT_H) $(INSN_ATTR_H) 
output.h \
diff --git a/gcc/hwint.c b/gcc/hwint.c
index 85c1326..a128dc1 100644
--- a/gcc/hwint.c
+++ b/gcc/hwint.c
@@ -21,6 +21,7 @@ along with GCC; see the file COPYING3.  If not see
 
 #include "config.h"
 #include "system.h"
+#include "diagnostic-core.h"
 
 #if GCC_VERSION < 3004
 
@@ -98,3 +99,76 @@ ffs_hwi (unsigned HOST_WIDE_INT x)
 }
 
 #endif /* GCC_VERSION < 3004 */
+
+/* Compute the absolute value of X.  */
+
+HOST_WIDE_INT
+abs_hwi (HOST_WIDE_INT x)
+{
+  gcc_checking_assert (x != HOST_WIDE_INT_MIN);
+  return x >= 0 ? x : -x;
+}
+
+/* Compute the greatest common divisor of two numbers A and B using
+   Euclid's algorithm.  */
+
+HOST_WIDE_INT
+gcd (HOST_WIDE_INT a, HOST_WIDE_INT b)
+{
+  HOST_WIDE_INT x, y, z;
+
+  x = abs_hwi (a);
+  y = abs_hwi (b);
+
+  while (x > 0)
+{
+  z = y % x;
+  y = x;
+  x = z;
+}
+
+  return y;
+}
+
+/* For X and Y positive integers, return X multiplied by Y and check
+   that the result does not overflow.  */
+
+HOST_WIDE_INT
+pos_mul_hwi (HOST_WIDE_INT x, HOST_WIDE_INT y)
+{
+  if (x != 0)
+gcc_checking_assert ((HOST_WIDE_INT_MAX) / x >= y);
+
+  return x * y;
+}
+
+/* Return X multiplied by Y and check that the result does not
+   overflow.  */
+
+HOST_WIDE_INT
+mul_hwi (HOST_WIDE_INT x, HOST_WIDE_INT y)
+{
+  gcc_checking_assert (x != HOST_WIDE_INT_MIN
+  && y != HOST_WIDE_INT_MIN);
+
+  if (x >= 0)
+{
+  if (y >= 0)
+   return pos_mul_hwi (x, y);
+
+  return -pos_mul_hwi (x, -y);
+}
+
+  if (y >= 0)
+return -pos_mul_hwi (-x, y);
+
+  return pos_mul_hwi (-x, -y);
+}
+
+/* Compute the least common multiple of two numbers A and B .  */
+
+HOST_WIDE_INT
+least_common_multiple (HOST_WIDE_INT

PATCH: Add a testcase for PR rtl-optimization/47958

2011-07-28 Thread H.J. Lu
PR rtl-optimization/47958 is fixed now.  I checked in this patch
to add a testcase.

H.J.
---
Index: gcc.dg/torture/pr47958-1.c
===
--- gcc.dg/torture/pr47958-1.c  (revision 0)
+++ gcc.dg/torture/pr47958-1.c  (revision 0)
@@ -0,0 +1,13 @@
+/* { dg-do assemble } */
+
+void (*foo[6][6]) (int);
+void bar (hdR)
+int hdR;
+{ }
+void xxx ()
+{
+unsigned int i, j;
+for (i = 0; i < 6; ++i)
+   for (j = 0; j < 6; ++j)
+foo [i][j] = bar;
+}
Index: ChangeLog
===
--- ChangeLog   (revision 176913)
+++ ChangeLog   (working copy)
@@ -1,3 +1,8 @@
+2011-07-28  H.J. Lu  
+
+   PR rtl-optimization/47958
+   * gcc.dg/torture/pr47958-1.c: New.
+
 2011-07-29  Wei Guozhi  
 
PR rtl-optimization/49799


[google] Do not declare pmu and sampling rate related vars for profile-use build (issue4832042)

2011-07-28 Thread David Li
The following trivial patch will be applied to google branches.
Bootstrap and tessted on x86-64/linux

2011-07-28  David Li  

* coverage.c (coverage_init): Remove checking of profile-use
flags.

Index: coverage.c
===
--- coverage.c  (revision 176765)
+++ coverage.c  (working copy)
@@ -1952,9 +1952,10 @@ coverage_init (const char *filename, con
 static bool
 profiling_enabled_p (void)
 {
-  return flag_pmu_profile_generate || profile_arc_flag ||
-  flag_profile_generate_sampling || flag_test_coverage ||
-  flag_branch_probabilities || flag_profile_reusedist;
+  return flag_pmu_profile_generate
+   || profile_arc_flag
+   || flag_profile_generate_sampling
+   || flag_profile_reusedist;
 }
 
 /* Construct variables for PMU profiling.

--
This patch is available for review at http://codereview.appspot.com/4832042


Re: [pph] Free buffers used during tree encoding/decoding

2011-07-28 Thread Diego Novillo
On Thu, Jul 28, 2011 at 16:30, Lawrence Crowl  wrote:
> I'm getting massive failures after incorporating this change:
>
>   bytecode stream: trying to read 1735 bytes after the end of the
>   input buffer
>
> where the number of bytes changes.  Suggestions?

Odd.  I'm getting the usual results with:

$ git svn info
Path: .
URL: svn+ssh://gcc.gnu.org/svn/gcc/branches/pph/gcc
Repository Root: svn+ssh://gcc.gnu.org/svn/gcc
Repository UUID: 138bc75d-0d04-0410-961f-82ee72b054a4
Revision: 176671
Node Kind: directory
Schedule: normal
Last Changed Author: gchare
Last Changed Rev: 176671
Last Changed Date: 2011-07-22 21:04:48 -0400 (Fri, 22 Jul 2011)

Perhaps a file did not get rebuilt after you updated your tree?  That
may point to a Makefile dependency bug.  Or maybe you have some local
patch?


Diego.


Re: [RS6000] asynch exceptions and unwind info

2011-07-28 Thread Alan Modra
On Thu, Jul 28, 2011 at 12:09:51PM -0700, Richard Henderson wrote:
> Well, even if we're not able to hoist the R2 store, we may be able
> to simply add REG_CFA_OFFSET and REG_CFA_RESTORE notes to the insns
> in the stream.

You'd need to mark every non-local call with something that says
R2 may be saved, effectively duplicating md_frob_update in dwarf.
I guess that is possible even without extending our eh encoding, but
each call would have at least 6 bytes added to eh_frame:
   DW_CFA_expression, 2, 3, DW_OP_skip, offset_to_r2_prog
and you'd need to emit multiple copies of "r2_prog" for functions that
have a lot of calls, since the offset is limited to +/-32k.  I think
that would inflate the size of .eh_frame too much, and slow down
handling of exceptions dramatically.

-- 
Alan Modra
Australia Development Lab, IBM


Re: PATCH: Fix config/i386/morestack.S for x32

2011-07-28 Thread Ian Lance Taylor
"H.J. Lu"  writes:

> On Thu, Jul 28, 2011 at 1:07 PM, Richard Henderson  wrote:
>> On 07/28/2011 12:42 PM, H.J. Lu wrote:
>>> +#ifdef __LP64__
>>>       movq    %rax,%fs:0x70           # Save the new stack boundary.
>>> +#else
>>> +     movl    %eax,%fs:0x40           # Save the new stack boundary.
>>> +#endif
>>
>> Please macro-ize this.
>>
>>
>
> Here is is the updated patch.  OK for trunk?

This is OK.

Thanks.

Ian


Re: [pph] Put tinst_level list in forward order (issue4823059)

2011-07-28 Thread gchare

LGTM

http://codereview.appspot.com/4823059/


RE: PING: [PATCH, ARM, iWMMXt][5/5]: pipeline description

2011-07-28 Thread Xinyu Qi
Ping.

http://gcc.gnu.org/ml/gcc-patches/2011-07/msg01106.html

At 2011-07-14 15:50:55,"Xinyu Qi"  wrote:
> 
> > Hi,
> >
> > It is the fifth part of iWMMXt maintenance.
> >
> 
> *config/arm/t-arm (MD_INCLUDES): Add marvell-f-iwmmxt.md.
> *config/arm/marvell-f-iwmmxt.md: New file.
> *config/arm/arm.md (marvell-f-iwmmxt.md): Include.
> 
> 
> Thanks,
> Xinyu


RE: PING: [PATCH, ARM, iWMMXt][4/5]: WMMX machine description

2011-07-28 Thread Xinyu Qi
Ping.

http://gcc.gnu.org/ml/gcc-patches/2011-07/msg01105.html

At 2011-07-14 15:45:09,"Xinyu Qi"  wrote:
> > Hi,
> >
> > It is the fourth part of iWMMXt maintenance.
> >
> 
> Since "*cond_iwmmxt_movsi_insn" would be got rid of soon, I keep it unchanged.
> 
> *config/arm/arm.c (arm_output_iwmmxt_shift_immediate): New function.
>  (arm_output_iwmmxt_tinsr): Ditto.
> *config/arm/arm-protos.h (arm_output_iwmmxt_shift_immediate): New
> prototype.
>  (arm_output_iwmmxt_tinsr): Ditto.
> *config/arm/iwmmxt.md (WCGR0, WCGR1, WCGR2, WCGR3): New constant.
>  (iwmmxt_psadbw, iwmmxt_walign, iwmmxt_tmrc, iwmmxt_tmcr): Delete.
>  (iwmmxt_tbcstqi, iwmmxt_tbcsthi, iwmmxt_tbcstsi, *iwmmxt_clrv8qi,
> *iwmmxt_clrv4hi, *iwmmxt_clrv2si): Rename...
>  (tbcstv8qi, tbcstv4hi, tbsctv2si, iwmmxt_clrv8qi, iwmmxt_clrv4hi,
> iwmmxt_clrv2si): ...New pattern.
>  (*and3_iwmmxt, *ior3_iwmmxt, *xor3_iwmmxt, rori3,
> ashri3_iwmmxt, lshri3_iwmmxt, ashli3_iwmmxt,
> iwmmxt_waligni, iwmmxt_walignr, iwmmxt_walignr0, iwmmxt_walignr1,
> iwmmxt_walignr2, iwmmxt_walignr3, iwmmxt_setwcgr0, iwmmxt_setwcgr1,
> iwmmxt_setwcgr2, iwmmxt_setwcgr3, iwmmxt_getwcgr0, iwmmxt_getwcgr1,
> iwmmxt_getwcgr2, iwmmxt_getwcgr3): New pattern.
> (All instruction patterns): Add wtype attribute.
>  (*iwmmxt_arm_movdi, *iwmmxt_movsi_insn, iwmmxt_uavgrndv8qi3,
> iwmmxt_uavgrndv4hi3, iwmmxt_uavgv8qi3, iwmmxt_uavgv4hi3, iwmmxt_tinsrb,
> iwmmxt_tinsrh, iwmmxt_tinsrw, eqv8qi3, eqv4hi3, eqv2si3, gtuv8qi3, gtuv4hi3,
> gtuv2si3, gtv8qi3, gtv4hi3, gtv2si3, iwmmxt_wunpckihb, iwmmxt_wunpckihh,
> iwmmxt_wunpckihw, iwmmxt_wunpckilb, iwmmxt_wunpckilh, iwmmxt_wunpckilw,
> iwmmxt_wunpckehub, iwmmxt_wunpckehuh, iwmmxt_wunpckehuw, iwmmxt_wunpckehsb,
> iwmmxt_wunpckehsh, iwmmxt_wunpckehsw, iwmmxt_wunpckelub, iwmmxt_wunpckeluh,
> iwmmxt_wunpckeluw, iwmmxt_wunpckelsb, iwmmxt_wunpckelsh, iwmmxt_wunpckelsw,
> iwmmxt_wmadds, iwmmxt_wmaddu, iwmmxt_wsadb, iwmmxt_wsadh, iwmmxt_wsadbz,
> iwmmxt_wsadhz): Revise.
>  (iwmmxt2.md): Include.
> *config/arm/iwmmxt2.md: New file.
> *config/arm/iterators.md (VMMX2): New mode_iterator.
> *config/arm/arm.md (wtype): New attribute.
> *config/arm/t-arm (MD_INCLUDES): Add iwmmxt2.md.
> 
> Thanks,
> Xinyu


RE: PING: [PATCH, ARM, iWMMXt][3/5]: built in define and expand

2011-07-28 Thread Xinyu Qi
Ping.

http://gcc.gnu.org/ml/gcc-patches/2011-07/msg01103.html

At 2011-07-14 15:38:47,"Xinyu Qi"  wrote: 
> > Hi,
> >
> > It is the third part of iWMMXt maintenance.
> >
> 
> *config/arm/arm.c (enum arm_builtins): Built-in fcode.
>  (builtin_description bdesc_2arg): Built in declare.
>  (builtin_description bdesc_1arg): Ditto.
>  (arm_init_iwmmxt_builtins): Built in initialize.
>  (arm_expand_builtin): Built in expand.
> 
> 
> Thanks,
> Xinyu


RE: PING: [PATCH, ARM, iWMMXt][2/5]: intrinsic head file change

2011-07-28 Thread Xinyu Qi
Ping. 

http://gcc.gnu.org/ml/gcc-patches/2011-07/msg01101.html

At 2011-07-14 15:38:04,"Xinyu Qi"  wrote:
> > Hi,
> >
> > It is the second part of iWMMXt maintenance.
> 
> 
> *config/arm/mmintrin.h: Revise.
> 
> Thanks,
> Xinyu


RE: PING: [PATCH, ARM, iWMMXt][1/5]: ARM code generic change

2011-07-28 Thread Xinyu Qi
Ping.

http://gcc.gnu.org/ml/gcc-patches/2011-07/msg01100.html

At 2011-07-14 15:35:52,"Xinyu Qi"  wrote:
> > > Hi,
> > >
> > > It is the first part of iWMMXt maintenance.
> > >
> > > *config/arm/arm.c (arm_option_override):
> > >   Enable iWMMXt with VFP. iWMMXt and NEON are incompatible.
> > iWMMXt unsupported under Thumb-2 mode.
> > >   (arm_expand_binop_builtin): Accept immediate op (with mode VOID)
> > > *config/arm/arm.md:
> > >   Resettle include location of iwmmxt.md so that *arm_movdi
> > and *arm_movsi_insn could be used when iWMMXt is enabled.
> >
> > With the current work in trunk to handle enabled attributes
> > and per-alternative predicable attributes (Thanks Bernd) we
> > should be able to get rid of *cond_iwmmxt_movsi_insn"  in
> > iwmmxt.md file. It's not a matter for this patch but for a
> > follow-up patch.
> >
> > Actually we should probably do the same for the various insns
> > that are dotted around all over the place with final
> > conditions that prevent matching - atleast makes the backend
> > description slightly smaller :).
> >
> > >   Add pipeline description file include.
> >
> > It is enough to say
> >
> >  (): Include.
> >
> > in the changelog entry.
> >
> > The include for the pipeline description file should be with
> > the patch that you add this in i.e. patch #5. Please add this
> > to MD_INCLUDES in t-arm as well.
> >
> > Also as a general note, please provide a correct Changelog entry.
> >
> > This is not the format that we expect Changelog entries to be in.
> > Please look at the coding standards on the website for this
> > or at other patches submitted with respect to Changelog
> > entries. Please fix this for each patch in the patch stack.
> >
> >
> > cheers
> > Ramana
> 
> Thanks for reviewing. I have updated the patches and the Changelog.
> 
> *config/arm/arm.c (arm_option_override): Enable iWMMXt with VFP.
>  (arm_expand_binop_builtin): Accept VOIDmode op.
> *config/arm/arm.md (*arm_movdi, *arm_movsi_insn): Remove
> condition !TARGET_IWMMXT.
>  (iwmmxt.md): Include location.
> 
> Thanks,
> Xinyu


Re: [PATCH] Handle vectorization of invariant loads (PR46787)

2011-07-28 Thread H.J. Lu
On Mon, Jul 4, 2011 at 6:28 AM, H.J. Lu  wrote:
> On Wed, Jun 29, 2011 at 4:19 AM, Richard Guenther  wrote:
>>
>> The following patch makes us handle invariant loads during vectorization.
>> Dependence analysis currently isn't clever enough to disambiguate them
>> thus we insert versioning-for-alias checks.  For the testcase hoisting
>> the load is still always possible though, and for a read-after-write
>> dependence it would be possible for the vectorized loop copy as the
>> may-aliasing write is varying by the scalar variable size.
>>
>> The existing code for vectorizing invariant accesses looks very
>> suspicious - it generates a vector load at the scalar address
>> to then just extract the first vector element.  Huh.  IMHO this
>> can be simplified as done, by just re-using the scalar load result.
>> But maybe this code was supposed to deal with something entirely
>> different?
>>
>> This patch gives a 33% speedup to the phoronix himeno testcase
>> if you bump the maximum alias versioning checks we want to insert.
>>
>> I'm currently re-bootstrapping & testing this but an earlier version
>> was ok on x86_64-unknown-linux-gnu.
>>
>> 2011-06-29  Richard Guenther  
>>
>>        PR tree-optimization/46787
>>        * tree-data-ref.c (dr_address_invariant_p): Remove.
>>        (find_data_references_in_stmt): Invariant accesses are ok now.
>>        * tree-vect-stmts.c (vectorizable_load): Handle invariant
>>        loads.
>>        * tree-vect-data-refs.c (vect_analyze_data_ref_access): Allow
>>        invariant loads.
>>
>>        * gcc.dg/vect/vect-121.c: New testcase.
>>
>
> This also caused:
>
> http://gcc.gnu.org/bugzilla/show_bug.cgi?id=49628
>
>

It also caused:

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=49893


-- 
H.J.


Re: [PATCH] PR49799: Don't generate illegal bit field extraction instruction

2011-07-28 Thread Carrot Wei
According to Richard, -march=armv7-a is not required.

So the finally installed is:

Index: gcc/ChangeLog
===
--- gcc/ChangeLog   (revision 176910)
+++ gcc/ChangeLog   (working copy)
@@ -1,3 +1,9 @@
+2011-07-29  Wei Guozhi  
+
+   PR rtl-optimization/49799
+   * combine.c (make_compound_operation): Check if the bit field is valid
+   before change it to bit field extraction.
+
 2011-07-29  Bernd Schmidt  

PR rtl-optimization/49891
Index: gcc/testsuite/gcc.dg/pr49799.c
===
--- gcc/testsuite/gcc.dg/pr49799.c  (revision 0)
+++ gcc/testsuite/gcc.dg/pr49799.c  (revision 0)
@@ -0,0 +1,25 @@
+/* PR rtl-optimization/49799 */
+/* { dg-do assemble } */
+/* { dg-options "-O2 -w" } */
+
+static __inline int bar(int a)
+{
+int tmp;
+
+if (a <= 0) a ^= 0x;
+
+return tmp - 1;
+}
+
+void foo(short *K)
+{
+short tmp;
+short *pptr, P[14];
+
+pptr = P;
+tmp = bar(*K);
+*pptr = (*K << tmp) >> 16;
+
+if (*P < tmp)
+*K++ = 0;
+}
Index: gcc/testsuite/ChangeLog
===
--- gcc/testsuite/ChangeLog (revision 176910)
+++ gcc/testsuite/ChangeLog (working copy)
@@ -1,3 +1,8 @@
+2011-07-29  Wei Guozhi  
+
+   PR rtl-optimization/49799
+   * gcc.dg/pr49799.c: New test case.
+
 2011-07-22  Sebastian Pop  

PR middle-end/48648
Index: gcc/combine.c
===
--- gcc/combine.c   (revision 176910)
+++ gcc/combine.c   (working copy)
@@ -7787,6 +7787,7 @@ make_compound_operation (rtx x, enum rtx
  && GET_CODE (lhs) == ASHIFT
  && CONST_INT_P (XEXP (lhs, 1))
  && INTVAL (rhs) >= INTVAL (XEXP (lhs, 1))
+ && INTVAL (XEXP (lhs, 1)) >= 0
  && INTVAL (rhs) < mode_width)
{
  new_rtx = make_compound_operation (XEXP (lhs, 0), next_code);

thanks a lot.
Carrot

On Thu, Jul 28, 2011 at 4:47 PM, Jakub Jelinek  wrote:
> On Thu, Jul 28, 2011 at 04:40:53PM +0800, Carrot Wei wrote:
>> ChangeLog:
>> 2011-07-28  Wei Guozhi  
>>
>>         PR rtl-optimization/49799
>>         * pr49799.c : New test case.
>
> Space shouldn't be between .c and :.  And the filename should be
> relative to gcc/testsuite/ dir, so either gcc.target/arm/pr49799.c, or
> better gcc.dg/pr49799.c.
>
> Putting the testcase just into gcc.target/arm means it won't be tested
> on other targets, while there is nothing arm specific about the testcase
> except that you force -march in dg-options for arm.
> You can do that with
> /* PR rtl-optimization/49799 */
> /* { dg-do assemble } */
> /* { dg-options "-O2 -w" } */
> /* { dg-options "-O2 -w -march=armv7-a" { target arm*-*-* } } */
> or similar.
>
> Ok with those changes.
>
>        Jakub
>


Re: [RS6000] asynch exceptions and unwind info

2011-07-28 Thread Alan Modra
On Thu, Jul 28, 2011 at 11:49:16AM -0700, Richard Henderson wrote:
> On 07/28/2011 12:27 AM, Alan Modra wrote:
> > On Wed, Jul 27, 2011 at 03:00:45PM +0930, Alan Modra wrote:
> >> Ideally what I'd like to
> >> do is have ld and gcc emit accurate r2 tracking unwind info and
> >> dispense with hacks like frob_update_context.  If ld did emit accurate
> >> unwind info for .glink, then the justification for frob_update_context
> >> disappears.
> > 
> > For the record, this statement of mine doesn't make sense.  A .glink
> > stub doesn't make a frame, so a backtrace won't normally pass through a
> > stub, thus having accurate unwind info for .glink doesn't help at all.
> 
> It does, for the duration of the stub.

Right, but I was talking about the normal case, where the unwinder
won't even look at .glink unwind info.

> The whole problem is that toc pointer copy in 40(1) is only valid
> during indirect call sequences, and iff ld inserted a stub?  I.e.
> direct calls between functions that share toc pointers never save
> the copy?

Yes.

> Would it make sense, if a function has any indirect call, to move
> the toc pointer save into the prologue?  You'd get to avoid that
> store all the time.  Of course you'd not be able to sink the load
> after the call, but it might still be a win.  And in that special
> case you can annotate the r2 save slot just once, correctly.

Except that any info about r2 in an indirect call sequence really
belongs to the *called* function frame, not the callee.  I woke up
this morning with the realization that what I'd done in
frob_update_context for indirect call sequences was wrong.  Ditto for
the r2 store that Michael moved into the prologue.  The only time we
want the unwinder to restore from that particular save is if r2 isn't
saved in the current frame.

Untested patch follows.

libgcc/
* config/rs6000/linux-unwind.h (frob_update_context <__powerpc64__>):
Restore for indirect call bcrtl from correct stack slot, and only
if cfa+40 isn't valid.
gcc/
* config/rs6000/rs6000-protos.h (rs6000_save_toc_in_prologue_p): Delete.
* config/rs6000/rs6000.c (rs6000_save_toc_in_prologue_p): Make static.
(rs6000_emit_prologue): Don't emit eh_frame info in
save_toc_in_prologue case.
(rs6000_call_indirect_aix): Formatting.

Index: libgcc/config/rs6000/linux-unwind.h
===
--- libgcc/config/rs6000/linux-unwind.h (revision 176905)
+++ libgcc/config/rs6000/linux-unwind.h (working copy)
@@ -354,20 +354,22 @@ frob_update_context (struct _Unwind_Cont
  /* We are in a plt call stub or r2 adjusting long branch stub,
 before r2 has been saved.  Keep REG_UNSAVED.  */
}
-  else if (pc[0] == 0x4E800421
-  && pc[1] == 0xE8410028)
-   {
- /* We are at the bctrl instruction in a call via function
-pointer.  gcc always emits the load of the new r2 just
-before the bctrl.  */
- _Unwind_SetGRPtr (context, 2, context->cfa + 40);
-   }
   else
{
  unsigned int *insn
= (unsigned int *) _Unwind_GetGR (context, R_LR);
  if (insn && *insn == 0xE8410028)
_Unwind_SetGRPtr (context, 2, context->cfa + 40);
+ else if (pc[0] == 0x4E800421
+  && pc[1] == 0xE8410028)
+   {
+ /* We are at the bctrl instruction in a call via function
+pointer.  gcc always emits the load of the new R2 just
+before the bctrl so this is the first and only place
+we need to use the stored R2.  */
+ _Unwind_Word sp = _Unwind_GetGR (context, 1);
+ _Unwind_SetGRPtr (context, 2, sp + 40);
+   }
}
 }
 #endif
Index: gcc/config/rs6000/rs6000-protos.h
===
--- gcc/config/rs6000/rs6000-protos.h   (revision 176905)
+++ gcc/config/rs6000/rs6000-protos.h   (working copy)
@@ -172,8 +172,6 @@ extern void rs6000_emit_epilogue (int);
 extern void rs6000_emit_eh_reg_restore (rtx, rtx);
 extern const char * output_isel (rtx *);
 extern void rs6000_call_indirect_aix (rtx, rtx, rtx);
-extern bool rs6000_save_toc_in_prologue_p (void);
-
 extern void rs6000_aix_asm_output_dwarf_table_ref (char *);
 
 /* Declare functions in rs6000-c.c */
Index: gcc/config/rs6000/rs6000.c
===
--- gcc/config/rs6000/rs6000.c  (revision 176905)
+++ gcc/config/rs6000/rs6000.c  (working copy)
@@ -1178,6 +1178,7 @@ static void rs6000_conditional_register_
 static void rs6000_trampoline_init (rtx, tree, rtx);
 static bool rs6000_cannot_force_const_mem (enum machine_mode, rtx);
 static bool rs6000_legitimate_constant_p (enum machine_mode, rtx);
+static bool rs6000_save_toc_in_prologue_p (void);
 
 /* Hash table stuff for keeping track of TOC entries.  */
 
@@ -20536,8 +20562,11 

Re: PATCH: PR target/47715: [x32] Use SImode for thread pointer

2011-07-28 Thread H.J. Lu
On Thu, Jul 28, 2011 at 4:25 PM, H.J. Lu  wrote:
> On Thu, Jul 28, 2011 at 3:46 PM, H.J. Lu  wrote:
>> On Thu, Jul 28, 2011 at 3:40 PM, Uros Bizjak  wrote:
>>> On Fri, Jul 29, 2011 at 12:28 AM, H.J. Lu  wrote:
>>>
 TP is 32bit in x32  For load_tp_x32, we load SImode value and
 zero-extend to DImode. For add_tp_x32, we are adding SImode
 value.  We can't pretend TP is 64bit.  load_tp_x32 and add_tp_x32
 must take SImode TP.
>
>> Here is the revised patch.  The difference is I changed *add_tp_x32 to 
>> SImode.
>> For
>>
>> ---
>> extern __thread int __libc_errno __attribute__ ((tls_model 
>> ("initial-exec")));
>>
>> int *
>> __errno_location (void)
>> {
>>  return &__libc_errno;
>> }
>> ---
>>
>> compiled with -mx32 -O2 -fPIC  DImode *add_tp_x32 generates:
>>
>>        movq    __libc_errno@gottpoff(%rip), %rax
>>        addl    %fs:0, %eax
>>        mov     %eax, %eax
>>        ret
>>
>> SImode *add_tp_x32 generates:
>>
>>        movl    %fs:0, %eax
>>        addl    __libc_errno@gottpoff(%rip), %eax
>>        ret
>
> This happens because combine can't combine DImode load and SImode plus
> RTXes. These RTXes have to be in Pmode, see the intention in
> legitimize_tls_address, also for TARGET_GNU2_TLS.
>
> Can you please debug what goes wrong with tp_add_x32 in DImode?
>

 We start with
>>>
>>> Uh, we didn't understand each other... can you please debug what goes
>>> wrong with glibc runtime test?
>>>
>>
>> I haven't be able to reproduce it.  I may have a typo when I fixed the
>> pattern.  I will try again.
>>
>
> I can't reproduce it any more.
>

I know why. I also have

(*load_tp_): Disabled for TARGET_X32.
(*add_tp_): Likewise.

since the SImode version is wrong for x32 due to

(define_mode_attr tp_seg [(SI "gs") (DI "fs")])

x32 uses fs even if it is 32bit.

-- 
H.J.


Re: [C++0x] contiguous bitfields race implementation

2011-07-28 Thread Jason Merrill

On 07/28/2011 04:40 AM, Richard Guenther wrote:

field).  Now, the question is of course what to do for DECL_PACKED
fields (I suppose, simply ignore the C++ memory model as C++ doesn't
have a notion of packed or specially (mis-)aligned structs or bitfields).


I think treat them as bitfields for this purpose.

Jason



Re: [pph] Free buffers used during tree encoding/decoding

2011-07-28 Thread Lawrence Crowl
I'm getting massive failures after incorporating this change:

   bytecode stream: trying to read 1735 bytes after the end of the
   input buffer

where the number of bytes changes.  Suggestions?


On 7/28/11, Diego Novillo  wrote:
> Noticed this while debugging the new tree encoding cache.  No
> functional changes.  This frees the memory used by the buffers
> used during tree streaming.  It also moves the reader and writer
> data into a union to better distinguish them.
>
> Tested on x86_64.
>
>
> Diego.
>
>
>   * pph-streamer.h (pph_stream): Move fields OB, OUT_STATE,
>   DECL_STATE_STREAM, IB, DATA_IN, PPH_SECTIONS, FILE_DATA and
>   FILE_SIZE into a union of structures.
>   Update all users.
>   * pph-streamer.c (pph_stream_close): Free memory used by tree
>   encoding routines.
>
> Index: cp/pph-streamer-in.c
> ===
> --- cp/pph-streamer-in.c  (revision 176879)
> +++ cp/pph-streamer-in.c  (working copy)
> @@ -109,8 +109,8 @@ pph_get_section_data (struct lto_file_de
>  {
>/* FIXME pph - Stop abusing lto_file_decl_data fields.  */
>const pph_stream *stream = (const pph_stream *) file_data->file_name;
> -  *len = stream->file_size - sizeof (pph_file_header);
> -  return (const char *) stream->file_data + sizeof (pph_file_header);
> +  *len = stream->encoder.r.file_size - sizeof (pph_file_header);
> +  return (const char *) stream->encoder.r.file_data + sizeof
> (pph_file_header);
>  }
>
>
> @@ -119,14 +119,14 @@ pph_get_section_data (struct lto_file_de
>
>  static void
>  pph_free_section_data (struct lto_file_decl_data *file_data,
> -enum lto_section_type section_type ATTRIBUTE_UNUSED,
> -const char *name ATTRIBUTE_UNUSED,
> -const char *offset ATTRIBUTE_UNUSED,
> -size_t len ATTRIBUTE_UNUSED)
> +enum lto_section_type section_type ATTRIBUTE_UNUSED,
> +const char *name ATTRIBUTE_UNUSED,
> +const char *offset ATTRIBUTE_UNUSED,
> +size_t len ATTRIBUTE_UNUSED)
>  {
>/* FIXME pph - Stop abusing lto_file_decl_data fields.  */
>const pph_stream *stream = (const pph_stream *) file_data->file_name;
> -  free (stream->file_data);
> +  free (stream->encoder.r.file_data);
>  }
>
>
> @@ -145,46 +145,48 @@ pph_init_read (pph_stream *stream)
>
>lto_reader_init ();
>
> -  /* Read STREAM->NAME into the memory buffer STREAM->FILE_DATA.
> - FIXME pph, we are reading the whole file at once.  This seems
> - wasteful.  */
> +  /* Read STREAM->NAME into the memory buffer stream->encoder.r.file_data.
> */
>retcode = fstat (fileno (stream->file), &st);
>gcc_assert (retcode == 0);
> -  stream->file_size = (size_t) st.st_size;
> -  stream->file_data = XCNEWVEC (char, stream->file_size);
> -  bytes_read = fread (stream->file_data, 1, stream->file_size,
> stream->file);
> -  gcc_assert (bytes_read == stream->file_size);
> +  stream->encoder.r.file_size = (size_t) st.st_size;
> +  stream->encoder.r.file_data = XCNEWVEC (char,
> stream->encoder.r.file_size);
> +  bytes_read = fread (stream->encoder.r.file_data, 1,
> +   stream->encoder.r.file_size, stream->file);
> +  gcc_assert (bytes_read == stream->encoder.r.file_size);
>
>/* Set LTO callbacks to read the PPH file.  */
> -  stream->pph_sections = XCNEWVEC (struct lto_file_decl_data *,
> -PPH_NUM_SECTIONS);
> +  stream->encoder.r.pph_sections = XCNEWVEC (struct lto_file_decl_data *,
> +  PPH_NUM_SECTIONS);
>for (i = 0; i < PPH_NUM_SECTIONS; i++)
>  {
> -  stream->pph_sections[i] = XCNEW (struct lto_file_decl_data);
> +  stream->encoder.r.pph_sections[i] = XCNEW (struct
> lto_file_decl_data);
>/* FIXME pph - Stop abusing fields in lto_file_decl_data.  */
> -  stream->pph_sections[i]->file_name = (const char *) stream;
> +  stream->encoder.r.pph_sections[i]->file_name = (const char *) stream;
>  }
>
> -  lto_set_in_hooks (stream->pph_sections, pph_get_section_data,
> +  lto_set_in_hooks (stream->encoder.r.pph_sections, pph_get_section_data,
>   pph_free_section_data);
>
> -  header = (pph_file_header *) stream->file_data;
> +  header = (pph_file_header *) stream->encoder.r.file_data;
>strtab = (const char *) header + sizeof (pph_file_header);
>strtab_size = header->strtab_size;
>body = strtab + strtab_size;
> -  gcc_assert (stream->file_size >= strtab_size + sizeof (pph_file_header));
> -  body_size = stream->file_size - strtab_size - sizeof (pph_file_header);
> +  gcc_assert (stream->encoder.r.file_size
> +   >= strtab_size + sizeof (pph_file_header));
> +  body_size = stream->encoder.r.file_size
> +   - strtab_size - sizeof (pph_file_header);
>
>/* Create an input block structure pointing right after the string
>   table.  */
> -  stre

Re: PATCH: PR target/47715: [x32] Use SImode for thread pointer

2011-07-28 Thread H.J. Lu
On Thu, Jul 28, 2011 at 3:46 PM, H.J. Lu  wrote:
> On Thu, Jul 28, 2011 at 3:40 PM, Uros Bizjak  wrote:
>> On Fri, Jul 29, 2011 at 12:28 AM, H.J. Lu  wrote:
>>
>>> TP is 32bit in x32  For load_tp_x32, we load SImode value and
>>> zero-extend to DImode. For add_tp_x32, we are adding SImode
>>> value.  We can't pretend TP is 64bit.  load_tp_x32 and add_tp_x32
>>> must take SImode TP.

> Here is the revised patch.  The difference is I changed *add_tp_x32 to 
> SImode.
> For
>
> ---
> extern __thread int __libc_errno __attribute__ ((tls_model 
> ("initial-exec")));
>
> int *
> __errno_location (void)
> {
>  return &__libc_errno;
> }
> ---
>
> compiled with -mx32 -O2 -fPIC  DImode *add_tp_x32 generates:
>
>        movq    __libc_errno@gottpoff(%rip), %rax
>        addl    %fs:0, %eax
>        mov     %eax, %eax
>        ret
>
> SImode *add_tp_x32 generates:
>
>        movl    %fs:0, %eax
>        addl    __libc_errno@gottpoff(%rip), %eax
>        ret

 This happens because combine can't combine DImode load and SImode plus
 RTXes. These RTXes have to be in Pmode, see the intention in
 legitimize_tls_address, also for TARGET_GNU2_TLS.

 Can you please debug what goes wrong with tp_add_x32 in DImode?

>>>
>>> We start with
>>
>> Uh, we didn't understand each other... can you please debug what goes
>> wrong with glibc runtime test?
>>
>
> I haven't be able to reproduce it.  I may have a typo when I fixed the
> pattern.  I will try again.
>

I can't reproduce it any more.


-- 
H.J.


[pph] Put tinst_level list in forward order (issue4823059)

2011-07-28 Thread Lawrence Crowl
Place the tinst_level list in forward order.

Tested on x64.


Index: gcc/cp/ChangeLog.pph

2011-07-28   Lawrence Crowl  

* pt.c (pph_in_tinst_level): Put tinst_level list in forward order.


Index: gcc/cp/pt.c
===
--- gcc/cp/pt.c (revision 176905)
+++ gcc/cp/pt.c (working copy)
@@ -19738,20 +19738,24 @@ pph_dump_tinst_level (FILE *stream, stru
 static struct tinst_level *
 pph_in_tinst_level (pph_stream *stream)
 {
+  struct tinst_level *head = NULL;
   struct tinst_level *last = NULL;
   unsigned count = pph_in_uint (stream);
-  /* FIXME pph: This leaves the list in reverse order.  Issue?  */
   for (; count > 0; --count)
 {
   struct tinst_level *cur = ggc_alloc_tinst_level ();
-  cur->next = last;
+  cur->next = NULL;
   cur->decl = pph_in_tree (stream);
   cur->locus = pph_in_location (stream);
   cur->errors = pph_in_uint (stream);
   cur->in_system_header_p = pph_in_uint (stream);
+  if (head == NULL)
+  head = cur;
+  else
+  last->next = cur;
   last = cur;
 }
-  return last;
+  return head;
 }
 
 

--
This patch is available for review at http://codereview.appspot.com/4823059


Re: PATCH: PR target/47715: [x32] Use SImode for thread pointer

2011-07-28 Thread H.J. Lu
On Thu, Jul 28, 2011 at 3:40 PM, Uros Bizjak  wrote:
> On Fri, Jul 29, 2011 at 12:28 AM, H.J. Lu  wrote:
>
>> TP is 32bit in x32  For load_tp_x32, we load SImode value and
>> zero-extend to DImode. For add_tp_x32, we are adding SImode
>> value.  We can't pretend TP is 64bit.  load_tp_x32 and add_tp_x32
>> must take SImode TP.
>>>
 Here is the revised patch.  The difference is I changed *add_tp_x32 to 
 SImode.
 For

 ---
 extern __thread int __libc_errno __attribute__ ((tls_model 
 ("initial-exec")));

 int *
 __errno_location (void)
 {
  return &__libc_errno;
 }
 ---

 compiled with -mx32 -O2 -fPIC  DImode *add_tp_x32 generates:

        movq    __libc_errno@gottpoff(%rip), %rax
        addl    %fs:0, %eax
        mov     %eax, %eax
        ret

 SImode *add_tp_x32 generates:

        movl    %fs:0, %eax
        addl    __libc_errno@gottpoff(%rip), %eax
        ret
>>>
>>> This happens because combine can't combine DImode load and SImode plus
>>> RTXes. These RTXes have to be in Pmode, see the intention in
>>> legitimize_tls_address, also for TARGET_GNU2_TLS.
>>>
>>> Can you please debug what goes wrong with tp_add_x32 in DImode?
>>>
>>
>> We start with
>
> Uh, we didn't understand each other... can you please debug what goes
> wrong with glibc runtime test?
>

I haven't be able to reproduce it.  I may have a typo when I fixed the
pattern.  I will try again.



-- 
H.J.


Re: PATCH: PR target/47715: [x32] Use SImode for thread pointer

2011-07-28 Thread Uros Bizjak
On Fri, Jul 29, 2011 at 12:28 AM, H.J. Lu  wrote:

> TP is 32bit in x32  For load_tp_x32, we load SImode value and
> zero-extend to DImode. For add_tp_x32, we are adding SImode
> value.  We can't pretend TP is 64bit.  load_tp_x32 and add_tp_x32
> must take SImode TP.
>>
>>> Here is the revised patch.  The difference is I changed *add_tp_x32 to 
>>> SImode.
>>> For
>>>
>>> ---
>>> extern __thread int __libc_errno __attribute__ ((tls_model 
>>> ("initial-exec")));
>>>
>>> int *
>>> __errno_location (void)
>>> {
>>>  return &__libc_errno;
>>> }
>>> ---
>>>
>>> compiled with -mx32 -O2 -fPIC  DImode *add_tp_x32 generates:
>>>
>>>        movq    __libc_errno@gottpoff(%rip), %rax
>>>        addl    %fs:0, %eax
>>>        mov     %eax, %eax
>>>        ret
>>>
>>> SImode *add_tp_x32 generates:
>>>
>>>        movl    %fs:0, %eax
>>>        addl    __libc_errno@gottpoff(%rip), %eax
>>>        ret
>>
>> This happens because combine can't combine DImode load and SImode plus
>> RTXes. These RTXes have to be in Pmode, see the intention in
>> legitimize_tls_address, also for TARGET_GNU2_TLS.
>>
>> Can you please debug what goes wrong with tp_add_x32 in DImode?
>>
>
> We start with

Uh, we didn't understand each other... can you please debug what goes
wrong with glibc runtime test?

Thanks,
Uros.


Re: [PATCH 4/6] Shrink-wrapping

2011-07-28 Thread Richard Earnshaw
On 28/07/11 11:35, Bernd Schmidt wrote:
> On 07/21/11 11:52, Richard Sandiford wrote:
>> The name "active_insn_after" seems a bit too similar to "next_active_insn"
>> for the difference to be obvious.  How about something like
>> "first_active_target_insn" instead?
> 
> Changed.
>>> -  for (; insn; insn = next)
>>> +  for (; insn && !ANY_RETURN_P (insn); insn = next)
>>>  {
>>>if (NONJUMP_INSN_P (insn) && GET_CODE (PATTERN (insn)) == SEQUENCE)
>>> insn = XVECEXP (PATTERN (insn), 0, 0);
>>
>> Since ANY_RETURN looks for patterns, while this loop iterates over insns,
>> I think it'd be more obvious to have:
>>
>>   if (insn && ANY_RETURN_P (insn))
>> return 1;
>>
>> above the loop instead
> 
> That alone wouldn't work since we assign JUMP_LABELs to next. Left alone
> for now.
> 
>>> --- gcc/jump.c  (revision 176230)
>>> +++ gcc/jump.c  (working copy)
>>> @@ -1217,7 +1217,7 @@ delete_related_insns (rtx insn)
>>
>> Given what you said above, and given that this is a public function,
>> I think we should keep the null check.
> 
> Changed.
>>
>> This pattern came up in reorg.c too, so maybe it would be worth having
>> a jump_to_label_p inline function somewhere, such as:
> 
> Done. Only has two uses for now though; reorg.c uses different patterns
> mostly.
> 
>> It looks like the old code tried to allow returns to be redirected
>> to a label -- (return) to (set (pc) (label_ref)) -- whereas the new
>> code doesn't. [...]
>>
>> How about:
>>
>>   x = redirect_target (nlabel);
>>   if (GET_CODE (x) == LABEL_REF && loc == &PATTERN (insn))
>>  x = gen_rtx_SET (VOIDmode, pc_rtx, x);
>>   validate_change (insn, loc, x, 1);
> 
> Changed, although this probably isn't a useful thing to allow (it will
> just add one more unnecessary jump to the code?).
> 
> [ifcvt changes]
>> I found the placement of this code a bit confusing as things stand.
>> new_dest_label is only meaningful if other_bb != new_dest, so it seemed
>> like something that should directly replace the existing new_label
>> assignment.  It's OK if it makes the shrink-wrap stuff easier though.
> 
> Changed.
> 
>>> @@ -1195,6 +1195,9 @@ duplicate_insn_chain (rtx from, rtx to)
>>>   break;
>>> }
>>>   copy = emit_copy_of_insn_after (insn, get_last_insn ());
>>> + if (JUMP_P (insn) && JUMP_LABEL (insn) != NULL_RTX
>>> + && ANY_RETURN_P (JUMP_LABEL (insn)))
>>> +   JUMP_LABEL (copy) = JUMP_LABEL (insn);
>>
>> I think this should go in emit_copy_of_insn_after instead.
> 
> Here I'd like to avoid modifying the existing code in
> emit_copy_of_insn_after if possible. Not sure why it's not copying
> JUMP_LABELS, but that's something I'd prefer to investigate at some
> other time rather than risk breaking things.
> 
>>> @@ -2294,6 +2294,8 @@ create_cfi_notes (void)
>>>   dwarf2out_frame_debug (insn, false);
>>>   continue;
>>> }
>>> +  if (GET_CODE (pat) == ADDR_VEC || GET_CODE (pat) == ADDR_DIFF_VEC)
>>> +   continue;
>>>  
>>>if (GET_CODE (pat) == SEQUENCE)
>>> {
>>
>> rth better approve this bit...
> 
> It went away.
> 
> New patch below. Retested on i686-linux and mips64-elf. Ok?
> 
> 
> Bernd


This causes http://gcc.gnu.org/bugzilla/show_bug.cgi?id=49891

R.




Re: PATCH: PR target/47715: [x32] Use SImode for thread pointer

2011-07-28 Thread H.J. Lu
On Thu, Jul 28, 2011 at 3:03 PM, Uros Bizjak  wrote:
> On Thu, Jul 28, 2011 at 10:15 PM, H.J. Lu  wrote:
>
 TP is 32bit in x32  For load_tp_x32, we load SImode value and
 zero-extend to DImode. For add_tp_x32, we are adding SImode
 value.  We can't pretend TP is 64bit.  load_tp_x32 and add_tp_x32
 must take SImode TP.
>
>> Here is the revised patch.  The difference is I changed *add_tp_x32 to 
>> SImode.
>> For
>>
>> ---
>> extern __thread int __libc_errno __attribute__ ((tls_model 
>> ("initial-exec")));
>>
>> int *
>> __errno_location (void)
>> {
>>  return &__libc_errno;
>> }
>> ---
>>
>> compiled with -mx32 -O2 -fPIC  DImode *add_tp_x32 generates:
>>
>>        movq    __libc_errno@gottpoff(%rip), %rax
>>        addl    %fs:0, %eax
>>        mov     %eax, %eax
>>        ret
>>
>> SImode *add_tp_x32 generates:
>>
>>        movl    %fs:0, %eax
>>        addl    __libc_errno@gottpoff(%rip), %eax
>>        ret
>
> This happens because combine can't combine DImode load and SImode plus
> RTXes. These RTXes have to be in Pmode, see the intention in
> legitimize_tls_address, also for TARGET_GNU2_TLS.
>
> Can you please debug what goes wrong with tp_add_x32 in DImode?
>

We start with

insn 5 2 6 2 (set (reg:DI 63)
(unspec:DI [
(const_int 0 [0])
] UNSPEC_TP)) err.i:6 716 {*load_tp_x32}
 (nil))

(insn 6 5 7 2 (set (reg:DI 64)
(mem/u/c:DI (const:DI (unspec:DI [
(symbol_ref:SI ("__libc_errno") [flags 0x60]  )
] UNSPEC_GOTNTPOFF)) [0 S8 A8])) err.i:6 62 {*movdi_internal
_rex64}
 (nil))

(insn 7 6 8 2 (parallel [
(set (reg:DI 65)
(plus:DI (reg:DI 63)
(reg:DI 64)))
(clobber (reg:CC 17 flags))
]) err.i:6 251 {*adddi_1}
 (expr_list:REG_DEAD (reg:DI 64)
(expr_list:REG_DEAD (reg:DI 63)
(expr_list:REG_UNUSED (reg:CC 17 flags)
(nil)

With DImode *add_tp_x32, combine matches

(parallel [
(set (reg:DI 65)
(plus:DI (unspec:DI [
(const_int 0 [0])
] UNSPEC_TP)
(reg:DI 64)))
(clobber (reg:CC 17 flags))
])

first.  It turns out that it isn't a very good choice since
it prevents combine to try

(parallel [
(set (reg:DI 65)
(plus:DI (reg:DI 63)
(mem/u/c:DI (const:DI (unspec:DI [
(symbol_ref:SI ("__libc_errno") [flags
0x60]  )
] UNSPEC_GOTNTPOFF)) [0 S8 A8])))
(clobber (reg:CC 17 flags))
])

I am not sure how useful *add_tp_XXX pattern is.


-- 
H.J.


Re: [patch] arm,rx: don't ICE on naked functions with local vars

2011-07-28 Thread DJ Delorie

Thanks!  Committed.


Re: [pph] Save pending and specialized templates (issue4814054)

2011-07-28 Thread Lawrence Crowl
On 7/26/11, Gabriel Charette  wrote:
>> +/* Load a tinst_level list.  */
>> +
>> +static struct tinst_level *
>> +pph_in_tinst_level (pph_stream *stream)
>> +{
>> +  struct tinst_level *last = NULL;
>> +  unsigned count = pph_in_uint (stream);
>> +  /* FIXME pph: This leaves the list in reverse order.  Issue?  */
>> +  for (; count > 0; --count)
>> +{
>> +  struct tinst_level *cur = ggc_alloc_tinst_level ();
>> +  cur->next = last;
>
> -  cur->next = last;
> + if(last)
> +   last->next = cur;
>
>> +  cur->decl = pph_in_tree (stream);
>> +  cur->locus = pph_in_location (stream);
>> +  cur->errors = pph_in_uint (stream);
>> +  cur->in_system_header_p = pph_in_uint (stream);
>> +  last = cur;
>> +}
>
> if (last)
>   last->next = NULL;
>
>> +  return last;
>> +}
>> +
>
> If you do the changes above, the list won't be in reverse order.

That code loses the pointer to the head of the list.  However, I
take your point and will avoid the comment by doing the forward list.

> Also if you do it as I did in pph_in_cxx_binding, using the cache
> markers, you do need to output count, you can simply use the cache
> mechanism which will know how to output/input NULL (I used that in
> pph_in/out_cxx_binding).

I think I like the count better in this instance.

>> +/* Load and merge a spec_entry TABLE from STREAM.  */
>> +
>> +static void
>> +pph_in_spec_entry_htab (pph_stream *stream, htab_t *table)
>> +{
>> +  spec_entry **slot = NULL;
>
> This variable is shadowed by
>
>> +  unsigned count = pph_in_uint (stream);
>> +  if (flag_pph_debug >= 2)
>> +fprintf (stderr, "loading %d spec_entries\n", count );
>> +  for (; count > 0; --count)
>> +{
>> +  hashval_t hash;
>> +  spec_entry **slot;
>
> ...this one??
>
>> +  struct spec_entry *se = ggc_alloc_spec_entry ();
>> +  se->tmpl = pph_in_tree (stream);
>> +  se->args = pph_in_tree (stream);
>> +  se->spec = pph_in_tree (stream);
>> +  hash = hash_specialization (se);
>> +  slot = htab_find_slot_with_hash (*table, se, hash, INSERT);
>> +  *slot = se;
>> +}
>> +}

I must have removed that earlier.

>> +/* Read and return a location_t from STREAM.  */
>> +static inline location_t
>> +pph_in_location (pph_stream *stream)
>> +{
>> +  location_t loc = lto_input_location (stream->ib, stream->data_in);
>> +  if (flag_pph_tracer >= 4)
>> +pph_trace_location (stream, loc);
>> +  return loc;
>> +}
>> +
>
> I was actually going to create pph_in/out_location. I
> will need to make them a streamer_hook so that all calls to
> lto_input/output_location actually are redirected to those when
> lto wants to in/out a source_location. Of course it would no
> longer call lto_input/output_location, rather it would output
> the source_location directly, and apply the calculated offset
> (discussed in "Linemap and pph") on the way in. It's great that
> you implemented a tracer for this as it will be easy for me to
> debug my implementation of this, thanks!
>
> Is there anything you implementation depends on I should be careful
> about, from what I see, it doesn't look like it (as long as the
> source_location you get back from pph_in_location are correct).

No particular dependences.  The templates needed locations but
weren't in trees.  They aren't doing anything special with them
that I saw.

>> +extern void lto_output_location (struct output_block *ob, location_t loc);
>
> lto_input/ouput will again no longer need to be extern when I make
> my patch for pph to handle those differently, I will make sure
> (remind me if I don't) to remove this change.

Okay.

-- 
Lawrence Crowl


Re: PATCH: PR target/47715: [x32] Use SImode for thread pointer

2011-07-28 Thread Uros Bizjak
On Thu, Jul 28, 2011 at 10:15 PM, H.J. Lu  wrote:

>>> TP is 32bit in x32  For load_tp_x32, we load SImode value and
>>> zero-extend to DImode. For add_tp_x32, we are adding SImode
>>> value.  We can't pretend TP is 64bit.  load_tp_x32 and add_tp_x32
>>> must take SImode TP.

> Here is the revised patch.  The difference is I changed *add_tp_x32 to SImode.
> For
>
> ---
> extern __thread int __libc_errno __attribute__ ((tls_model ("initial-exec")));
>
> int *
> __errno_location (void)
> {
>  return &__libc_errno;
> }
> ---
>
> compiled with -mx32 -O2 -fPIC  DImode *add_tp_x32 generates:
>
>        movq    __libc_errno@gottpoff(%rip), %rax
>        addl    %fs:0, %eax
>        mov     %eax, %eax
>        ret
>
> SImode *add_tp_x32 generates:
>
>        movl    %fs:0, %eax
>        addl    __libc_errno@gottpoff(%rip), %eax
>        ret

This happens because combine can't combine DImode load and SImode plus
RTXes. These RTXes have to be in Pmode, see the intention in
legitimize_tls_address, also for TARGET_GNU2_TLS.

Can you please debug what goes wrong with tp_add_x32 in DImode?

Uros.


Re: PING: PATCH [8/n]: Prepare x32: PR other/48007: Unwind library doesn't work with UNITS_PER_WORD > sizeof (void *)

2011-07-28 Thread H.J. Lu
Hi Richard, Jason,

Is this patch

http://gcc.gnu.org/ml/gcc-patches/2011-06/msg02401.html

OK for trunk?

Thanks.


H.J.
On Mon, Jul 11, 2011 at 3:21 PM, H.J. Lu  wrote:
> Ping.
>
> On Wed, Jul 6, 2011 at 2:20 PM, H.J. Lu  wrote:
>> PING.
>>
>> On Thu, Jun 30, 2011 at 1:47 PM, H.J. Lu  wrote:
>>> On Thu, Jun 30, 2011 at 12:02 PM, Richard Henderson  wrote:
 On 06/30/2011 11:23 AM, H.J. Lu wrote:
> +#ifdef REG_VALUE_IN_UNWIND_CONTEXT
> +typedef _Unwind_Word _Unwind_Context_Reg_Val;
> +/* Signal frame context.  */
> +#define SIGNAL_FRAME_BIT ((_Unwind_Word) 1 >> 0)

 There's absolutely no reason to re-define this.
 So what if the value is most-significant-bit set?

 Nor do I see any reason not to continue setting E_C_B.
>>>
>>> Done.
>>>
> +#define _Unwind_IsExtendedContext(c) 1

 Why is this not still an inline function?
>>>
>>> It is defined before _Unwind_Context is declared.  I used
>>> macros so that there can be one less "#ifdef".
>>>
> +
> +static inline _Unwind_Word
> +_Unwind_Get_Unwind_Word (_Unwind_Context_Reg_Val val)
> +{
> +  return val;
> +}
> +
> +static inline _Unwind_Context_Reg_Val
> +_Unwind_Get_Unwind_Context_Reg_Val (_Unwind_Word val)
> +{
> +  return val;
> +}

 I cannot believe this actually works.  I see nowhere that
 you copy the by-address slot out of the stack frame and
 place it into the by-value slot in the unwind context.
>>>
>>> I changed the implantation based on the feedback from
>>> Jason.  Now I use the same reg field for both value and
>>> address.
>>>
>    /* This will segfault if the register hasn't been saved.  */
>    if (size == sizeof(_Unwind_Ptr))
> -    return * (_Unwind_Ptr *) ptr;
> +    return * (_Unwind_Ptr *) (_Unwind_Internal_Ptr) val;
>    else
>      {
>        gcc_assert (size == sizeof(_Unwind_Word));
> -      return * (_Unwind_Word *) ptr;
> +      return * (_Unwind_Word *) (_Unwind_Internal_Ptr) val;
>      }

 Indeed, this section is both wrong and belies the change
 you purport to make.

 You didn't even test this, did you?

>>>
>>> Here is the updated patch.  It works on simple tests.
>>> I am running full tests.  I kept config/i386/value-unwind.h
>>> since libgcc/md-unwind-support.h is included too late
>>> in unwind-dw2.c and I don't want to move it to be on
>>> the safe side.
>>>
>>> OK for trunk?
>>>
>>> Thanks.
>>>
>>> --
>>> H.J.
>>> ---
>>> gcc/
>>>
>>> 2011-06-30  H.J. Lu  
>>>
>>>        * config.gcc (libgcc_tm_file): Add i386/value-unwind.h for
>>>        Linux/x86.
>>>
>>>        * system.h (REG_VALUE_IN_UNWIND_CONTEXT): Poisoned.
>>>
>>>        * unwind-dw2.c (_Unwind_Context_Reg_Val): New.
>>>        (_Unwind_Get_Unwind_Word): Likewise.
>>>        (_Unwind_Get_Unwind_Context_Reg_Val): Likewise.
>>>        (_Unwind_Context): Use _Unwind_Context_Reg_Val on the reg field.
>>>        (_Unwind_IsExtendedContext): Defined as macro.
>>>        (_Unwind_GetGR): Updated.
>>>        (_Unwind_SetGR): Likewise.
>>>        (_Unwind_GetGRPtr): Likewise.
>>>        (_Unwind_SetGRPtr): Likewise.
>>>        (_Unwind_SetGRValue): Likewise.
>>>        (_Unwind_GRByValue): Likewise.
>>>        (__frame_state_for): Likewise.
>>>        (uw_install_context_1): Likewise.
>>>
>>>        * doc/tm.texi.in: Document REG_VALUE_IN_UNWIND_CONTEXT.
>>>        * doc/tm.texi: Regenerated.
>>>
>>> libgcc/
>>>
>>> 2011-06-30  H.J. Lu  
>>>
>>>        * config/i386/value-unwind.h: New.
>>>
>>
>>


Re: [patch] arm,rx: don't ICE on naked functions with local vars

2011-07-28 Thread Richard Henderson
On 07/28/2011 01:30 PM, DJ Delorie wrote:
> Fails the same way with a parameter as with a local:
> 
> int result;
> void __attribute__((naked)) ISRFunction(int x) {
>result = subFunction(&x);
> }
> 
> but I'm certainly open to a better explanation of how a user program
> can trigger an ICE that way.
> 
> Hmmm... the only use of targetm.calls.allocate_stack_slots_for_args()
> is in use_register_for_decl() - which is used by expand_decl() for
> automatic variables.
> 
> Maybe the hook should have been called allocate_stack_slots_for_decls() ?

Probably.  Thanks for the leg-work.  I'll approve the patch as-is.


r~


Re: [C++0x] contiguous bitfields race implementation

2011-07-28 Thread Aldy Hernandez



I believe that any after-the-fact attempt to recover bitfield boundaries is
going to fail unless you preserve more information during bitfield layout.

Consider

struct {
   char : 8;
   char : 0;
   char : 8;
};

where the : 0 isn't preserved in any way and you can't distinguish
it from struct { char : 8; char : 8; }.


Huh?  In my tests the :0 is preserved, it just doesn't have a DECL_NAME.

(gdb) p fld
$41 = (tree) 0x77778130
(gdb) pt
 I have tried the following scenario, and we calculate the beginning of 
the bit region correctly (bit 32).


struct bits
{
  char a;
  int b:7;
  int :0;   <-- bitregion start
  int c:9;  <-- bitregion start
  unsigned char d;
} *p;

void foo() { p -> c = 55; }

Am I misunderstanding?  Why do you suggest we need to preserve more 
information during bitfield layout?


FWIW, I should add a zero-length bit test.


Re: [v3] Library bits of c++/49813

2011-07-28 Thread Steve Ellcey
On Thu, 2011-07-28 at 22:46 +0200, Paolo Carlini wrote:
> Hi,
> 
> > I ran into this problem on the IA64 HP-UX bootstrap and tried your
> > patch to semantics.c.  It fixed the problem for me and I was able
> > to bootstrap.
> 
> Thanks Steve. I say let's commit it, I'm pretty sure it's by and large 
> correct and anyway it
> unbreaks the bootstrap for many people. Can you take care of that? I'm 
> traveling and can only
> do it in a few hours.
> 
> Thanks,
> Paolo

OK, I went ahead and checked it in.  Here is the ChangeLog I put in:



2011-07-28  Paolo Carlini  

PR c++/49813
* semantics.c (potential_constant_expression_1):  Handle FMA_EXPR.

Steve Ellcey
s...@cup.hp.com



Re: [PATCH] Optimize minloc (rank1array) and maxloc (rank1array) (PR fortran/31067)

2011-07-28 Thread Tobias Burnus

Jakub Jelinek wrote:

The minloc/maxloc intrinsics without DIM argument used on rank 1 arrays
aren't inline optimized, because unlike their DIM=1 counterparts they
return a rank 1 array with size 1.  Handling that during genericization
looked too hard to me, so instead this patch optimizes
minloc (a) for rank 1 a into (/ minloc (a, dim=1) /) during frontend 
optimizations.

Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?


OK - with one nit fixed: Please update the comment to reflect the 
current code. (No longer only assignments, uses array constructor.)



+/* Optimize a = minloc(b), where b is rank 1 array, into
+   a = minloc(b, dim=1), and similarly for maxloc,
+   as the latter forms are expanded inline.  */


Thanks for the patch!

Tobias


Re: [v3] Library bits of c++/49813

2011-07-28 Thread Paolo Carlini
Hi,

> I ran into this problem on the IA64 HP-UX bootstrap and tried your
> patch to semantics.c.  It fixed the problem for me and I was able
> to bootstrap.

Thanks Steve. I say let's commit it, I'm pretty sure it's by and large correct 
and anyway it unbreaks the bootstrap for many people. Can you take care of 
that? I'm traveling and can only do it in a few hours.

Thanks,
Paolo


[PATCH] Optimize minloc (rank1array) and maxloc (rank1array) (PR fortran/31067)

2011-07-28 Thread Jakub Jelinek
Hi!

The minloc/maxloc intrinsics without DIM argument used on rank 1 arrays
aren't inline optimized, because unlike their DIM=1 counterparts they
return a rank 1 array with size 1.  Handling that during genericization
looked too hard to me, so instead this patch optimizes
minloc (a)
for rank 1 a into (/ minloc (a, dim=1) /) during frontend optimizations.
Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?

2011-07-28  Jakub Jelinek  

PR fortran/31067
* frontend-passes.c (optimize_minmaxloc): New function.
(optimize_expr): Call it.

* gfortran.dg/maxloc_2.f90: New test.
* gfortran.dg/maxloc_3.f90: New test.
* gfortran.dg/minloc_1.f90: New test.
* gfortran.dg/minloc_2.f90: New test.
* gfortran.dg/minloc_3.f90: New test.
* gfortran.dg/minmaxloc_7.f90: New test.

--- gcc/fortran/frontend-passes.c.jj2011-06-17 11:02:01.0 +0200
+++ gcc/fortran/frontend-passes.c   2011-07-28 19:08:52.0 +0200
@@ -1,5 +1,5 @@
 /* Pass manager for Fortran front end.
-   Copyright (C) 2010 Free Software Foundation, Inc.
+   Copyright (C) 2010, 2011 Free Software Foundation, Inc.
Contributed by Thomas König.
 
 This file is part of GCC.
@@ -36,6 +36,7 @@ static bool optimize_op (gfc_expr *);
 static bool optimize_comparison (gfc_expr *, gfc_intrinsic_op);
 static bool optimize_trim (gfc_expr *);
 static bool optimize_lexical_comparison (gfc_expr *);
+static void optimize_minmaxloc (gfc_expr **);
 
 /* How deep we are inside an argument list.  */
 
@@ -129,6 +130,17 @@ optimize_expr (gfc_expr **e, int *walk_s
   if ((*e)->expr_type == EXPR_OP && optimize_op (*e))
 gfc_simplify_expr (*e, 0);
 
+  if ((*e)->expr_type == EXPR_FUNCTION && (*e)->value.function.isym)
+switch ((*e)->value.function.isym->id)
+  {
+  case GFC_ISYM_MINLOC:
+  case GFC_ISYM_MAXLOC:
+   optimize_minmaxloc (e);
+   break;
+  default:
+   break;
+  }
+
   if (function_expr)
 count_arglist --;
 
@@ -862,6 +875,49 @@ optimize_trim (gfc_expr *e)
   return true;
 }
 
+/* Optimize a = minloc(b), where b is rank 1 array, into
+   a = minloc(b, dim=1), and similarly for maxloc,
+   as the latter forms are expanded inline.  */
+
+static void
+optimize_minmaxloc (gfc_expr **e)
+{
+  gfc_expr *fn = *e;
+  gfc_actual_arglist *a;
+  char *name, *p;
+
+  if (fn->rank != 1
+  || fn->value.function.actual == NULL
+  || fn->value.function.actual->expr == NULL
+  || fn->value.function.actual->expr->rank != 1)
+return;
+
+  *e = gfc_get_array_expr (fn->ts.type, fn->ts.kind, &fn->where);
+  (*e)->shape = fn->shape;
+  fn->rank = 0;
+  fn->shape = NULL;
+  gfc_constructor_append_expr (&(*e)->value.constructor, fn, &fn->where);
+
+  name = XALLOCAVEC (char, strlen (fn->value.function.name) + 1);
+  strcpy (name, fn->value.function.name);
+  p = strstr (name, "loc0");
+  p[3] = '1';
+  fn->value.function.name = gfc_get_string (name);
+  if (fn->value.function.actual->next)
+{
+  a = fn->value.function.actual->next;
+  gcc_assert (a->expr == NULL);
+}
+  else
+{
+  a = gfc_get_actual_arglist ();
+  fn->value.function.actual->next = a;
+}
+  a->expr = gfc_get_constant_expr (BT_INTEGER, gfc_default_integer_kind,
+  &fn->where);
+  mpz_set_ui (a->expr->value.integer, 1);
+}
+
 #define WALK_SUBEXPR(NODE) \
   do   \
 {  \
--- gcc/testsuite/gfortran.dg/maxloc_2.f90.jj   2011-07-28 19:52:05.0 
+0200
+++ gcc/testsuite/gfortran.dg/maxloc_2.f90  2011-07-28 19:53:09.0 
+0200
@@ -0,0 +1,156 @@
+! { dg-do run }
+! { dg-add-options ieee }
+! { dg-skip-if "NaN not supported" { spu-*-* } { "*" } { "" } }
+  real :: a(3), nan, minf, pinf
+  real, allocatable :: c(:)
+  integer :: ia(1)
+  logical :: l
+  logical :: l2(3)
+
+  nan = 0.0
+  minf = 0.0
+  pinf = 0.0
+  nan = 0.0/nan
+  minf = -1.0/minf
+  pinf = 1.0/pinf
+
+  allocate (c(3))
+  a(:) = nan
+  ia = maxloc (a)
+  if (ia(1).ne.1) call abort
+  a(:) = minf
+  ia = maxloc (a)
+  if (ia(1).ne.1) call abort
+  a(1:2) = nan
+  ia = maxloc (a)
+  if (ia(1).ne.3) call abort
+  a(2) = 1.0
+  ia = maxloc (a)
+  if (ia(1).ne.2) call abort
+  a(2) = pinf
+  ia = maxloc (a)
+  if (ia(1).ne.2) call abort
+  c(:) = nan
+  ia = maxloc (c)
+  if (ia(1).ne.1) call abort
+  c(:) = minf
+  ia = maxloc (c)
+  if (ia(1).ne.1) call abort
+  c(1:2) = nan
+  ia = maxloc (c)
+  if (ia(1).ne.3) call abort
+  c(2) = 1.0
+  ia = maxloc (c)
+  if (ia(1).ne.2) call abort
+  c(2) = pinf
+  ia = maxloc (c)
+  if (ia(1).ne.2) call abort
+  l = .false.
+  l2(:) = .false.
+  a(:) = nan
+  ia = maxloc (a, mask = l)
+  if (ia(1).ne.0) call abort
+  ia = maxloc (a, mask = l2)
+  if (ia(1).ne.0) call abort
+  a(:) = minf
+  ia = maxloc (a, mask = l)
+  if (ia(1).ne.0) call abort
+  ia = maxloc (a, mask = l2)
+  if (ia(1).ne.0)

Re: [patch] arm,rx: don't ICE on naked functions with local vars

2011-07-28 Thread DJ Delorie

Fails the same way with a parameter as with a local:

int result;
void __attribute__((naked)) ISRFunction(int x) {
   result = subFunction(&x);
}

but I'm certainly open to a better explanation of how a user program
can trigger an ICE that way.

Hmmm... the only use of targetm.calls.allocate_stack_slots_for_args()
is in use_register_for_decl() - which is used by expand_decl() for
automatic variables.

Maybe the hook should have been called allocate_stack_slots_for_decls() ?


Re: [v3] Library bits of c++/49813

2011-07-28 Thread Pat Haugen

On 07/28/2011 04:43 AM, Paolo Carlini wrote:

/usr/local/gcc/gcc-20110728/Build/ia64-suse-linux/libstdc++-v3/include/cmath:
In function 'constexpr float std::fma(float, float, float)':
/usr/local/gcc/gcc-20110728/Build/ia64-suse-linux/libstdc++-v3/include/cmath:1288:43:
sorry, unimplemented: unexpected ast of kind fma_expr
/usr/local/gcc/gcc-20110728/Build/ia64-suse-linux/libstdc++-v3/include/cmath:1288:43:
internal compiler error: in potential_constant_expression_1, at
cp/semantics.c:8094

in the past we encountered already a few small problems of this kind, with cases
missing from the potential_constant_expression_1 switch. I believe something
quite close to what I'm attaching below should be enough, can you give it a try?

In any case, we definitely want Jason to have a look as soon as possible. If you
want to restore the ia64 bootstrap in the meanwhile, feel free to comment out
any troublesome constexpr specifier in that file (or replacing it with inline).

Thanks!
Paolo.

//



p


Index: semantics.c
===
--- semantics.c (revision 176846)
+++ semantics.c (working copy)
@@ -8057,6 +8057,13 @@ potential_constant_expression_1 (tree t, bool want
  return false;
return true;

+case FMA_EXPR:
+  for (i = 0; i<  3; ++i)
+   if (!potential_constant_expression_1 (TREE_OPERAND (t, i),
+ true, flags))
+ return false;
+  return true;
+
  case COND_EXPR:
  case VEC_COND_EXPR:
/* If the condition is a known constant, we know which of the legs we


I am seeing the same error on PowerPC and the above patch fixes it.

-Pat


Re: [v3] Library bits of c++/49813

2011-07-28 Thread Steve Ellcey
Paolo,

I ran into this problem on the IA64 HP-UX bootstrap and tried your
patch to semantics.c.  It fixed the problem for me and I was able
to bootstrap.

Steve Ellcey
s...@cup.hp.com


Re: [Patch,AVR]: Fix PR29560 (map 16-bit shift to 8-bit)

2011-07-28 Thread Richard Henderson
On 07/27/2011 10:00 AM, Georg-Johann Lay wrote:
> Richard Henderson wrote:
>>> +;; "*ashluqihiqi3.mem"
>>> +;; "*ashlsqihiqi3.mem"
>>> +(define_insn_and_split "*ashlqihiqi3.mem"
>>> +  [(set (match_operand:QI 0 "memory_operand" "=m")
>>> +(subreg:QI (ashift:HI (any_extend:HI (match_operand:QI 1 
>>> "register_operand" "r"))
>>> +  (match_operand:QI 2 "register_operand" "r"))
>>> +   0))]
>>> +  "!reload_completed"
>>> +  { gcc_unreachable(); }
>>
>> Surely this isn't necessary.  Why would you ever be matching a memory output?
>>
>>> +(define_insn_and_split "*ashlhiqi3"
>>> +  [(set (match_operand:QI 0 "nonimmediate_operand" "=r")
>>> +(subreg:QI (ashift:HI (match_operand:HI 1 "register_operand" "0")
>>> +  (match_operand:QI 2 "register_operand" "r")) 
>>> 0))]
>>> +  "!reload_completed"
>>> +  { gcc_unreachable(); }
>>
>> Likewise.
>>
>> But the first pattern and the peep2 look good.
>>
> 
> It's that what combine comes up with, and combine is not smart enough
> to find a split point between the mem and the subreg.  I don't know
> enough of combine, maybe it's because can_create_pseudo_p is false
> during combine, combine has no spare reg.  A combine-split won't
> help as it needs a pseudo/spare reg.

Hmm.  Perhaps.  Have you a test case for this?



r~


Re: PATCH: Fix config/i386/morestack.S for x32

2011-07-28 Thread Richard Henderson
On 07/28/2011 12:42 PM, H.J. Lu wrote:
> +#ifdef __LP64__
>   movq%rax,%fs:0x70   # Save the new stack boundary.
> +#else
> + movl%eax,%fs:0x40   # Save the new stack boundary.
> +#endif

Please macro-ize this.


r~


Re: PATCH: Use long long for 64bit int in config/i386/64/sfp-machine.h

2011-07-28 Thread Richard Henderson
On 07/28/2011 12:45 PM, H.J. Lu wrote:
>   * config/i386/64/sfp-machine.h (_FP_W_TYPE): Always use _WIN64
>   version.
>   (_FP_WS_TYPE): Likewise.
>   (_FP_I_TYPE): Likewise.

Ok.


r~


Re: PATCH: PR target/47766: [x32] -fstack-protector doesn't work

2011-07-28 Thread Uros Bizjak
On Thu, Jul 28, 2011 at 9:03 PM, H.J. Lu  wrote:

>>> This patch adds x32 support to UNSPEC_SP_XXX patterns.  OK for trunk?
>>
>> http://gcc.gnu.org/contribute.html#patches
>>
>
> Sorry. I should have mentioned testcase in:
>
> http://gcc.gnu.org/bugzilla/show_bug.cgi?id=47766
>
> Actually, they are in gcc testsuite.  I noticed them when
> I run gcc testsuite on x32.

This looks like a middle-end problem to me.

According to the documentation:

--quote--
`stack_protect_set'
 This pattern, if defined, moves a `Pmode' value from the memory in
 operand 1 to the memory in operand 0 without leaving the value in
 a register afterward.  This is to avoid leaking the value some
 place that an attacker might use to rewrite the stack guard slot
 after having clobbered it.

 If this pattern is not defined, then a plain move pattern is
 generated.

`stack_protect_test'
 This pattern, if defined, compares a `Pmode' value from the memory
 in operand 1 with the memory in operand 0 without leaving the
 value in a register afterward and branches to operand 2 if the
 values weren't equal.

 If this pattern is not defined, then a plain compare pattern and
 conditional branch pattern is used.
--quote--

According to the documentation, x86 patterns are correct. However,
middle-end fails to extend ptr_mode value to Pmode, and in function.c,
stack_protect_prologue/stack_protect_epilogue, we already have
ptr_mode (SImode) operand:

(mem/v/f/c/i:SI (plus:DI (reg/f:DI 54 virtual-stack-vars)
(const_int -4 [0xfffc])) [2 D.2704+0 S4 A32])

(mem/v/f/c/i:SI (symbol_ref:DI ("__stack_chk_guard") [flags 0x40]
) [2 __stack_chk_guard+0 S4
A32])

An opinion of a RTL maintainer (CC'd) is needed here. Target
definition is OK in its current form.

Uros.


Re: [trans-mem] Beginning of refactoring

2011-07-28 Thread Richard Henderson
On 07/28/2011 12:36 PM, Torvald Riegel wrote:
> No, this is correct because it calls the factory function in libitm_i.h.
> However, the classes in method-serial.cc were named differently than
> those factory functions, so I renamed them like this:
> 
> -class serial_dispatch : public abi_dispatch
> +class serialirr_dispatch : public abi_dispatch
> 
> -class serial_dispatch_ul : public abi_dispatch
> +class serial_dispatch : public abi_dispatch
> 
> This should avoid confusion in the future.

Excellent.

>> Don't we need to fini the old disp?  Seems there's a leak here, though
>> not visible until we re-instate the non-serial methods.
> 
> Yes, probably. However, one of the next steps on my refactoring list is
> to document and change the TM method lifecycle callbacks. This will
> include grouping several compatible methods (ie, those that can run
> together) into method sets (e.g., global lock, multiple locks).
> Switching a method within the current method set would then require no
> fini(), whereas switching the method set would require a more
> heavy-weight callback.
> 
> I have put the case you raised on my to-do list, and will revisit it
> when working on these lifecycle management changes.

Sounds good.


r~


Re: [patch] arm,rx: don't ICE on naked functions with local vars

2011-07-28 Thread Richard Henderson
On 07/28/2011 12:38 PM, DJ Delorie wrote:
> Naked is related to TARGET_ALLOCATE_STACK_SLOTS_FOR_ARGS - ARM and RX
> set the latter based on the former, and nobody else uses that target
> hook.  So, naked functions don't have stack slots for args.  Without
> stack slots, args can't be assigned to memory locations - even if
> they're TREE_ADDRESSABLE.

But your test case has a local variable, not an argument at all.
So I don't immediately see the connection.


r~


Re: PATCH: Add x32 support to libgomp

2011-07-28 Thread H.J. Lu
On Thu, Jul 28, 2011 at 12:45 PM, Jakub Jelinek  wrote:
> On Thu, Jul 28, 2011 at 12:32:26PM -0700, H.J. Lu wrote:
>> This patch fixes 2 issues in libgomp for x32:
>>
>> 1. x32 should use the same futex functions as x86-64.
>
> What kind of syscalls is -mx32 using on Linux?  I thought it was using the
> -m32 compat layer, isn't that the case?  If it works properly this way using
> a 64-bit syscall, including any needed sign/zero extensions of the arguments,
> fine.

x32 uses syscall instruction for system call with long == 32bit. For futex.
it calls the compat_sys_futex kernel function.  As far as kernel
is concerned,  x32 is the same as ia32 with x86-64 kernel calling convention.

>> 2. IA32 tests should check ia32 instead of ilp32.
>
> This is certainly fine.
>
>        Jakub
>

Thanks.


-- 
H.J.


Re: PATCH: Add x32 support to libgomp

2011-07-28 Thread Jakub Jelinek
On Thu, Jul 28, 2011 at 12:32:26PM -0700, H.J. Lu wrote:
> This patch fixes 2 issues in libgomp for x32:
> 
> 1. x32 should use the same futex functions as x86-64.

What kind of syscalls is -mx32 using on Linux?  I thought it was using the
-m32 compat layer, isn't that the case?  If it works properly this way using
a 64-bit syscall, including any needed sign/zero extensions of the arguments,
fine.

> 2. IA32 tests should check ia32 instead of ilp32.

This is certainly fine.

Jakub


PATCH: Use long long for 64bit int in config/i386/64/sfp-machine.h

2011-07-28 Thread H.J. Lu
Hi Ian,

For 64bit x86 targets, long is 32bit for x32 and win64.  But long long
is always 64bit.  This patch removes _WIN64 check.  OK for trunk?

Thanks.


H.J.
---
2010-07-28  H.J. Lu  

* config/i386/64/sfp-machine.h (_FP_W_TYPE): Always use _WIN64
version.
(_FP_WS_TYPE): Likewise.
(_FP_I_TYPE): Likewise.

diff --git a/libgcc/config/i386/64/sfp-machine.h 
b/libgcc/config/i386/64/sfp-machine.h
index 5adf6db..5debf5a 100644
--- a/libgcc/config/i386/64/sfp-machine.h
+++ b/libgcc/config/i386/64/sfp-machine.h
@@ -1,14 +1,8 @@
 #define _FP_W_TYPE_SIZE64
 
-#ifdef _WIN64
- #define _FP_W_TYPEunsigned long long
- #define _FP_WS_TYPE   signed long long
- #define _FP_I_TYPElong long
-#else
- #define _FP_W_TYPEunsigned long
- #define _FP_WS_TYPE   signed long
- #define _FP_I_TYPElong
-#endif
+#define _FP_W_TYPE unsigned long long
+#define _FP_WS_TYPEsigned long long
+#define _FP_I_TYPE long long
 
 typedef int TItype __attribute__ ((mode (TI)));
 typedef unsigned int UTItype __attribute__ ((mode (TI)));


PATCH: Fix config/i386/morestack.S for x32

2011-07-28 Thread H.J. Lu
Hi Ian,

x32 is similar to x86-64 with 32bit pointer size.  This patch adds x32
support to config/i386/morestack.S.  Tested on x32. OK for trunk?

Thanks.


H.J.
---
2011-07-28  H.J. Lu  

* config/i386/morestack.S: Properly save the x32 new stack
boundary.  Properly check __x86_64__ and __LP64__.

diff --git a/libgcc/config/i386/morestack.S b/libgcc/config/i386/morestack.S
index 16279c7..a745230 100644
--- a/libgcc/config/i386/morestack.S
+++ b/libgcc/config/i386/morestack.S
@@ -353,7 +353,11 @@ __morestack:
# FIXME: The offset must match
# TARGET_THREAD_SPLIT_STACK_OFFSET in
# gcc/config/i386/linux64.h.
+#ifdef __LP64__
movq%rax,%fs:0x70   # Save the new stack boundary.
+#else
+   movl%eax,%fs:0x40   # Save the new stack boundary.
+#endif
 
call__morestack_unblock_signals
 
@@ -391,7 +395,11 @@ __morestack:
subq0(%rsp),%rax# Subtract available space.
addq$BACKOFF,%rax   # Back off 1024 bytes.
 .LEHE0:
+#ifdef __LP64__
movq%rax,%fs:0x70   # Save the new stack boundary.
+#else
+   movl%eax,%fs:0x40   # Save the new stack boundary.
+#endif
 
addq$16,%rsp# Remove values from stack.
 
@@ -433,7 +441,11 @@ __morestack:
movq%rbp,%rcx   # Get the stack pointer.
subq%rax,%rcx   # Subtract available space.
addq$BACKOFF,%rcx   # Back off 1024 bytes.
+#ifdef __LP64__
movq%rcx,%fs:0x70   # Save new stack boundary.
+#else
+   movl%ecx,%fs:0x40   # Save new stack boundary.
+#endif
movq(%rsp),%rdi # Restore exception data for call.
 #ifdef __PIC__
call_Unwind_Resume@PLT  # Resume unwinding.
@@ -493,7 +505,7 @@ __x86.get_pc_thunk.bx:
.section 
.data.DW.ref.__gcc_personality_v0,"awG",@progbits,DW.ref.__gcc_personality_v0,comdat
.type   DW.ref.__gcc_personality_v0, @object
 DW.ref.__gcc_personality_v0:
-#ifndef __x86_64
+#ifndef __LP64__
.align 4
.size   DW.ref.__gcc_personality_v0, 4
.long   __gcc_personality_v0
@@ -504,7 +516,7 @@ DW.ref.__gcc_personality_v0:
 #endif
 #endif
 
-#ifdef __x86_64__
+#if defined __x86_64__ && defined __LP64__
 
 # This entry point is used for the large model.  With this entry point
 # the upper 32 bits of %r10 hold the argument size and the lower 32
@@ -537,7 +549,7 @@ __morestack_large_model:
.size   __morestack_large_model, . - __morestack_large_model
 #endif
 
-#endif /* __x86_64__ */
+#endif /* __x86_64__ && __LP64__ */
 
 # Initialize the stack test value when the program starts or when a
 # new thread starts.  We don't know how large the main stack is, so we
@@ -570,7 +582,11 @@ __stack_split_initialize:
 #else /* defined(__x86_64__) */
 
leaq-16000(%rsp),%rax   # We should have at least 16K.
+#ifdef __LP64__
movq%rax,%fs:0x70
+#else
+   movl%eax,%fs:0x40
+#endif
movq%rsp,%rdi
movq$16000,%rsi
 #ifdef __PIC__
@@ -592,7 +608,7 @@ __stack_split_initialize:
 
.section.ctors.65535,"aw",@progbits
 
-#ifndef __x86_64__
+#ifndef __LP64__
.align  4
.long   __stack_split_initialize
.long   __morestack_load_mmap


Re: [C++0x] contiguous bitfields race implementation

2011-07-28 Thread Richard Guenther
On Thu, Jul 28, 2011 at 9:12 PM, Aldy Hernandez  wrote:
>
>> Yes.  Together with the above it looks then optimal.
>
> Attached patch tested on x86-64 Linux.
>
> OK for mainline?

Ok with the || moved to the next line as per coding-standards.

Thanks,
Richard.


Re: Remove unused line_maps field last_listed (issue4810058)

2011-07-28 Thread Tom Tromey
> "Gabriel" == Gabriel Charette  writes:

Gabriel> 2011-07-28  Gabriel Charette  
Gabriel>* libcpp/include/line-map.h (struct line_maps):
Gabriel>   Remove unused field last_listed.

Ok.

Tom


Re: [patch] arm,rx: don't ICE on naked functions with local vars

2011-07-28 Thread DJ Delorie

Naked is related to TARGET_ALLOCATE_STACK_SLOTS_FOR_ARGS - ARM and RX
set the latter based on the former, and nobody else uses that target
hook.  So, naked functions don't have stack slots for args.  Without
stack slots, args can't be assigned to memory locations - even if
they're TREE_ADDRESSABLE.


Re: Remove unused line_maps field last_listed (issue4810058)

2011-07-28 Thread Dodji Seketeli
Hello Gabriel,

gch...@google.com (Gabriel Charette) a écrit:

> 2011-07-28  Gabriel Charette  
>
>   * libcpp/include/line-map.h (struct line_maps):
>   Remove unused field last_listed.
>

I cannot approve or reject this patch, but FWIW, it looks good (and
obvious) to me.  I am CC-ing Tom.

Thanks.

> diff --git a/libcpp/include/line-map.h b/libcpp/include/line-map.h
> index 3234423..f1d5bee 100644
> --- a/libcpp/include/line-map.h
> +++ b/libcpp/include/line-map.h
> @@ -76,11 +76,6 @@ struct GTY(()) line_maps {
>  
>unsigned int cache;
>  
> -  /* The most recently listed include stack, if any, starts with
> - LAST_LISTED as the topmost including file.  -1 indicates nothing
> - has been listed yet.  */
> -  int last_listed;
> -
>/* Depth of the include stack, including the current file.  */
>unsigned int depth;
>  
> diff --git a/libcpp/line-map.c b/libcpp/line-map.c
> index 86e2484..dd3f11c 100644
> --- a/libcpp/line-map.c
> +++ b/libcpp/line-map.c
> @@ -34,7 +34,6 @@ linemap_init (struct line_maps *set)
>set->maps = NULL;
>set->allocated = 0;
>set->used = 0;
> -  set->last_listed = -1;
>set->trace_includes = false;
>set->depth = 0;
>set->cache = 0;
>
> --
> This patch is available for review at http://codereview.appspot.com/4810058

-- 
Dodji


Re: [trans-mem] Beginning of refactoring

2011-07-28 Thread Torvald Riegel
On Thu, 2011-07-28 at 09:02 -0700, Richard Henderson wrote:
> > Add information to dispatch about closed nesting and uninstrumented 
> > code.
> > 
> > * dispatch.h (GTM::abi_dispatch): Add 
> > can_run_uninstrumented_code and
> > closed_nesting flags, as well as a closed nesting alternative.
> > * method-serial.cc: Same.
> 
> Nearly...
> 
> > +  virtual abi_dispatch* closed_nesting_alternative()
> > +  {
> > +// For nested transactions with an instrumented code path, we can do
> > +// undo logging.
> > +return GTM::dispatch_serial();
> 
> Surely you really mean dispatch_serial_ul here?
> Otherwise ok.

No, this is correct because it calls the factory function in libitm_i.h.
However, the classes in method-serial.cc were named differently than
those factory functions, so I renamed them like this:

-class serial_dispatch : public abi_dispatch
+class serialirr_dispatch : public abi_dispatch

-class serial_dispatch_ul : public abi_dispatch
+class serial_dispatch : public abi_dispatch

This should avoid confusion in the future.


> > Add closed nesting as restart reason.
> > 
> > * libitm_i.h: Add closed nesting as restart reason.
> > * retry.cc (GTM::gtm_transaction::decide_retry_strategy): Same.
> 
> Ok, except
> 
> > +  if (r == RESTART_CLOSED_NESTING) retry_serial = true;
> 
> Coding style.  THEN statement on the next line, even for small THEN.

Will try to keep that in mind ...

> > Make flat nesting the default, use closed nesting on demand.
> > 
> > * local.cc (gtm_transaction::rollback_local): Support closed 
> > nesting.
> > * eh_cpp.cc (GTM::gtm_transaction::revert_cpp_exceptions): Same.
> > * dispatch.h: Same.
> > * method-serial.cc: Same.
> > * beginend.cc (GTM::gtm_transaction::begin_transaction): Change 
> > to
> > flat nesting as default, and closed nesting on demand.
> > (GTM::gtm_transaction::rollback): Same.
> > (_ITM_abortTransaction): Same.
> > (GTM::gtm_transaction::restart): Same.
> > (GTM::gtm_transaction::trycommit): Same.
> > (GTM::gtm_transaction::trycommit_and_finalize): Removed.
> > (choose_code_path): New.
> > (GTM::gtm_transaction_cp::save): New.
> > (GTM::gtm_transaction_cp::commit): New.
> > * query.cc (_ITM_inTransaction): Support flat nesting.
> > * libitm_i.h (GTM::gtm_transaction_cp): New helper struct for 
> > nesting.
> > (GTM::gtm_transaction): Support flat and closed nesting.
> > * alloc.cc (commit_allocations_2): New.
> > (commit_cb_data): New helper struct.
> > (GTM::gtm_transaction::commit_allocations): Handle nested
> > commits/rollbacks.
> > * libitm.texi: Update user action section, add description of 
> > nesting.
> 
> Nearly...
> 
> > +  abi_dispatch *cn_disp = disp->closed_nesting_alternative();
> > +  if (cn_disp)
> > +{
> > +  disp = cn_disp;
> > +  set_abi_disp(disp);
> > +}
> 
> Don't we need to fini the old disp?  Seems there's a leak here, though
> not visible until we re-instate the non-serial methods.

Yes, probably. However, one of the next steps on my refactoring list is
to document and change the TM method lifecycle callbacks. This will
include grouping several compatible methods (ie, those that can run
together) into method sets (e.g., global lock, multiple locks).
Switching a method within the current method set would then require no
fini(), whereas switching the method set would require a more
heavy-weight callback.

I have put the case you raised on my to-do list, and will revisit it
when working on these lifecycle management changes.

> 
> > +  if (!(tx->state & STATE_IRREVOCABLE)) ret |= a_saveLiveVariables;

Fixed.

Committing to branch, together with the two more recent changes.


Torvald



Re: [C++0x] contiguous bitfields race implementation

2011-07-28 Thread Aldy Hernandez



   if (TREE_CODE (to) == COMPONENT_REF
   &&  DECL_BIT_FIELD_TYPE (TREE_OPERAND (to, 1)))
 get_bit_range (&bitregion_start,&bitregion_end,
to, tem, bitpos, bitsize);

and shouldn't this test DECL_BIT_FIELD instead of DECL_BIT_FIELD_TYPE?


As I mentioned here:

http://gcc.gnu.org/ml/gcc-patches/2011-05/msg01416.html

I am using DECL_BIT_FIELD_TYPE instead of DECL_BIT_FIELD to determine if 
a DECL is a bit field because DECL_BIT_FIELD is not set for bit fields 
with mode sized number of bits (32-bits, 16-bits, etc).


PATCH: Add x32 support to libgomp

2011-07-28 Thread H.J. Lu
Hi,

This patch fixes 2 issues in libgomp for x32:

1. x32 should use the same futex functions as x86-64.
2. IA32 tests should check ia32 instead of ilp32.

OK for trunk?

Thanks.


H.J.
---
2011-07-28  H.J. Lu  

* config/linux/x86/futex.h: Check __x86_64__ instead of
__LP64__.

* testsuite/lib/libgomp.exp (libgomp_init): Add -march=i486
for ia32 instead of ilp32.

* testsuite/libgomp.c/atomic-1.c: Require ia32 instead of ilp32.
* testsuite/libgomp.c/atomic-6.c: Likewise.

diff --git a/libgomp/config/linux/x86/futex.h b/libgomp/config/linux/x86/futex.h
index cb7461d..419f4d9 100644
--- a/libgomp/config/linux/x86/futex.h
+++ b/libgomp/config/linux/x86/futex.h
@@ -24,7 +24,7 @@
 
 /* Provide target-specific access to the futex system call.  */
 
-#ifdef __LP64__
+#ifdef __x86_64__
 # ifndef SYS_futex
 #  define SYS_futex202
 # endif
@@ -138,7 +138,7 @@ futex_wake (int *addr, int count)
 }
 }
 
-#endif /* __LP64__ */
+#endif /* __x86_64__ */
 
 static inline void
 cpu_relax (void)
diff --git a/libgomp/testsuite/lib/libgomp.exp 
b/libgomp/testsuite/lib/libgomp.exp
index 976543d..a75e22f 100644
--- a/libgomp/testsuite/lib/libgomp.exp
+++ b/libgomp/testsuite/lib/libgomp.exp
@@ -140,7 +140,7 @@ proc libgomp_init { args } {
 
 # We use atomic operations in the testcases to validate results.
 if { ([istarget i?86-*-*] || [istarget x86_64-*-*])
-&& [check_effective_target_ilp32] } {
+&& [check_effective_target_ia32] } {
lappend ALWAYS_CFLAGS "additional_flags=-march=i486"
 }
 
diff --git a/libgomp/testsuite/libgomp.c/atomic-1.c 
b/libgomp/testsuite/libgomp.c/atomic-1.c
index b2be8f0..4725b7d 100644
--- a/libgomp/testsuite/libgomp.c/atomic-1.c
+++ b/libgomp/testsuite/libgomp.c/atomic-1.c
@@ -1,5 +1,5 @@
 /* { dg-do run } */
-/* { dg-options "-O2 -march=pentium" { target { { i?86-*-* x86_64-*-* } && 
ilp32 } } } */
+/* { dg-options "-O2 -march=pentium" { target { { i?86-*-* x86_64-*-* } && 
ia32 } } } */
 
 #ifdef __i386__
 #include "cpuid.h"
diff --git a/libgomp/testsuite/libgomp.c/atomic-6.c 
b/libgomp/testsuite/libgomp.c/atomic-6.c
index 59baf7d..f8ab75e 100644
--- a/libgomp/testsuite/libgomp.c/atomic-6.c
+++ b/libgomp/testsuite/libgomp.c/atomic-6.c
@@ -1,7 +1,7 @@
 /* PR middle-end/36106 */
 /* { dg-options "-O2" } */
 /* { dg-options "-O2 -mieee" { target alpha*-*-* } } */
-/* { dg-options "-O2 -march=i586" { target { { i?86-*-* x86_64-*-* } && ilp32 
} } } */
+/* { dg-options "-O2 -march=i586" { target { { i?86-*-* x86_64-*-* } && ia32 } 
} } */
 
 #ifdef __i386__
 # include "cpuid.h"


Re: [PATCH PR43513, 1/3] Replace vla with array - Implementation.

2011-07-28 Thread Richard Guenther
On Thu, Jul 28, 2011 at 7:20 PM, Tom de Vries  wrote:
> On 07/28/2011 06:25 PM, Richard Guenther wrote:
>> On Thu, 28 Jul 2011, Tom de Vries wrote:
>>
>>> On 07/28/2011 12:22 PM, Richard Guenther wrote:
 On Wed, 27 Jul 2011, Tom de Vries wrote:

> On 07/27/2011 05:27 PM, Richard Guenther wrote:
>> On Wed, 27 Jul 2011, Tom de Vries wrote:
>>
>>> On 07/27/2011 02:12 PM, Richard Guenther wrote:
 On Wed, 27 Jul 2011, Tom de Vries wrote:

> On 07/27/2011 01:50 PM, Tom de Vries wrote:
>> Hi Richard,
>>
>> I have a patch set for bug 43513 - The stack pointer is adjusted 
>> twice.
>>
>> 01_pr43513.3.patch
>> 02_pr43513.3.test.patch
>> 03_pr43513.3.mudflap.patch
>>
>> The patch set has been bootstrapped and reg-tested on x86_64.
>>
>> I will sent out the patches individually.
>>
>
> The patch replaces a vla __builtin_alloca that has a constant 
> argument with an
> array declaration.
>
> OK for trunk?

 I don't think it is safe to try to get at the VLA type the way you do.
>>>
>>> I don't understand in what way it's not safe. Do you mean I don't 
>>> manage to find
>>> the type always, or that I find the wrong type, or something else?
>>
>> I think you might get the wrong type,
>
> Ok, I'll review that code one more time.
>
>> you also do not transform code
>> like
>>
>>   int *p = alloca(4);
>>   *p = 3;
>>
>> as there is no array type involved here.
>>
>
> I was trying to stay away from non-vla allocas.  A source declared alloca 
> has
> function livetime, so we could have a single alloca in a loop, called 10 
> times,
> with all 10 instances live at the same time. This patch does not detect 
> such
> cases, and thus stays away from non-vla allocas. A vla decl does not have 
> such
> problems, the lifetime ends when it goes out of scope.

 Yes indeed - that probably would require more detailed analysis.

 In fact I would simply do sth like

   elem_type = build_nonstandard_integer_type (BITS_PER_UNIT, 1);
   n_elem = size * 8 / BITS_PER_UNIT;
   array_type = build_array_type_nelts (elem_type, n_elem);
   var = create_tmp_var (array_type, NULL);
   return fold_convert (TREE_TYPE (lhs), build_fold_addr_expr (var));

>>>
>>> I tried this code on the example, and it works, but the newly declared 
>>> type has
>>> an 8-bit alignment, while the vla base type has a 32 bit alignment.  
>>> This make
>>> the memory access in the example potentially unaligned, which prohibits 
>>> an
>>> ivopts optimization, so the resulting text size is 68 instead of the 64 
>>> achieved
>>> with my current patch.
>>
>> Ok, so then set DECL_ALIGN of the variable to something reasonable
>> like MIN (size * 8, GET_MODE_PRECISION (word_mode)).  Basically the
>> alignment that the targets alloca function would guarantee.
>>
>
> I tried that, but that doesn't help. It's the alignment of the type that
> matters, not of the decl.

 It shouldn't.  All accesses are performed with the original types and
 alignment comes from that (plus the underlying decl).

>>>
>>> I managed to get it all working by using build_aligned_type rather that 
>>> DECL_ALIGN.
>>
>> That's really odd, DECL_ALIGN should just work - nothing refers to the
>> type of the decl in the IL.  Can you try also setting DECL_USER_ALIGN to
>> 1 maybe?
>>
>
> This doesn't work either.
>
>  /* Declare array.  */
>  elem_type = build_nonstandard_integer_type (BITS_PER_UNIT, 1);
>  n_elem = size * 8 / BITS_PER_UNIT;
>  align = MIN (size * 8, GET_MODE_PRECISION (word_mode));
>  array_type = build_array_type_nelts (elem_type, n_elem);
>  var = create_tmp_var (array_type, NULL);
>  DECL_ALIGN (var) = align;
>  DECL_USER_ALIGN (var) = 1;
>
> Maybe this clarifies it:
>
> Breakpoint 1, may_be_unaligned_p (ref=0xf7d9d410, step=0xf7d3d578) at
> /home/vries/local/google/src/gcc-mainline/gcc/tree-ssa-loop-ivopts.c:1621
> (gdb) call debug_generic_expr (ref)
> MEM[(int[0:D.2579] *)&D.2595][0]
> (gdb) call debug_generic_expr (step)
> 4
>
> 1627      base = get_inner_reference (ref, &bitsize, &bitpos, &toffset, &mode,
> (gdb) call debug_generic_expr (base)
> D.2595
>
> 1629      base_type = TREE_TYPE (base);
> (gdb) call debug_generic_expr (base_type)
> [40]
>
> 1630      base_align = TYPE_ALIGN (base_type);
> (gdb) p base_align
> $1 = 8
>
> So the align is 8-bits, and we return true here:

Ah, but this code should use get_object_alignment, not solely look
at the type.

Richard.

> (gdb) n
> 1632      if (mode != BLKmode)
> (gdb) n
> 1634          unsigned mode_align = GET_MODE_ALIGNMENT (mode);
> (gdb)
> 1636          if 

Re: [PATCH, PR 49886] Prevent fnsplit from changing signature when there are type attributes

2011-07-28 Thread Richard Guenther
On Thu, Jul 28, 2011 at 6:52 PM, Martin Jambor  wrote:
> Hi,
>
> pass_split_functions is happy to split functions which have type
> attributes but cannot update them if the new clone has in any way
> different parameters than the original.  This can lead to
> miscompilations in cases like the testcase.
>
> This patch solves it by 1) making the inliner set the
> can_change_signature flag to false for them because their signature
> cannot be changed (this step is also necessary to make IPA-CP operate
> on them and handle them correctly), and 2) make the splitting pass
> keep all parameters if the flag is set.  The second step might involve
> inventing some default definitions if the parameters did not really
> have any.
>
> I spoke about this with Honza and he claimed that the new function is
> really an entirely different thing and that the parameters may
> correspond only very loosely and thus the type attributes should be
> cleared.  I'm not sure I agree, but in any case I needed this to work

For sure some attributes may be worth to preserve/change for optimziation
purposes.  Now, if a function is clonable then clearing all attributes from
the clone should be ok - there may be attributes that prevent cloning though
(or rather, need to be preserved).

Richard.

> to allow me continue with promised IPA-CP polishing and so I decided
> to do this because it was easier.  (My own opinion is that the current
> representation of parameter-describing function type attributes is
> evil and will cause harm no matter hat we do.)
>
> A very similar patch has passed bootstrap and testsuite on
> x86_64-linux, the current one is undergoing both right now.  OK for
> trunk if it passes?
>
> Thanks,
>
> Martin
>
>
>
> 2011-07-28  Martin Jambor  
>
>        PR middle-end/49886
>        * ipa-inline-analysis.c (compute_inline_parameters): Set
>        can_change_signature of noes with typde attributes.
>        * ipa-split.c (split_function): Do not skip any arguments if
>        can_change_signature is set.
>
>        * testsuite/gcc.c-torture/execute/pr49886.c: New testcase.
>
> Index: src/gcc/ipa-inline-analysis.c
> ===
> --- src.orig/gcc/ipa-inline-analysis.c
> +++ src/gcc/ipa-inline-analysis.c
> @@ -1658,18 +1658,24 @@ compute_inline_parameters (struct cgraph
>   /* Can this function be inlined at all?  */
>   info->inlinable = tree_inlinable_function_p (node->decl);
>
> -  /* Inlinable functions always can change signature.  */
> -  if (info->inlinable)
> -    node->local.can_change_signature = true;
> +  /* Type attributes can use parameter indices to describe them.  */
> +  if (TYPE_ATTRIBUTES (TREE_TYPE (node->decl)))
> +    node->local.can_change_signature = false;
>   else
>     {
> -      /* Functions calling builtin_apply can not change signature.  */
> -      for (e = node->callees; e; e = e->next_callee)
> -       if (DECL_BUILT_IN (e->callee->decl)
> -           && DECL_BUILT_IN_CLASS (e->callee->decl) == BUILT_IN_NORMAL
> -           && DECL_FUNCTION_CODE (e->callee->decl) == BUILT_IN_APPLY_ARGS)
> -         break;
> -      node->local.can_change_signature = !e;
> +      /* Otherwise, inlinable functions always can change signature.  */
> +      if (info->inlinable)
> +       node->local.can_change_signature = true;
> +      else
> +       {
> +         /* Functions calling builtin_apply can not change signature.  */
> +         for (e = node->callees; e; e = e->next_callee)
> +           if (DECL_BUILT_IN (e->callee->decl)
> +               && DECL_BUILT_IN_CLASS (e->callee->decl) == BUILT_IN_NORMAL
> +               && DECL_FUNCTION_CODE (e->callee->decl) == 
> BUILT_IN_APPLY_ARGS)
> +             break;
> +         node->local.can_change_signature = !e;
> +       }
>     }
>   estimate_function_body_sizes (node, early);
>
> Index: src/gcc/ipa-split.c
> ===
> --- src.orig/gcc/ipa-split.c
> +++ src/gcc/ipa-split.c
> @@ -945,10 +945,10 @@ static void
>  split_function (struct split_point *split_point)
>  {
>   VEC (tree, heap) *args_to_pass = NULL;
> -  bitmap args_to_skip = BITMAP_ALLOC (NULL);
> +  bitmap args_to_skip;
>   tree parm;
>   int num = 0;
> -  struct cgraph_node *node;
> +  struct cgraph_node *node, *cur_node = cgraph_get_node 
> (current_function_decl);
>   basic_block return_bb = find_return_bb ();
>   basic_block call_bb;
>   gimple_stmt_iterator gsi;
> @@ -968,17 +968,30 @@ split_function (struct split_point *spli
>       dump_split_point (dump_file, split_point);
>     }
>
> +  if (cur_node->local.can_change_signature)
> +    args_to_skip = BITMAP_ALLOC (NULL);
> +  else
> +    args_to_skip = NULL;
> +
>   /* Collect the parameters of new function and args_to_skip bitmap.  */
>   for (parm = DECL_ARGUMENTS (current_function_decl);
>        parm; parm = DECL_CHAIN (parm), num++)
> -    if (!is_gimple_reg (parm)
> -       || !gimple_default_def (cfun, parm)
> -     

Re: [patch] arm,rx: don't ICE on naked functions with local vars

2011-07-28 Thread Richard Henderson
On 07/26/2011 12:52 PM, DJ Delorie wrote:
> This patch tests for at least one user-caused reason for this
> assertion failing - requiring a local frame in a naked function.  For
> this case at least, it would be better to trigger an error than to
> ICE.  OK?
> 
> static int bar;
> void __attribute__((naked)) function(void) {
>int foo, result;
>result = subFunction(&foo, &bar);   // ICE here
> }
> 
>   * expr.c (expand_expr_addr_expr_1): Detect a user request for
>   a local frame in a naked function, and produce a suitable
>   error for that specific case.

Err... why would naked affect anything at all this early in the
compilation path?

Without an answer to that, this seems like an odd place to patch.



r~


Re: [PATCH] Improve call site argument debug info for floating point stack arguments (PR debug/49846)

2011-07-28 Thread Richard Henderson
On 07/26/2011 01:19 PM, Jakub Jelinek wrote:
>   PR debug/49846
>   * var-tracking.c (prepare_call_arguments): For non-MODE_INT stack
>   arguments also check if they aren't initialized with a MODE_INT
>   mode of the same size.

Ok.


r~


Re: [PATCH] [google] [annotalysis] Fix remove operation from pointer_set in case of hash collisions

2011-07-28 Thread Richard Henderson
On 07/26/2011 04:13 PM, Delesley Hutchins wrote:
> This patch fixes a bug in pointer_set.c, where removing a pointer from
> a pointer set would corrupt the hash table if the pointer was involved
> in any hash collisions.
> 
> Bootstrapped and passed gcc regression testsuite on x86_64-unknown-linux-gnu.
> 
> Okay for google/gcc-4_6?
> 
>   -DeLesley
> 
> gcc/Changelog.annotalysis:
> 2011-7-26  DeLesley Hutchins  
> 
> * gcc/pointer-set.c (pointer_set_delete)  bugfix for case of
> hash collisions

The logic of the patch looks good.  I certainly agree it's a real bug.

> +  /* find location of p */

But please fix up all the comments to be properly punctuated sentences.

> +  pset->slots[n] = 0;  /* remove ptr from set. */

And avoid end-of-line comments.  Put it on the previous line.


r~


Re: [RS6000] asynch exceptions and unwind info

2011-07-28 Thread Richard Henderson
On 07/28/2011 12:02 PM, David Edelsohn wrote:
> The other problem is hoisting the store into the prologue is not
> always profitable for performance.  It should be better once shrink
> wrapping is implemented.  Currently the PPC ABI may perform a lot of
> stores in the prologue if the function *may* make a call.  R2 adds yet
> another store to the common path.

Well, even if we're not able to hoist the R2 store, we may be able
to simply add REG_CFA_OFFSET and REG_CFA_RESTORE notes to the insns
in the stream.


r~


Re: PATCH: PR target/47766: [x32] -fstack-protector doesn't work

2011-07-28 Thread H.J. Lu
On Thu, Jul 28, 2011 at 11:55 AM, Uros Bizjak  wrote:
> On Thu, Jul 28, 2011 at 8:13 PM, H.J. Lu  wrote:
>
>> This patch adds x32 support to UNSPEC_SP_XXX patterns.  OK for trunk?
>
> http://gcc.gnu.org/contribute.html#patches
>

Sorry. I should have mentioned testcase in:

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=47766

Actually, they are in gcc testsuite.  I noticed them when
I run gcc testsuite on x32.

-- 
H.J.


Re: [RS6000] asynch exceptions and unwind info

2011-07-28 Thread David Edelsohn
On Thu, Jul 28, 2011 at 2:49 PM, Richard Henderson  wrote:

> The whole problem is that toc pointer copy in 40(1) is only valid
> during indirect call sequences, and iff ld inserted a stub?  I.e.
> direct calls between functions that share toc pointers never save
> the copy?
>
> Would it make sense, if a function has any indirect call, to move
> the toc pointer save into the prologue?  You'd get to avoid that
> store all the time.  Of course you'd not be able to sink the load
> after the call, but it might still be a win.  And in that special
> case you can annotate the r2 save slot just once, correctly.

Michael Meissner recently did move R2 save into the prologue, under
certain circumstances.  See TARGET_SAVE_TOC_INDIRECT.  Limitations
include alloca (unless one re-copies the R2.  Mike also encountered
some problems with EH, which may be related to this discussion.

The other problem is hoisting the store into the prologue is not
always profitable for performance.  It should be better once shrink
wrapping is implemented.  Currently the PPC ABI may perform a lot of
stores in the prologue if the function *may* make a call.  R2 adds yet
another store to the common path.

- David


Re: PATCH: PR middle-end/49721: convert_memory_address_addr_space may generate invalid new insns

2011-07-28 Thread H.J. Lu
On Thu, Jul 28, 2011 at 11:49 AM, Uros Bizjak  wrote:
> On Thu, Jul 28, 2011 at 8:32 PM, H.J. Lu  wrote:
>
 >  convert_memory_address_addr_space has a special PLUS/MULT case for
 >  POINTERS_EXTEND_UNSIGNED<  0. ?It turns out that it is also needed
 >  for all Pmode != ptr_mode cases. ?OK for trunk?
 >  2011-06-11 ?H.J. Lu ?
 >
 >  ? ? ? ?PR middle-end/47727
 >  ? ? ? ?* explow.c (convert_memory_address_addr_space): Permute the
 >  ? ? ? ?conversion and addition if one operand is a constant.

 Do we still need this patch? With recent target changes the testcase
 from PR can be compiled without problems with a gcc from an unpatched
 trunk.
>>>
>>> Given the communication difficulties, I hope not...
>>>
>>> Paolo
>>>
>>
>> Here is the updated patch.  OK for trunk?
>
> Did you see the question two levels up the thread you are replying to?
>

 The patch is for

 http://gcc.gnu.org/bugzilla/show_bug.cgi?id=49721

 I changed the thread subject.
>>>
>>> Please add testcase to see the patch in action.
>>>
>>
>> I haven't found a testcase yet.  The problem was discovered in
>> this thread:
>>
>> http://gcc.gnu.org/ml/gcc-patches/2011-07/msg01065.html
>
> This was before x32 could handle SImode addresses. With recent x86
> target work, this is no more true, and SImode and DImode addresses are
> first-class citizens as far as x32 backend is concerned. Please note
> that original testcase (that this whole patch is all about) now
> compiles without problems. Also, middle end is shared with at least
> two ptr_mode != Pmode targets, and they all work well. So, to see what
> makes x32 special, we need a testcase that breaks _WITHOUT_ your
> proposed patch. Without testcase, nobody can analyze your approach and
> tell if the approach is the right one, if this is in fact target
> problem, or indeed a middle-end problem.
>

There are 2 issues

1. rtl_hooks.gen_lowpart_no_emit vs gen_lowpart. simplify-rtx.c shouldn't
generate any new insns.
2. convert_memory_address_addr_space shouldn't permute conversion and
addition.


-- 
H.J.


Re: PATCH: PR target/47766: [x32] -fstack-protector doesn't work

2011-07-28 Thread Uros Bizjak
On Thu, Jul 28, 2011 at 8:13 PM, H.J. Lu  wrote:

> This patch adds x32 support to UNSPEC_SP_XXX patterns.  OK for trunk?

http://gcc.gnu.org/contribute.html#patches

Uros.


Re: PATCH: PR middle-end/49721: convert_memory_address_addr_space may generate invalid new insns

2011-07-28 Thread Uros Bizjak
On Thu, Jul 28, 2011 at 8:32 PM, H.J. Lu  wrote:

>>> >  convert_memory_address_addr_space has a special PLUS/MULT case for
>>> >  POINTERS_EXTEND_UNSIGNED<  0. ?It turns out that it is also needed
>>> >  for all Pmode != ptr_mode cases. ?OK for trunk?
>>> >  2011-06-11 ?H.J. Lu ?
>>> >
>>> >  ? ? ? ?PR middle-end/47727
>>> >  ? ? ? ?* explow.c (convert_memory_address_addr_space): Permute the
>>> >  ? ? ? ?conversion and addition if one operand is a constant.
>>>
>>> Do we still need this patch? With recent target changes the testcase
>>> from PR can be compiled without problems with a gcc from an unpatched
>>> trunk.
>>
>> Given the communication difficulties, I hope not...
>>
>> Paolo
>>
>
> Here is the updated patch.  OK for trunk?

 Did you see the question two levels up the thread you are replying to?

>>>
>>> The patch is for
>>>
>>> http://gcc.gnu.org/bugzilla/show_bug.cgi?id=49721
>>>
>>> I changed the thread subject.
>>
>> Please add testcase to see the patch in action.
>>
>
> I haven't found a testcase yet.  The problem was discovered in
> this thread:
>
> http://gcc.gnu.org/ml/gcc-patches/2011-07/msg01065.html

This was before x32 could handle SImode addresses. With recent x86
target work, this is no more true, and SImode and DImode addresses are
first-class citizens as far as x32 backend is concerned. Please note
that original testcase (that this whole patch is all about) now
compiles without problems. Also, middle end is shared with at least
two ptr_mode != Pmode targets, and they all work well. So, to see what
makes x32 special, we need a testcase that breaks _WITHOUT_ your
proposed patch. Without testcase, nobody can analyze your approach and
tell if the approach is the right one, if this is in fact target
problem, or indeed a middle-end problem.

And there is no point to flood the mainling-list with patches.

Uros.


Re: [RS6000] asynch exceptions and unwind info

2011-07-28 Thread Richard Henderson
On 07/28/2011 12:27 AM, Alan Modra wrote:
> On Wed, Jul 27, 2011 at 03:00:45PM +0930, Alan Modra wrote:
>> Ideally what I'd like to
>> do is have ld and gcc emit accurate r2 tracking unwind info and
>> dispense with hacks like frob_update_context.  If ld did emit accurate
>> unwind info for .glink, then the justification for frob_update_context
>> disappears.
> 
> For the record, this statement of mine doesn't make sense.  A .glink
> stub doesn't make a frame, so a backtrace won't normally pass through a
> stub, thus having accurate unwind info for .glink doesn't help at all.

It does, for the duration of the stub.

The whole problem is that toc pointer copy in 40(1) is only valid
during indirect call sequences, and iff ld inserted a stub?  I.e.
direct calls between functions that share toc pointers never save
the copy?

Would it make sense, if a function has any indirect call, to move
the toc pointer save into the prologue?  You'd get to avoid that
store all the time.  Of course you'd not be able to sink the load
after the call, but it might still be a win.  And in that special
case you can annotate the r2 save slot just once, correctly.

For functions that do not contain an indirect function call, I
don't believe that there's a any way to use DW_CFA_offset that
is always correct.

One could, however, move the code in frob_update_context into a
(series of) DW_CFA_val_expression's.

  DW_CFA_val_expression
DW_OP_reg2  // Default to the value currently in R2
DW_OP_regx LR   // Test the insn following the call, as per 
frob_update_context
DW_OP_deref_size 4
DW_OP_const4u 0xE8410028
DW_OP_ne
DW_OP_bra L1
DW_OP_drop  // Could be omitted, given that we only examine 
top-of-stack at the end
DW_OP_breg1 40  // Pull the value from *(R1+40)
DW_OP_deref
  L1:

This version could appear in the CIE.  You'd have to adjust it
once LR gets saved to the stack, and R2 isn't itself being saved
as per above.

There isn't currently a hook in dwarf2cfi to add extra stuff to
the CIE program, but that wouldn't be hard to add.  The version
that gets emitted after LR is saved would need a new note as well.
But it all seems fairly tractable to actually implement, if we
think it'll actually solve the problem.


r~


Re: [PATCH] Fix PR48648: Handle CLAST assignments.

2011-07-28 Thread Tobias Grosser

On 07/28/2011 06:56 PM, Sebastian Pop wrote:

Hi Tobi,

On Thu, Jul 28, 2011 at 12:13, Tobias Grosser  wrote:

+  struct clast_user_stmt *body
+= clast_get_body_of_loop ((struct clast_stmt *) stmt);


I am not a big fan of using clast_get_body_of_loop as it is buggy.
Introducing new uses of it, is nothing what I would support. Do we really
need this?


No, because of ...




+  poly_bb_p pbb = (poly_bb_p) cloog_statement_usr (body->statement);


What about some more meaningful names like bound_one, bound_two?


Ok, see the second patch attached.


+
+  compute_bounds_for_level (pbb, level, v1, v2);


Mh. I do not completely understand all the code. But can't we get v1 and v2
set without the need for the compute_bounds_for_level function. Is the
type_for_clast_expression not setting them.



... this.
You are right.  type_for_clast_expr would provide the bounds for the
RHS of the assign and so we don't need to compute the bounds on
the loop level, as we would have done on a real loop.  Attached the
amended patch.  I'm regstrapping these patches on amd64-linux.
Ok for trunk after?


Looks good to me. Please commit if it passes regstrapping.

Cheers
Tobi


Re: PATCH: PR middle-end/49721: convert_memory_address_addr_space may generate invalid new insns

2011-07-28 Thread H.J. Lu
On Thu, Jul 28, 2011 at 11:23 AM, Uros Bizjak  wrote:
> On Thu, Jul 28, 2011 at 8:09 PM, H.J. Lu  wrote:
>
>> >  convert_memory_address_addr_space has a special PLUS/MULT case for
>> >  POINTERS_EXTEND_UNSIGNED<  0. ?It turns out that it is also needed
>> >  for all Pmode != ptr_mode cases. ?OK for trunk?
>> >  2011-06-11 ?H.J. Lu ?
>> >
>> >  ? ? ? ?PR middle-end/47727
>> >  ? ? ? ?* explow.c (convert_memory_address_addr_space): Permute the
>> >  ? ? ? ?conversion and addition if one operand is a constant.
>>
>> Do we still need this patch? With recent target changes the testcase
>> from PR can be compiled without problems with a gcc from an unpatched
>> trunk.
>
> Given the communication difficulties, I hope not...
>
> Paolo
>

 Here is the updated patch.  OK for trunk?
>>>
>>> Did you see the question two levels up the thread you are replying to?
>>>
>>
>> The patch is for
>>
>> http://gcc.gnu.org/bugzilla/show_bug.cgi?id=49721
>>
>> I changed the thread subject.
>
> Please add testcase to see the patch in action.
>

I haven't found a testcase yet.  The problem was discovered in
this thread:

http://gcc.gnu.org/ml/gcc-patches/2011-07/msg01065.html


-- 
H.J.


Re: PATCH: PR target/47715: [x32] Use SImode for thread pointer

2011-07-28 Thread Uros Bizjak
On Thu, Jul 28, 2011 at 8:30 PM, H.J. Lu  wrote:

> TP is 32bit in x32  For load_tp_x32, we load SImode value and
> zero-extend to DImode. For add_tp_x32, we are adding SImode
> value.  We can't pretend TP is 64bit.  load_tp_x32 and add_tp_x32
> must take SImode TP.
>

 I will see what I can do.

>>>
>>> Here is the updated patch to use 32bit TP for 32.
>>
>> Why??
>>
>> This part makes no sense:
>>
>> -  tp = gen_rtx_UNSPEC (Pmode, gen_rtvec (1, const0_rtx), UNSPEC_TP);
>> +  tp = gen_rtx_UNSPEC (ptr_mode, gen_rtvec (1, const0_rtx), UNSPEC_TP);
>> +  if (ptr_mode != Pmode)
>> +    tp = convert_to_mode (Pmode, tp, 1);
>>
>> You will create zero_extend (unspec ...), that won't be matched by any 
>> pattern.
>
> No.  I created  zero_exten from (reg:SI) to (reg: DI).
>
>> Can you please explain, how is this pattern different than DImode
>> pattern, proposed in my patch?
>>
>> +(define_insn "*load_tp_x32"
>> +  [(set (match_operand:SI 0 "register_operand" "=r")
>> +       (unspec:SI [(const_int 0)] UNSPEC_TP))]
>> +  "TARGET_X32"
>> +  "mov{l}\t{%%fs:0, %0|%0, DWORD PTR fs:0}"
>> +  [(set_attr "type" "imov")
>> +   (set_attr "modrm" "0")
>> +   (set_attr "length" "7")
>> +   (set_attr "memory" "load")
>> +   (set_attr "imm_disp" "false")])
>>
>> vs:
>>
>> +(define_insn "*load_tp_x32"
>> +  [(set (match_operand:DI 0 "register_operand" "=r")
>> +       (unspec:DI [(const_int 0)] UNSPEC_TP))]
>
> That is wrong since source (TP)  is 32bit.  This pattern tells compiler
> source is 64bit.

Where?

Uros.


Re: PATCH: PR target/47715: [x32] Use SImode for thread pointer

2011-07-28 Thread H.J. Lu
On Thu, Jul 28, 2011 at 11:21 AM, Uros Bizjak  wrote:
> On Thu, Jul 28, 2011 at 8:03 PM, H.J. Lu  wrote:
>
>>> So, instead of huge complications with new mode iterator, just
>>> introduce two new patterns that will shadow existing ones for
>>> TARGET_X32.
>>>
>>> Like in attached (untested) patch.
>>>
>>
>> I tried the following patch with typos fixed.  It almost worked,
>> except for this failure in glibc testsuite:
>>
>> gen-locale.sh: line 27: 14755 Aborted                 (core dumped)
>> I18NPATH=. GCONV_PATH=${common_objpfx}iconvdata ${localedef} --quiet
>> -c -f $charmap -i $input ${common_objpfx}localedata/$out
>> Charmap: "ISO-8859-1" Inputfile: "nb_NO" Outputdir: "nb_NO.ISO-8859-1" 
>> failed
>> make[4]: *** 
>> [/export/build/gnu/glibc-x32/build-x86_64-linux/localedata/nb_NO.ISO-8859-1/LC_CTYPE]
>> Error 1
>>
>> I will add:
>>
>> diff --git a/gcc/config/i386/i386.c b/gcc/config/i386/i386.c
>> index 8723dc5..d32d64d 100644
>> --- a/gcc/config/i386/i386.c
>> +++ b/gcc/config/i386/i386.c
>> @@ -12120,7 +12120,9 @@ get_thread_pointer (bool to_reg)
>>  {
>>   rtx tp, reg, insn;
>>
>> -  tp = gen_rtx_UNSPEC (Pmode, gen_rtvec (1, const0_rtx), UNSPEC_TP);
>> +  tp = gen_rtx_UNSPEC (ptr_mode, gen_rtvec (1, const0_rtx), UNSPEC_TP);
>> +  if (ptr_mode != Pmode)
>> +    tp = convert_to_mode (Pmode, tp, 1);
>>   if (!to_reg)
>>     return tp;
>>
>> since TP must be 32bit.
>
> No, this won't have the desired effect. It will change the UNSPEC, so
> it won't match patterns in i386.md.
>
> Can you debug the failure a bit more? With my patterns, add{l} and
> mov{l} should clear top 32bits.
>

 TP is 32bit in x32  For load_tp_x32, we load SImode value and
 zero-extend to DImode. For add_tp_x32, we are adding SImode
 value.  We can't pretend TP is 64bit.  load_tp_x32 and add_tp_x32
 must take SImode TP.

>>>
>>> I will see what I can do.
>>>
>>
>> Here is the updated patch to use 32bit TP for 32.
>
> Why??
>
> This part makes no sense:
>
> -  tp = gen_rtx_UNSPEC (Pmode, gen_rtvec (1, const0_rtx), UNSPEC_TP);
> +  tp = gen_rtx_UNSPEC (ptr_mode, gen_rtvec (1, const0_rtx), UNSPEC_TP);
> +  if (ptr_mode != Pmode)
> +    tp = convert_to_mode (Pmode, tp, 1);
>
> You will create zero_extend (unspec ...), that won't be matched by any 
> pattern.

No.  I created  zero_exten from (reg:SI) to (reg: DI).

> Can you please explain, how is this pattern different than DImode
> pattern, proposed in my patch?
>
> +(define_insn "*load_tp_x32"
> +  [(set (match_operand:SI 0 "register_operand" "=r")
> +       (unspec:SI [(const_int 0)] UNSPEC_TP))]
> +  "TARGET_X32"
> +  "mov{l}\t{%%fs:0, %0|%0, DWORD PTR fs:0}"
> +  [(set_attr "type" "imov")
> +   (set_attr "modrm" "0")
> +   (set_attr "length" "7")
> +   (set_attr "memory" "load")
> +   (set_attr "imm_disp" "false")])
>
> vs:
>
> +(define_insn "*load_tp_x32"
> +  [(set (match_operand:DI 0 "register_operand" "=r")
> +       (unspec:DI [(const_int 0)] UNSPEC_TP))]

That is wrong since source (TP)  is 32bit.  This pattern tells compiler
source is 64bit.

> +  "TARGET_X32"
> +  "mov{l}\t{%%fs:0, %k0|%k0, DWORD PTR fs:0}"
> +  [(set_attr "type" "imov")
> +   (set_attr "modrm" "0")
> +   (set_attr "length" "7")
> +   (set_attr "memory" "load")
> +   (set_attr "imm_disp" "false")])
>

I will try zero_extend to see if it works.

Thanks.


-- 
H.J.


Re: PATCH: PR middle-end/49721: convert_memory_address_addr_space may generate invalid new insns

2011-07-28 Thread Uros Bizjak
On Thu, Jul 28, 2011 at 8:09 PM, H.J. Lu  wrote:

> >  convert_memory_address_addr_space has a special PLUS/MULT case for
> >  POINTERS_EXTEND_UNSIGNED<  0. ?It turns out that it is also needed
> >  for all Pmode != ptr_mode cases. ?OK for trunk?
> >  2011-06-11 ?H.J. Lu ?
> >
> >  ? ? ? ?PR middle-end/47727
> >  ? ? ? ?* explow.c (convert_memory_address_addr_space): Permute the
> >  ? ? ? ?conversion and addition if one operand is a constant.
>
> Do we still need this patch? With recent target changes the testcase
> from PR can be compiled without problems with a gcc from an unpatched
> trunk.

 Given the communication difficulties, I hope not...

 Paolo

>>>
>>> Here is the updated patch.  OK for trunk?
>>
>> Did you see the question two levels up the thread you are replying to?
>>
>
> The patch is for
>
> http://gcc.gnu.org/bugzilla/show_bug.cgi?id=49721
>
> I changed the thread subject.

Please add testcase to see the patch in action.

Uros.


Re: PATCH: PR target/47715: [x32] Use SImode for thread pointer

2011-07-28 Thread Uros Bizjak
On Thu, Jul 28, 2011 at 8:03 PM, H.J. Lu  wrote:

>> So, instead of huge complications with new mode iterator, just
>> introduce two new patterns that will shadow existing ones for
>> TARGET_X32.
>>
>> Like in attached (untested) patch.
>>
>
> I tried the following patch with typos fixed.  It almost worked,
> except for this failure in glibc testsuite:
>
> gen-locale.sh: line 27: 14755 Aborted                 (core dumped)
> I18NPATH=. GCONV_PATH=${common_objpfx}iconvdata ${localedef} --quiet
> -c -f $charmap -i $input ${common_objpfx}localedata/$out
> Charmap: "ISO-8859-1" Inputfile: "nb_NO" Outputdir: "nb_NO.ISO-8859-1" 
> failed
> make[4]: *** 
> [/export/build/gnu/glibc-x32/build-x86_64-linux/localedata/nb_NO.ISO-8859-1/LC_CTYPE]
> Error 1
>
> I will add:
>
> diff --git a/gcc/config/i386/i386.c b/gcc/config/i386/i386.c
> index 8723dc5..d32d64d 100644
> --- a/gcc/config/i386/i386.c
> +++ b/gcc/config/i386/i386.c
> @@ -12120,7 +12120,9 @@ get_thread_pointer (bool to_reg)
>  {
>   rtx tp, reg, insn;
>
> -  tp = gen_rtx_UNSPEC (Pmode, gen_rtvec (1, const0_rtx), UNSPEC_TP);
> +  tp = gen_rtx_UNSPEC (ptr_mode, gen_rtvec (1, const0_rtx), UNSPEC_TP);
> +  if (ptr_mode != Pmode)
> +    tp = convert_to_mode (Pmode, tp, 1);
>   if (!to_reg)
>     return tp;
>
> since TP must be 32bit.

 No, this won't have the desired effect. It will change the UNSPEC, so
 it won't match patterns in i386.md.

 Can you debug the failure a bit more? With my patterns, add{l} and
 mov{l} should clear top 32bits.

>>>
>>> TP is 32bit in x32  For load_tp_x32, we load SImode value and
>>> zero-extend to DImode. For add_tp_x32, we are adding SImode
>>> value.  We can't pretend TP is 64bit.  load_tp_x32 and add_tp_x32
>>> must take SImode TP.
>>>
>>
>> I will see what I can do.
>>
>
> Here is the updated patch to use 32bit TP for 32.

Why??

This part makes no sense:

-  tp = gen_rtx_UNSPEC (Pmode, gen_rtvec (1, const0_rtx), UNSPEC_TP);
+  tp = gen_rtx_UNSPEC (ptr_mode, gen_rtvec (1, const0_rtx), UNSPEC_TP);
+  if (ptr_mode != Pmode)
+tp = convert_to_mode (Pmode, tp, 1);

You will create zero_extend (unspec ...), that won't be matched by any pattern.

Can you please explain, how is this pattern different than DImode
pattern, proposed in my patch?

+(define_insn "*load_tp_x32"
+  [(set (match_operand:SI 0 "register_operand" "=r")
+   (unspec:SI [(const_int 0)] UNSPEC_TP))]
+  "TARGET_X32"
+  "mov{l}\t{%%fs:0, %0|%0, DWORD PTR fs:0}"
+  [(set_attr "type" "imov")
+   (set_attr "modrm" "0")
+   (set_attr "length" "7")
+   (set_attr "memory" "load")
+   (set_attr "imm_disp" "false")])

vs:

+(define_insn "*load_tp_x32"
+  [(set (match_operand:DI 0 "register_operand" "=r")
+   (unspec:DI [(const_int 0)] UNSPEC_TP))]
+  "TARGET_X32"
+  "mov{l}\t{%%fs:0, %k0|%k0, DWORD PTR fs:0}"
+  [(set_attr "type" "imov")
+   (set_attr "modrm" "0")
+   (set_attr "length" "7")
+   (set_attr "memory" "load")
+   (set_attr "imm_disp" "false")])

Uros.


Re: [patch] attribute to reverse bitfield allocations

2011-07-28 Thread DJ Delorie

> Seeing little opposition, I plod further...  now with documentation
> and a test case.  OK yet?

Ping?

http://gcc.gnu.org/ml/gcc-patches/2011-07/msg01889.html


PATCH: PR target/47766: [x32] -fstack-protector doesn't work

2011-07-28 Thread H.J. Lu
Hi,

This patch adds x32 support to UNSPEC_SP_XXX patterns.  OK for trunk?

Thanks.


H.J.
---
2011-07-28  H.J. Lu  

PR target/47766
* config/i386/i386.md (PTR): New.
(stack_protect_set: Check TARGET_LP64 instead of TARGET_64BIT.
(stack_protect_test): Likewise.
(stack_protect_set_): Replace ":P" with ":PTR".
(stack_tls_protect_set_): Likewise.
(stack_tls_protect_test_): Likewise.

diff --git a/gcc/config/i386/i386.md b/gcc/config/i386/i386.md
index f33b8a0..f4717b5 100644
--- a/gcc/config/i386/i386.md
+++ b/gcc/config/i386/i386.md
@@ -951,6 +951,11 @@
 ;; This mode iterator allows :P to be used for patterns that operate on
 ;; pointer-sized quantities.  Exactly one of the two alternatives will match.
 (define_mode_iterator P [(SI "Pmode == SImode") (DI "Pmode == DImode")])
+
+;; This mode iterator allows :PTR to be used for patterns that operate on
+;; ptr_mode sized quantities.
+(define_mode_iterator PTR
+  [(SI "ptr_mode == SImode") (DI "ptr_mode == DImode")])
 
 ;; Scheduling descriptions
 
@@ -17347,11 +17379,11 @@
 
 #ifdef TARGET_THREAD_SSP_OFFSET
   operands[1] = GEN_INT (TARGET_THREAD_SSP_OFFSET);
-  insn = (TARGET_64BIT
+  insn = (TARGET_LP64
  ? gen_stack_tls_protect_set_di
  : gen_stack_tls_protect_set_si);
 #else
-  insn = (TARGET_64BIT
+  insn = (TARGET_LP64
  ? gen_stack_protect_set_di
  : gen_stack_protect_set_si);
 #endif
@@ -17361,19 +17393,20 @@
 })
 
 (define_insn "stack_protect_set_"
-  [(set (match_operand:P 0 "memory_operand" "=m")
-   (unspec:P [(match_operand:P 1 "memory_operand" "m")] UNSPEC_SP_SET))
-   (set (match_scratch:P 2 "=&r") (const_int 0))
+  [(set (match_operand:PTR 0 "memory_operand" "=m")
+   (unspec:PTR [(match_operand:PTR 1 "memory_operand" "m")]
+   UNSPEC_SP_SET))
+   (set (match_scratch:PTR 2 "=&r") (const_int 0))
(clobber (reg:CC FLAGS_REG))]
   ""
   "mov{}\t{%1, %2|%2, %1}\;mov{}\t{%2, %0|%0, 
%2}\;xor{l}\t%k2, %k2"
   [(set_attr "type" "multi")])
 
 (define_insn "stack_tls_protect_set_"
-  [(set (match_operand:P 0 "memory_operand" "=m")
-   (unspec:P [(match_operand:P 1 "const_int_operand" "i")]
- UNSPEC_SP_TLS_SET))
-   (set (match_scratch:P 2 "=&r") (const_int 0))
+  [(set (match_operand:PTR 0 "memory_operand" "=m")
+   (unspec:PTR [(match_operand:PTR 1 "const_int_operand" "i")]
+   UNSPEC_SP_TLS_SET))
+   (set (match_scratch:PTR 2 "=&r") (const_int 0))
(clobber (reg:CC FLAGS_REG))]
   ""
   "mov{}\t{%@:%P1, %2|%2,  PTR 
%@:%P1}\;mov{}\t{%2, %0|%0, %2}\;xor{l}\t%k2, %k2"
@@ -17391,11 +17424,11 @@
 
 #ifdef TARGET_THREAD_SSP_OFFSET
   operands[1] = GEN_INT (TARGET_THREAD_SSP_OFFSET);
-  insn = (TARGET_64BIT
+  insn = (TARGET_LP64
  ? gen_stack_tls_protect_test_di
  : gen_stack_tls_protect_test_si);
 #else
-  insn = (TARGET_64BIT
+  insn = (TARGET_LP64
  ? gen_stack_protect_test_di
  : gen_stack_protect_test_si);
 #endif
@@ -17409,20 +17442,20 @@
 
 (define_insn "stack_protect_test_"
   [(set (match_operand:CCZ 0 "flags_reg_operand" "")
-   (unspec:CCZ [(match_operand:P 1 "memory_operand" "m")
-(match_operand:P 2 "memory_operand" "m")]
+   (unspec:CCZ [(match_operand:PTR 1 "memory_operand" "m")
+(match_operand:PTR 2 "memory_operand" "m")]
UNSPEC_SP_TEST))
-   (clobber (match_scratch:P 3 "=&r"))]
+   (clobber (match_scratch:PTR 3 "=&r"))]
   ""
   "mov{}\t{%1, %3|%3, %1}\;xor{}\t{%2, %3|%3, %2}"
   [(set_attr "type" "multi")])
 
 (define_insn "stack_tls_protect_test_"
   [(set (match_operand:CCZ 0 "flags_reg_operand" "")
-   (unspec:CCZ [(match_operand:P 1 "memory_operand" "m")
-(match_operand:P 2 "const_int_operand" "i")]
+   (unspec:CCZ [(match_operand:PTR 1 "memory_operand" "m")
+(match_operand:PTR 2 "const_int_operand" "i")]
UNSPEC_SP_TLS_TEST))
-   (clobber (match_scratch:P 3 "=r"))]
+   (clobber (match_scratch:PTR 3 "=r"))]
   ""
   "mov{}\t{%1, %3|%3, %1}\;xor{}\t{%@:%P2, %3|%3, 
 PTR %@:%P2}"
   [(set_attr "type" "multi")])


PATCH: PR middle-end/49721: convert_memory_address_addr_space may generate invalid new insns

2011-07-28 Thread H.J. Lu
On Thu, Jul 28, 2011 at 11:05 AM, Uros Bizjak  wrote:
> On Thu, Jul 28, 2011 at 7:59 PM, H.J. Lu  wrote:
>
 >  convert_memory_address_addr_space has a special PLUS/MULT case for
 >  POINTERS_EXTEND_UNSIGNED<  0. ?It turns out that it is also needed
 >  for all Pmode != ptr_mode cases. ?OK for trunk?
 >  2011-06-11 ?H.J. Lu ?
 >
 >  ? ? ? ?PR middle-end/47727
 >  ? ? ? ?* explow.c (convert_memory_address_addr_space): Permute the
 >  ? ? ? ?conversion and addition if one operand is a constant.

 Do we still need this patch? With recent target changes the testcase
 from PR can be compiled without problems with a gcc from an unpatched
 trunk.
>>>
>>> Given the communication difficulties, I hope not...
>>>
>>> Paolo
>>>
>>
>> Here is the updated patch.  OK for trunk?
>
> Did you see the question two levels up the thread you are replying to?
>

The patch is for

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=49721

I changed the thread subject.

-- 
H.J.


Re: PING: PATCH [4/n]: Prepare x32: Permute the conversion and addition if one operand is a constant

2011-07-28 Thread Uros Bizjak
On Thu, Jul 28, 2011 at 7:59 PM, H.J. Lu  wrote:

>>> >  convert_memory_address_addr_space has a special PLUS/MULT case for
>>> >  POINTERS_EXTEND_UNSIGNED<  0. ?It turns out that it is also needed
>>> >  for all Pmode != ptr_mode cases. ?OK for trunk?
>>> >  2011-06-11 ?H.J. Lu ?
>>> >
>>> >  ? ? ? ?PR middle-end/47727
>>> >  ? ? ? ?* explow.c (convert_memory_address_addr_space): Permute the
>>> >  ? ? ? ?conversion and addition if one operand is a constant.
>>>
>>> Do we still need this patch? With recent target changes the testcase
>>> from PR can be compiled without problems with a gcc from an unpatched
>>> trunk.
>>
>> Given the communication difficulties, I hope not...
>>
>> Paolo
>>
>
> Here is the updated patch.  OK for trunk?

Did you see the question two levels up the thread you are replying to?

Uros.


Re: Remove unused line_maps field last_listed (issue4810058)

2011-07-28 Thread gchare

Forgot to mention:

Tested with bootstrap build and full regression testing.

On 2011/07/28 17:55:15, Gabriel Charette wrote:

The last_listed field in struct line_maps was never used, removed it.



Gab



2011-07-28  Gabriel Charette  



* libcpp/include/line-map.h (struct line_maps):
   Remove unused field last_listed.



diff --git a/libcpp/include/line-map.h b/libcpp/include/line-map.h
index 3234423..f1d5bee 100644
--- a/libcpp/include/line-map.h
+++ b/libcpp/include/line-map.h
@@ -76,11 +76,6 @@ struct GTY(()) line_maps {



unsigned int cache;



-  /* The most recently listed include stack, if any, starts with
- LAST_LISTED as the topmost including file.  -1 indicates nothing
- has been listed yet.  */
-  int last_listed;
-
/* Depth of the include stack, including the current file.  */
unsigned int depth;



diff --git a/libcpp/line-map.c b/libcpp/line-map.c
index 86e2484..dd3f11c 100644
--- a/libcpp/line-map.c
+++ b/libcpp/line-map.c
@@ -34,7 +34,6 @@ linemap_init (struct line_maps *set)
set->maps = NULL;
set->allocated = 0;
set->used = 0;
-  set->last_listed = -1;
set->trace_includes = false;
set->depth = 0;
set->cache = 0;



--
This patch is available for review at

http://codereview.appspot.com/4810058



http://codereview.appspot.com/4810058/


Remove unused line_maps field last_listed (issue4810058)

2011-07-28 Thread Gabriel Charette
The last_listed field in struct line_maps was never used, removed it.

Gab

2011-07-28  Gabriel Charette  

* libcpp/include/line-map.h (struct line_maps):
  Remove unused field last_listed.

diff --git a/libcpp/include/line-map.h b/libcpp/include/line-map.h
index 3234423..f1d5bee 100644
--- a/libcpp/include/line-map.h
+++ b/libcpp/include/line-map.h
@@ -76,11 +76,6 @@ struct GTY(()) line_maps {
 
   unsigned int cache;
 
-  /* The most recently listed include stack, if any, starts with
- LAST_LISTED as the topmost including file.  -1 indicates nothing
- has been listed yet.  */
-  int last_listed;
-
   /* Depth of the include stack, including the current file.  */
   unsigned int depth;
 
diff --git a/libcpp/line-map.c b/libcpp/line-map.c
index 86e2484..dd3f11c 100644
--- a/libcpp/line-map.c
+++ b/libcpp/line-map.c
@@ -34,7 +34,6 @@ linemap_init (struct line_maps *set)
   set->maps = NULL;
   set->allocated = 0;
   set->used = 0;
-  set->last_listed = -1;
   set->trace_includes = false;
   set->depth = 0;
   set->cache = 0;

--
This patch is available for review at http://codereview.appspot.com/4810058


[pph] Free buffers used during tree encoding/decoding

2011-07-28 Thread Diego Novillo
Noticed this while debugging the new tree encoding cache.  No
functional changes.  This frees the memory used by the buffers
used during tree streaming.  It also moves the reader and writer
data into a union to better distinguish them.

Tested on x86_64.


Diego.


* pph-streamer.h (pph_stream): Move fields OB, OUT_STATE,
DECL_STATE_STREAM, IB, DATA_IN, PPH_SECTIONS, FILE_DATA and
FILE_SIZE into a union of structures.
Update all users.
* pph-streamer.c (pph_stream_close): Free memory used by tree
encoding routines.

Index: cp/pph-streamer-in.c
===
--- cp/pph-streamer-in.c(revision 176879)
+++ cp/pph-streamer-in.c(working copy)
@@ -109,8 +109,8 @@ pph_get_section_data (struct lto_file_de
 {
   /* FIXME pph - Stop abusing lto_file_decl_data fields.  */
   const pph_stream *stream = (const pph_stream *) file_data->file_name;
-  *len = stream->file_size - sizeof (pph_file_header);
-  return (const char *) stream->file_data + sizeof (pph_file_header);
+  *len = stream->encoder.r.file_size - sizeof (pph_file_header);
+  return (const char *) stream->encoder.r.file_data + sizeof (pph_file_header);
 }
 
 
@@ -119,14 +119,14 @@ pph_get_section_data (struct lto_file_de
 
 static void
 pph_free_section_data (struct lto_file_decl_data *file_data,
-  enum lto_section_type section_type ATTRIBUTE_UNUSED,
-  const char *name ATTRIBUTE_UNUSED,
-  const char *offset ATTRIBUTE_UNUSED,
-  size_t len ATTRIBUTE_UNUSED)
+  enum lto_section_type section_type ATTRIBUTE_UNUSED,
+  const char *name ATTRIBUTE_UNUSED,
+  const char *offset ATTRIBUTE_UNUSED,
+  size_t len ATTRIBUTE_UNUSED)
 {
   /* FIXME pph - Stop abusing lto_file_decl_data fields.  */
   const pph_stream *stream = (const pph_stream *) file_data->file_name;
-  free (stream->file_data);
+  free (stream->encoder.r.file_data);
 }
 
 
@@ -145,46 +145,48 @@ pph_init_read (pph_stream *stream)
 
   lto_reader_init ();
 
-  /* Read STREAM->NAME into the memory buffer STREAM->FILE_DATA.
- FIXME pph, we are reading the whole file at once.  This seems
- wasteful.  */
+  /* Read STREAM->NAME into the memory buffer stream->encoder.r.file_data.  */
   retcode = fstat (fileno (stream->file), &st);
   gcc_assert (retcode == 0);
-  stream->file_size = (size_t) st.st_size;
-  stream->file_data = XCNEWVEC (char, stream->file_size);
-  bytes_read = fread (stream->file_data, 1, stream->file_size, stream->file);
-  gcc_assert (bytes_read == stream->file_size);
+  stream->encoder.r.file_size = (size_t) st.st_size;
+  stream->encoder.r.file_data = XCNEWVEC (char, stream->encoder.r.file_size);
+  bytes_read = fread (stream->encoder.r.file_data, 1,
+ stream->encoder.r.file_size, stream->file);
+  gcc_assert (bytes_read == stream->encoder.r.file_size);
 
   /* Set LTO callbacks to read the PPH file.  */
-  stream->pph_sections = XCNEWVEC (struct lto_file_decl_data *,
-  PPH_NUM_SECTIONS);
+  stream->encoder.r.pph_sections = XCNEWVEC (struct lto_file_decl_data *,
+PPH_NUM_SECTIONS);
   for (i = 0; i < PPH_NUM_SECTIONS; i++)
 {
-  stream->pph_sections[i] = XCNEW (struct lto_file_decl_data);
+  stream->encoder.r.pph_sections[i] = XCNEW (struct lto_file_decl_data);
   /* FIXME pph - Stop abusing fields in lto_file_decl_data.  */
-  stream->pph_sections[i]->file_name = (const char *) stream;
+  stream->encoder.r.pph_sections[i]->file_name = (const char *) stream;
 }
 
-  lto_set_in_hooks (stream->pph_sections, pph_get_section_data,
+  lto_set_in_hooks (stream->encoder.r.pph_sections, pph_get_section_data,
pph_free_section_data);
 
-  header = (pph_file_header *) stream->file_data;
+  header = (pph_file_header *) stream->encoder.r.file_data;
   strtab = (const char *) header + sizeof (pph_file_header);
   strtab_size = header->strtab_size;
   body = strtab + strtab_size;
-  gcc_assert (stream->file_size >= strtab_size + sizeof (pph_file_header));
-  body_size = stream->file_size - strtab_size - sizeof (pph_file_header);
+  gcc_assert (stream->encoder.r.file_size
+ >= strtab_size + sizeof (pph_file_header));
+  body_size = stream->encoder.r.file_size
+ - strtab_size - sizeof (pph_file_header);
 
   /* Create an input block structure pointing right after the string
  table.  */
-  stream->ib = XCNEW (struct lto_input_block);
-  LTO_INIT_INPUT_BLOCK_PTR (stream->ib, body, 0, body_size);
-  stream->data_in = lto_data_in_create (stream->pph_sections[0], strtab,
-strtab_size, NULL);
+  stream->encoder.r.ib = XCNEW (struct lto_input_block);
+  LTO_INIT_INPUT_BLOCK_PTR (stream->encoder.r.ib, body, 0, body_size);
+  stream->en

Re: [PATCH PR43513, 1/3] Replace vla with array - Implementation.

2011-07-28 Thread Tom de Vries
On 07/28/2011 06:25 PM, Richard Guenther wrote:
> On Thu, 28 Jul 2011, Tom de Vries wrote:
> 
>> On 07/28/2011 12:22 PM, Richard Guenther wrote:
>>> On Wed, 27 Jul 2011, Tom de Vries wrote:
>>>
 On 07/27/2011 05:27 PM, Richard Guenther wrote:
> On Wed, 27 Jul 2011, Tom de Vries wrote:
>
>> On 07/27/2011 02:12 PM, Richard Guenther wrote:
>>> On Wed, 27 Jul 2011, Tom de Vries wrote:
>>>
 On 07/27/2011 01:50 PM, Tom de Vries wrote:
> Hi Richard,
>
> I have a patch set for bug 43513 - The stack pointer is adjusted 
> twice.
>
> 01_pr43513.3.patch
> 02_pr43513.3.test.patch
> 03_pr43513.3.mudflap.patch
>
> The patch set has been bootstrapped and reg-tested on x86_64.
>
> I will sent out the patches individually.
>

 The patch replaces a vla __builtin_alloca that has a constant argument 
 with an
 array declaration.

 OK for trunk?
>>>
>>> I don't think it is safe to try to get at the VLA type the way you do.
>>
>> I don't understand in what way it's not safe. Do you mean I don't manage 
>> to find
>> the type always, or that I find the wrong type, or something else?
>
> I think you might get the wrong type,

 Ok, I'll review that code one more time.

> you also do not transform code
> like
>
>   int *p = alloca(4);
>   *p = 3;
>
> as there is no array type involved here.
>

 I was trying to stay away from non-vla allocas.  A source declared alloca 
 has
 function livetime, so we could have a single alloca in a loop, called 10 
 times,
 with all 10 instances live at the same time. This patch does not detect 
 such
 cases, and thus stays away from non-vla allocas. A vla decl does not have 
 such
 problems, the lifetime ends when it goes out of scope.
>>>
>>> Yes indeed - that probably would require more detailed analysis.
>>>
>>> In fact I would simply do sth like
>>>
>>>   elem_type = build_nonstandard_integer_type (BITS_PER_UNIT, 1);
>>>   n_elem = size * 8 / BITS_PER_UNIT;
>>>   array_type = build_array_type_nelts (elem_type, n_elem);
>>>   var = create_tmp_var (array_type, NULL);
>>>   return fold_convert (TREE_TYPE (lhs), build_fold_addr_expr (var));
>>>
>>
>> I tried this code on the example, and it works, but the newly declared 
>> type has
>> an 8-bit alignment, while the vla base type has a 32 bit alignment.  
>> This make
>> the memory access in the example potentially unaligned, which prohibits 
>> an
>> ivopts optimization, so the resulting text size is 68 instead of the 64 
>> achieved
>> with my current patch.
>
> Ok, so then set DECL_ALIGN of the variable to something reasonable
> like MIN (size * 8, GET_MODE_PRECISION (word_mode)).  Basically the
> alignment that the targets alloca function would guarantee.
>

 I tried that, but that doesn't help. It's the alignment of the type that
 matters, not of the decl.
>>>
>>> It shouldn't.  All accesses are performed with the original types and
>>> alignment comes from that (plus the underlying decl).
>>>
>>
>> I managed to get it all working by using build_aligned_type rather that 
>> DECL_ALIGN.
> 
> That's really odd, DECL_ALIGN should just work - nothing refers to the
> type of the decl in the IL.  Can you try also setting DECL_USER_ALIGN to 
> 1 maybe?
> 

This doesn't work either.

  /* Declare array.  */
  elem_type = build_nonstandard_integer_type (BITS_PER_UNIT, 1);
  n_elem = size * 8 / BITS_PER_UNIT;
  align = MIN (size * 8, GET_MODE_PRECISION (word_mode));
  array_type = build_array_type_nelts (elem_type, n_elem);
  var = create_tmp_var (array_type, NULL);
  DECL_ALIGN (var) = align;
  DECL_USER_ALIGN (var) = 1;

Maybe this clarifies it:

Breakpoint 1, may_be_unaligned_p (ref=0xf7d9d410, step=0xf7d3d578) at
/home/vries/local/google/src/gcc-mainline/gcc/tree-ssa-loop-ivopts.c:1621
(gdb) call debug_generic_expr (ref)
MEM[(int[0:D.2579] *)&D.2595][0]
(gdb) call debug_generic_expr (step)
4

1627  base = get_inner_reference (ref, &bitsize, &bitpos, &toffset, &mode,
(gdb) call debug_generic_expr (base)
D.2595

1629  base_type = TREE_TYPE (base);
(gdb) call debug_generic_expr (base_type)
[40]

1630  base_align = TYPE_ALIGN (base_type);
(gdb) p base_align
$1 = 8

So the align is 8-bits, and we return true here:

(gdb) n
1632  if (mode != BLKmode)
(gdb) n
1634  unsigned mode_align = GET_MODE_ALIGNMENT (mode);
(gdb)
1636  if (base_align < mode_align
(gdb)
1639return true;


Here we can see that the base actually has the (user) align on it:

(gdb) call debug_tree (base)
 
unit size 
align 8 symtab 0 alias set -1 canonical type 0xf7e1b2a0 precision 8
min  max 
pointer_to

Re: [PATCH] Fix PR48648: Handle CLAST assignments.

2011-07-28 Thread Tobias Grosser

On 07/23/2011 12:01 AM, Sebastian Pop wrote:

The CLAST produced by CLooG-ISL contains an assignment and GCC chokes
on it.  The exact CLAST contains an assignment followed by an if:

scat_1 = max(0,ceild(T_4-7,8));
if (scat_1<= min(1,floord(T_4-1,8))) {
   S7(scat_1);
}

This is equivalent to a loop that iterates only once, and so CLooG
generates an assignment followed by an if instead of a loop.  This is
an important optimization that was improved in ISL, that allows
if-conversion: imagine GCC having to figure out that a loop like the
following actually iterates only once, and can be converted to an if:

for (scat_1 = max(0,ceild(T_4-7,8)); scat_1<= min(1,floord(T_4-1,8)); scat_1++)
   S7(scat_1);

This patch implements the translation of CLAST assignments.
Bootstrapped and tested on amd64-linux.


Hi Sebastian,

thanks for adding this to graphite. One comment inline.



Sebastian

2011-07-22  Sebastian Pop

PR middle-end/48648
* graphite-clast-to-gimple.c (clast_get_body_of_loop): Handle
CLAST assignments.
(translate_clast): Same.
(translate_clast_assignment): New.

* gcc.dg/graphite/id-pr48648.c: New.
---
  gcc/ChangeLog  |8 
  gcc/graphite-clast-to-gimple.c |   49 
  gcc/testsuite/ChangeLog|5 +++
  gcc/testsuite/gcc.dg/graphite/id-pr48648.c |   21 
  4 files changed, 83 insertions(+), 0 deletions(-)
  create mode 100644 gcc/testsuite/gcc.dg/graphite/id-pr48648.c

diff --git a/gcc/ChangeLog b/gcc/ChangeLog
index 9cfa21b..303c9c9 100644
+/* Translates a clast assignment STMT to gimple.
+
+   - NEXT_E is the edge where new generated code should be attached.
+   - BB_PBB_MAPPING is is a basic_block and it's related poly_bb_p mapping.  */
+
+static edge
+translate_clast_assignment (struct clast_assignment *stmt, edge next_e,
+   int level, ivs_params_p ip)
+{
+  gimple_seq stmts;
+  mpz_t v1, v2;
+  tree type, new_name, var;
+  edge res = single_succ_edge (split_edge (next_e));
+  struct clast_expr *expr = (struct clast_expr *) stmt->RHS;
+  struct clast_user_stmt *body
+= clast_get_body_of_loop ((struct clast_stmt *) stmt);


I am not a big fan of using clast_get_body_of_loop as it is buggy. 
Introducing new uses of it, is nothing what I would support. Do we 
really need this?



+  poly_bb_p pbb = (poly_bb_p) cloog_statement_usr (body->statement);
+
+  mpz_init (v1);
+  mpz_init (v2);


What about some more meaningful names like bound_one, bound_two?


+  type = type_for_clast_expr (expr, ip, v1, v2);
+  var = create_tmp_var (type, "graphite_var");
+  new_name = force_gimple_operand (clast_to_gcc_expression (type, expr, ip),
+   &stmts, true, var);
+  add_referenced_var (var);
+  if (stmts)
+{
+  gsi_insert_seq_on_edge (next_e, stmts);
+  gsi_commit_edge_inserts ();
+}
+
+  compute_bounds_for_level (pbb, level, v1, v2);


Mh. I do not completely understand all the code. But can't we get v1 and 
v2 set without the need for the compute_bounds_for_level function. Is 
the type_for_clast_expression not setting them.


Cheers
Tobi


[PATCH, PR 49886] Prevent fnsplit from changing signature when there are type attributes

2011-07-28 Thread Martin Jambor
Hi,

pass_split_functions is happy to split functions which have type
attributes but cannot update them if the new clone has in any way
different parameters than the original.  This can lead to
miscompilations in cases like the testcase.

This patch solves it by 1) making the inliner set the
can_change_signature flag to false for them because their signature
cannot be changed (this step is also necessary to make IPA-CP operate
on them and handle them correctly), and 2) make the splitting pass
keep all parameters if the flag is set.  The second step might involve
inventing some default definitions if the parameters did not really
have any.

I spoke about this with Honza and he claimed that the new function is
really an entirely different thing and that the parameters may
correspond only very loosely and thus the type attributes should be
cleared.  I'm not sure I agree, but in any case I needed this to work
to allow me continue with promised IPA-CP polishing and so I decided
to do this because it was easier.  (My own opinion is that the current
representation of parameter-describing function type attributes is
evil and will cause harm no matter hat we do.)

A very similar patch has passed bootstrap and testsuite on
x86_64-linux, the current one is undergoing both right now.  OK for
trunk if it passes?

Thanks,

Martin



2011-07-28  Martin Jambor  

PR middle-end/49886
* ipa-inline-analysis.c (compute_inline_parameters): Set
can_change_signature of noes with typde attributes.
* ipa-split.c (split_function): Do not skip any arguments if
can_change_signature is set.

* testsuite/gcc.c-torture/execute/pr49886.c: New testcase.

Index: src/gcc/ipa-inline-analysis.c
===
--- src.orig/gcc/ipa-inline-analysis.c
+++ src/gcc/ipa-inline-analysis.c
@@ -1658,18 +1658,24 @@ compute_inline_parameters (struct cgraph
   /* Can this function be inlined at all?  */
   info->inlinable = tree_inlinable_function_p (node->decl);
 
-  /* Inlinable functions always can change signature.  */
-  if (info->inlinable)
-node->local.can_change_signature = true;
+  /* Type attributes can use parameter indices to describe them.  */
+  if (TYPE_ATTRIBUTES (TREE_TYPE (node->decl)))
+node->local.can_change_signature = false;
   else
 {
-  /* Functions calling builtin_apply can not change signature.  */
-  for (e = node->callees; e; e = e->next_callee)
-   if (DECL_BUILT_IN (e->callee->decl)
-   && DECL_BUILT_IN_CLASS (e->callee->decl) == BUILT_IN_NORMAL
-   && DECL_FUNCTION_CODE (e->callee->decl) == BUILT_IN_APPLY_ARGS)
- break;
-  node->local.can_change_signature = !e;
+  /* Otherwise, inlinable functions always can change signature.  */
+  if (info->inlinable)
+   node->local.can_change_signature = true;
+  else
+   {
+ /* Functions calling builtin_apply can not change signature.  */
+ for (e = node->callees; e; e = e->next_callee)
+   if (DECL_BUILT_IN (e->callee->decl)
+   && DECL_BUILT_IN_CLASS (e->callee->decl) == BUILT_IN_NORMAL
+   && DECL_FUNCTION_CODE (e->callee->decl) == BUILT_IN_APPLY_ARGS)
+ break;
+ node->local.can_change_signature = !e;
+   }
 }
   estimate_function_body_sizes (node, early);
 
Index: src/gcc/ipa-split.c
===
--- src.orig/gcc/ipa-split.c
+++ src/gcc/ipa-split.c
@@ -945,10 +945,10 @@ static void
 split_function (struct split_point *split_point)
 {
   VEC (tree, heap) *args_to_pass = NULL;
-  bitmap args_to_skip = BITMAP_ALLOC (NULL);
+  bitmap args_to_skip;
   tree parm;
   int num = 0;
-  struct cgraph_node *node;
+  struct cgraph_node *node, *cur_node = cgraph_get_node 
(current_function_decl);
   basic_block return_bb = find_return_bb ();
   basic_block call_bb;
   gimple_stmt_iterator gsi;
@@ -968,17 +968,30 @@ split_function (struct split_point *spli
   dump_split_point (dump_file, split_point);
 }
 
+  if (cur_node->local.can_change_signature)
+args_to_skip = BITMAP_ALLOC (NULL);
+  else
+args_to_skip = NULL;
+
   /* Collect the parameters of new function and args_to_skip bitmap.  */
   for (parm = DECL_ARGUMENTS (current_function_decl);
parm; parm = DECL_CHAIN (parm), num++)
-if (!is_gimple_reg (parm)
-   || !gimple_default_def (cfun, parm)
-   || !bitmap_bit_p (split_point->ssa_names_to_pass,
- SSA_NAME_VERSION (gimple_default_def (cfun, parm
+if (args_to_skip
+   && (!is_gimple_reg (parm)
+   || !gimple_default_def (cfun, parm)
+   || !bitmap_bit_p (split_point->ssa_names_to_pass,
+ SSA_NAME_VERSION (gimple_default_def (cfun,
+   parm)
   bitmap_set_bit (args_to_skip, num);
 else
   {
arg 

Re: [PATCH 1/3] Fix PR47653: do not handle loops using wrapping semantics in graphite

2011-07-28 Thread Tobias Grosser

On 07/24/2011 08:25 AM, Sebastian Pop wrote:

2011-07-23  Sebastian Pop

PR middle-end/47653
* graphite-scop-detection.c (graphite_can_represent_loop): Discard
loops using wrapping semantics.

* gcc.dg/graphite/run-id-pr47653.c: New.
* gcc.dg/graphite/interchange-3.c: Do not use unsigned types for
induction variables.
* gcc.dg/graphite/scop-16.c: Same.
* gcc.dg/graphite/scop-17.c: Same.
* gcc.dg/graphite/scop-21.c: Same.
---
  gcc/ChangeLog  |6 ++
  gcc/graphite-scop-detection.c  |   18 +-
  gcc/testsuite/ChangeLog|   10 ++
  gcc/testsuite/gcc.dg/graphite/interchange-3.c  |2 +-
  gcc/testsuite/gcc.dg/graphite/run-id-pr47653.c |   17 +
  gcc/testsuite/gcc.dg/graphite/scop-16.c|2 +-
  gcc/testsuite/gcc.dg/graphite/scop-17.c|2 +-
  gcc/testsuite/gcc.dg/graphite/scop-21.c|2 +-
  .../testsuite/libgomp.graphite/force-parallel-1.c  |2 +-
  9 files changed, 47 insertions(+), 14 deletions(-)
  create mode 100644 gcc/testsuite/gcc.dg/graphite/run-id-pr47653.c


Re: [PATCH 0/3] Move Graphite to CLooG 0.16.3 with isl backend.

2011-07-28 Thread Tobias Grosser

On 07/27/2011 06:20 PM, Jack Howarth wrote:

On Fri, Jul 22, 2011 at 01:00:09AM +0200, Tobias Grosser wrote:

Hi,

I propose to switch to the official cloog.org cloog version with isl backend and
at the same time to remove support for both CLooG-PPL legacy as well as
CLooG-Parma.

We want to switch to cloog-isl as it is the only officially maintained version
of cloog. Furthermore, it provides features that will help to fix some bugs in
the graphite code generation[1].
The reason to abond CLooG-PPL (legacy version) is, that cloog-isl provides the
new CloogInput library interface. This interface is not available the old CLooG.
I plan to move graphite to this interface. As I do not see enough benefits from
being able to use CLooG PPL, I decided to not introduce any compatibility
scheme, but just remove any code that is only needed for CLooG-PPL.
I also removed CLooG-Parma (cloog.org with PPL backend), as it is currently not
actively maintained and not well tested. I believe our time is better spent on
improving graphite or cloog isl, as in putting time into this cloog version.

So here we are: Moving graphite back to the official cloog.org version!

Passes 'make check RUNTESTFLAGS=graphite.exp' as well as a bootstrap on Linux
amd64.

Cheers
Tobi


Tobi,
Are there any additional plans for gcc 4.7? In particular, wasn't the 
-fgraphite-identity
option supported to be enabled at -O3 by defaulting ftree-loop-linear on which 
is now an alias
of -floop-interchange since gcc 4.6?
   Jack


Hi Jack,

I personally do not have any fixed plans for something that needs to be 
in 4.7. Here is a list of open topics, that will be fixed if time allows 
(or someone funds my time):


- Remove dependency to PPL (as Richi mentioned)
- Integrate region based scop detection
- Use isl scheduling (Pluto like) to automatically tile the code
- Integrate Konrad's data flow patches (if he submits them)

Cheers
Tobi


Re: [PATCH 0/3] Move Graphite to CLooG 0.16.3 with isl backend.

2011-07-28 Thread Tobias Grosser

On 07/26/2011 08:30 PM, Sebastian Pop wrote:

On Fri, Jul 22, 2011 at 07:32, Joseph S. Myers  wrote:

On Fri, 22 Jul 2011, Tobias Grosser wrote:


I propose to switch to the official cloog.org cloog version with isl backend and
at the same time to remove support for both CLooG-PPL legacy as well as
CLooG-Parma.


Where are the install.texi changes in this patch series?


Please see the attached patch.


Thanks Sebastian.

Can you take care of uploading cloog-0.16.3 to the gcc ftp site?

Cheers
Tobi


Re: [trans-mem] Beginning of refactoring

2011-07-28 Thread Richard Henderson
On 07/27/2011 03:35 AM, Torvald Riegel wrote:
> patch7: gtm_transaction::decide_begin_dispatch() gets the transaction
> properties from the caller instead of reading from
> gtm_transaction::prop, which might not have been updated by the caller
> yet.
> 
> patch8: Fix nesting level reset when rolling back the outermost
> transaction. Before, this was incorrectly reset to zero, which caused
> transactions to not get committed. This didn't show up in previous
> testing because previously, only serial-mode TM methods were available.
> 
> OK for branch, together with the previous patches?

Both ok.


r~


Re: [trans-mem] Beginning of refactoring

2011-07-28 Thread Richard Henderson
> New erase method and placement new for aatree.
> 
> * aatree.h (aa_tree::remove): New.
> (aa_tree::operator new): Add placement new.

Ok.

> Change pr_hasElse to the value specified in the ABI.
> 
> * libitm.h (_ITM_codeProperties): Change pr_hasElse to the ABI's 
> value.

Ok.

> Add information to dispatch about closed nesting and uninstrumented code.
> 
> * dispatch.h (GTM::abi_dispatch): Add can_run_uninstrumented_code 
> and
> closed_nesting flags, as well as a closed nesting alternative.
> * method-serial.cc: Same.

Nearly...

> +  virtual abi_dispatch* closed_nesting_alternative()
> +  {
> +// For nested transactions with an instrumented code path, we can do
> +// undo logging.
> +return GTM::dispatch_serial();

Surely you really mean dispatch_serial_ul here?
Otherwise ok.

> Use vector instead of list to store user actions.
> 
>   * useraction.cc: Use vector instead of list to store actions.
>   Also support partial rollbacks for closed nesting.
>   * libitm_i.h (GTM::gtm_transaction::user_action): Same.
>   * beginend.cc: Same.

Ok.

> Add closed nesting as restart reason.
> 
>   * libitm_i.h: Add closed nesting as restart reason.
>   * retry.cc (GTM::gtm_transaction::decide_retry_strategy): Same.

Ok, except

> +  if (r == RESTART_CLOSED_NESTING) retry_serial = true;

Coding style.  THEN statement on the next line, even for small THEN.

> Make flat nesting the default, use closed nesting on demand.
> 
> * local.cc (gtm_transaction::rollback_local): Support closed 
> nesting.
> * eh_cpp.cc (GTM::gtm_transaction::revert_cpp_exceptions): Same.
>   * dispatch.h: Same.
>   * method-serial.cc: Same.
> * beginend.cc (GTM::gtm_transaction::begin_transaction): Change to
> flat nesting as default, and closed nesting on demand.
> (GTM::gtm_transaction::rollback): Same.
> (_ITM_abortTransaction): Same.
> (GTM::gtm_transaction::restart): Same.
> (GTM::gtm_transaction::trycommit): Same.
> (GTM::gtm_transaction::trycommit_and_finalize): Removed.
> (choose_code_path): New.
> (GTM::gtm_transaction_cp::save): New.
> (GTM::gtm_transaction_cp::commit): New.
> * query.cc (_ITM_inTransaction): Support flat nesting.
> * libitm_i.h (GTM::gtm_transaction_cp): New helper struct for 
> nesting.
> (GTM::gtm_transaction): Support flat and closed nesting.
> * alloc.cc (commit_allocations_2): New.
> (commit_cb_data): New helper struct.
> (GTM::gtm_transaction::commit_allocations): Handle nested
> commits/rollbacks.
> * libitm.texi: Update user action section, add description of 
> nesting.

Nearly...

> +  abi_dispatch *cn_disp = disp->closed_nesting_alternative();
> +  if (cn_disp)
> +{
> +  disp = cn_disp;
> +  set_abi_disp(disp);
> +}

Don't we need to fini the old disp?  Seems there's a leak here, though
not visible until we re-instate the non-serial methods.

> +  if (!(tx->state & STATE_IRREVOCABLE)) ret |= a_saveLiveVariables;

Coding style.



r~


Re: PATCH: PR target/47715: [x32] Use SImode for thread pointer

2011-07-28 Thread H.J. Lu
On Thu, Jul 28, 2011 at 8:59 AM, H.J. Lu  wrote:
> On Thu, Jul 28, 2011 at 7:59 AM, Uros Bizjak  wrote:
>> On Thu, Jul 28, 2011 at 4:47 PM, H.J. Lu  wrote:
>>
 In x32, thread pointer is 32bit and choice of segment register for the
 thread base ptr load should be based on TARGET_64BIT.  This patch
 implements it.  OK for trunk?
>>>
>>> -ENOTESTCASE.
>>>
>>
>> There is no standalone testcase.  The symptom is in glibc build, I
>> got
>>
>> CPP='/export/build/gnu/gcc-x32/release/usr/gcc-4.7.0-x32/bin/gcc -mx32
>>  -E -x c-header'
>> /export/build/gnu/glibc-x32/build-x86_64-linux/elf/ld-linux-x32.so.2
>> --library-path 
>> /export/build/gnu/glibc-x32/build-x86_64-linux:/export/build/gnu/glibc-x32/build-x86_64-linux/math:/export/build/gnu/glibc-x32/build-x86_64-linux/elf:/export/build/gnu/glibc-x32/build-x86_64-linux/dlfcn:/export/build/gnu/glibc-x32/build-x86_64-linux/nss:/export/build/gnu/glibc-x32/build-x86_64-linux/nis:/export/build/gnu/glibc-x32/build-x86_64-linux/rt:/export/build/gnu/glibc-x32/build-x86_64-linux/resolv:/export/build/gnu/glibc-x32/build-x86_64-linux/crypt:/export/build/gnu/glibc-x32/build-x86_64-linux/nptl
>> /export/build/gnu/glibc-x32/build-x86_64-linux/sunrpc/rpcgen -Y
>> ../scripts -h rpcsvc/yppasswd.x -o
>> /export/build/gnu/glibc-x32/build-x86_64-linux/sunrpc/rpcsvc/yppasswd.T
>> make[5]: *** 
>> [/export/build/gnu/glibc-x32/build-x86_64-linux/sunrpc/xbootparam_prot.stmp]
>> Segmentation fault
>> make[5]: *** Waiting for unfinished jobs
>> make[5]: *** 
>> [/export/build/gnu/glibc-x32/build-x86_64-linux/sunrpc/xrstat.stmp]
>> Segmentation fault
>> make[5]: *** 
>> [/export/build/gnu/glibc-x32/build-x86_64-linux/sunrpc/xyppasswd.stmp]
>> Segmentation fault
>>
>> since thread pointer is 32bit in x32.
>>
>
> If we load thread pointer (fs segment register) in x32 with 64bit
> load, the upper 32bits are garbage.
> We must load 32bit

 So, instead of huge complications with new mode iterator, just
 introduce two new patterns that will shadow existing ones for
 TARGET_X32.

 Like in attached (untested) patch.

>>>
>>> I tried the following patch with typos fixed.  It almost worked,
>>> except for this failure in glibc testsuite:
>>>
>>> gen-locale.sh: line 27: 14755 Aborted                 (core dumped)
>>> I18NPATH=. GCONV_PATH=${common_objpfx}iconvdata ${localedef} --quiet
>>> -c -f $charmap -i $input ${common_objpfx}localedata/$out
>>> Charmap: "ISO-8859-1" Inputfile: "nb_NO" Outputdir: "nb_NO.ISO-8859-1" 
>>> failed
>>> make[4]: *** 
>>> [/export/build/gnu/glibc-x32/build-x86_64-linux/localedata/nb_NO.ISO-8859-1/LC_CTYPE]
>>> Error 1
>>>
>>> I will add:
>>>
>>> diff --git a/gcc/config/i386/i386.c b/gcc/config/i386/i386.c
>>> index 8723dc5..d32d64d 100644
>>> --- a/gcc/config/i386/i386.c
>>> +++ b/gcc/config/i386/i386.c
>>> @@ -12120,7 +12120,9 @@ get_thread_pointer (bool to_reg)
>>>  {
>>>   rtx tp, reg, insn;
>>>
>>> -  tp = gen_rtx_UNSPEC (Pmode, gen_rtvec (1, const0_rtx), UNSPEC_TP);
>>> +  tp = gen_rtx_UNSPEC (ptr_mode, gen_rtvec (1, const0_rtx), UNSPEC_TP);
>>> +  if (ptr_mode != Pmode)
>>> +    tp = convert_to_mode (Pmode, tp, 1);
>>>   if (!to_reg)
>>>     return tp;
>>>
>>> since TP must be 32bit.
>>
>> No, this won't have the desired effect. It will change the UNSPEC, so
>> it won't match patterns in i386.md.
>>
>> Can you debug the failure a bit more? With my patterns, add{l} and
>> mov{l} should clear top 32bits.
>>
>
> TP is 32bit in x32  For load_tp_x32, we load SImode value and
> zero-extend to DImode. For add_tp_x32, we are adding SImode
> value.  We can't pretend TP is 64bit.  load_tp_x32 and add_tp_x32
> must take SImode TP.
>

I will see what I can do.


-- 
H.J.


Re: PATCH: PR target/47715: [x32] Use SImode for thread pointer

2011-07-28 Thread H.J. Lu
On Thu, Jul 28, 2011 at 7:59 AM, Uros Bizjak  wrote:
> On Thu, Jul 28, 2011 at 4:47 PM, H.J. Lu  wrote:
>
>>> In x32, thread pointer is 32bit and choice of segment register for the
>>> thread base ptr load should be based on TARGET_64BIT.  This patch
>>> implements it.  OK for trunk?
>>
>> -ENOTESTCASE.
>>
>
> There is no standalone testcase.  The symptom is in glibc build, I
> got
>
> CPP='/export/build/gnu/gcc-x32/release/usr/gcc-4.7.0-x32/bin/gcc -mx32
>  -E -x c-header'
> /export/build/gnu/glibc-x32/build-x86_64-linux/elf/ld-linux-x32.so.2
> --library-path 
> /export/build/gnu/glibc-x32/build-x86_64-linux:/export/build/gnu/glibc-x32/build-x86_64-linux/math:/export/build/gnu/glibc-x32/build-x86_64-linux/elf:/export/build/gnu/glibc-x32/build-x86_64-linux/dlfcn:/export/build/gnu/glibc-x32/build-x86_64-linux/nss:/export/build/gnu/glibc-x32/build-x86_64-linux/nis:/export/build/gnu/glibc-x32/build-x86_64-linux/rt:/export/build/gnu/glibc-x32/build-x86_64-linux/resolv:/export/build/gnu/glibc-x32/build-x86_64-linux/crypt:/export/build/gnu/glibc-x32/build-x86_64-linux/nptl
> /export/build/gnu/glibc-x32/build-x86_64-linux/sunrpc/rpcgen -Y
> ../scripts -h rpcsvc/yppasswd.x -o
> /export/build/gnu/glibc-x32/build-x86_64-linux/sunrpc/rpcsvc/yppasswd.T
> make[5]: *** 
> [/export/build/gnu/glibc-x32/build-x86_64-linux/sunrpc/xbootparam_prot.stmp]
> Segmentation fault
> make[5]: *** Waiting for unfinished jobs
> make[5]: *** 
> [/export/build/gnu/glibc-x32/build-x86_64-linux/sunrpc/xrstat.stmp]
> Segmentation fault
> make[5]: *** 
> [/export/build/gnu/glibc-x32/build-x86_64-linux/sunrpc/xyppasswd.stmp]
> Segmentation fault
>
> since thread pointer is 32bit in x32.
>

 If we load thread pointer (fs segment register) in x32 with 64bit
 load, the upper 32bits are garbage.
 We must load 32bit
>>>
>>> So, instead of huge complications with new mode iterator, just
>>> introduce two new patterns that will shadow existing ones for
>>> TARGET_X32.
>>>
>>> Like in attached (untested) patch.
>>>
>>
>> I tried the following patch with typos fixed.  It almost worked,
>> except for this failure in glibc testsuite:
>>
>> gen-locale.sh: line 27: 14755 Aborted                 (core dumped)
>> I18NPATH=. GCONV_PATH=${common_objpfx}iconvdata ${localedef} --quiet
>> -c -f $charmap -i $input ${common_objpfx}localedata/$out
>> Charmap: "ISO-8859-1" Inputfile: "nb_NO" Outputdir: "nb_NO.ISO-8859-1" failed
>> make[4]: *** 
>> [/export/build/gnu/glibc-x32/build-x86_64-linux/localedata/nb_NO.ISO-8859-1/LC_CTYPE]
>> Error 1
>>
>> I will add:
>>
>> diff --git a/gcc/config/i386/i386.c b/gcc/config/i386/i386.c
>> index 8723dc5..d32d64d 100644
>> --- a/gcc/config/i386/i386.c
>> +++ b/gcc/config/i386/i386.c
>> @@ -12120,7 +12120,9 @@ get_thread_pointer (bool to_reg)
>>  {
>>   rtx tp, reg, insn;
>>
>> -  tp = gen_rtx_UNSPEC (Pmode, gen_rtvec (1, const0_rtx), UNSPEC_TP);
>> +  tp = gen_rtx_UNSPEC (ptr_mode, gen_rtvec (1, const0_rtx), UNSPEC_TP);
>> +  if (ptr_mode != Pmode)
>> +    tp = convert_to_mode (Pmode, tp, 1);
>>   if (!to_reg)
>>     return tp;
>>
>> since TP must be 32bit.
>
> No, this won't have the desired effect. It will change the UNSPEC, so
> it won't match patterns in i386.md.
>
> Can you debug the failure a bit more? With my patterns, add{l} and
> mov{l} should clear top 32bits.
>

TP is 32bit in x32  For load_tp_x32, we load SImode value and
zero-extend to DImode. For add_tp_x32, we are adding SImode
value.  We can't pretend TP is 64bit.  load_tp_x32 and add_tp_x32
must take SImode TP.

-- 
H.J.


Re: [PATCH] Disable size optimizations of -gdwarf-2 DW_AT_data_member_location DW_OP_plus_uconst

2011-07-28 Thread Jason Merrill

I'd find the logic easier to read if it were inverted, but OK.

Jason


Re: [PATCH] Fix -gdwarf-3 DW_AT_data_member_location for >= 64KB offsets (PR debug/49871)

2011-07-28 Thread Jason Merrill

OK.

Jason


Re: [Patch,AVR]: PR49687 (better widening 32-bit mul)

2011-07-28 Thread Richard Henderson
On 07/28/2011 07:59 AM, Georg-Johann Lay wrote:
> So it appears that IRA is not as smart as we thought and not
> prepared for this...
> 
> Or did I do something fundamentally wrong?

It sure doesn't look like you've done anything wrong.


r~


Re: [PATCH PR43513, 1/3] Replace vla with array - Implementation.

2011-07-28 Thread Richard Guenther
On Thu, 28 Jul 2011, Tom de Vries wrote:

> On 07/28/2011 12:22 PM, Richard Guenther wrote:
> > On Wed, 27 Jul 2011, Tom de Vries wrote:
> > 
> >> On 07/27/2011 05:27 PM, Richard Guenther wrote:
> >>> On Wed, 27 Jul 2011, Tom de Vries wrote:
> >>>
>  On 07/27/2011 02:12 PM, Richard Guenther wrote:
> > On Wed, 27 Jul 2011, Tom de Vries wrote:
> >
> >> On 07/27/2011 01:50 PM, Tom de Vries wrote:
> >>> Hi Richard,
> >>>
> >>> I have a patch set for bug 43513 - The stack pointer is adjusted 
> >>> twice.
> >>>
> >>> 01_pr43513.3.patch
> >>> 02_pr43513.3.test.patch
> >>> 03_pr43513.3.mudflap.patch
> >>>
> >>> The patch set has been bootstrapped and reg-tested on x86_64.
> >>>
> >>> I will sent out the patches individually.
> >>>
> >>
> >> The patch replaces a vla __builtin_alloca that has a constant argument 
> >> with an
> >> array declaration.
> >>
> >> OK for trunk?
> >
> > I don't think it is safe to try to get at the VLA type the way you do.
> 
>  I don't understand in what way it's not safe. Do you mean I don't manage 
>  to find
>  the type always, or that I find the wrong type, or something else?
> >>>
> >>> I think you might get the wrong type,
> >>
> >> Ok, I'll review that code one more time.
> >>
> >>> you also do not transform code
> >>> like
> >>>
> >>>   int *p = alloca(4);
> >>>   *p = 3;
> >>>
> >>> as there is no array type involved here.
> >>>
> >>
> >> I was trying to stay away from non-vla allocas.  A source declared alloca 
> >> has
> >> function livetime, so we could have a single alloca in a loop, called 10 
> >> times,
> >> with all 10 instances live at the same time. This patch does not detect 
> >> such
> >> cases, and thus stays away from non-vla allocas. A vla decl does not have 
> >> such
> >> problems, the lifetime ends when it goes out of scope.
> > 
> > Yes indeed - that probably would require more detailed analysis.
> > 
> > In fact I would simply do sth like
> >
> >   elem_type = build_nonstandard_integer_type (BITS_PER_UNIT, 1);
> >   n_elem = size * 8 / BITS_PER_UNIT;
> >   array_type = build_array_type_nelts (elem_type, n_elem);
> >   var = create_tmp_var (array_type, NULL);
> >   return fold_convert (TREE_TYPE (lhs), build_fold_addr_expr (var));
> >
> 
>  I tried this code on the example, and it works, but the newly declared 
>  type has
>  an 8-bit alignment, while the vla base type has a 32 bit alignment.  
>  This make
>  the memory access in the example potentially unaligned, which prohibits 
>  an
>  ivopts optimization, so the resulting text size is 68 instead of the 64 
>  achieved
>  with my current patch.
> >>>
> >>> Ok, so then set DECL_ALIGN of the variable to something reasonable
> >>> like MIN (size * 8, GET_MODE_PRECISION (word_mode)).  Basically the
> >>> alignment that the targets alloca function would guarantee.
> >>>
> >>
> >> I tried that, but that doesn't help. It's the alignment of the type that
> >> matters, not of the decl.
> > 
> > It shouldn't.  All accesses are performed with the original types and
> > alignment comes from that (plus the underlying decl).
> > 
> 
> I managed to get it all working by using build_aligned_type rather that 
> DECL_ALIGN.

That's really odd, DECL_ALIGN should just work - nothing refers to the
type of the decl in the IL.  Can you try also setting DECL_USER_ALIGN to 
1 maybe?

> 
> >> So should we try to find the base type of the vla, and use that, or use the
> >> nonstandard char type?
> > 
> > I don't think we can reliably find the base type of the vla - well,
> > in practice we may because we control how we lower VLAs during
> > gimplification, but nothing in the IL constraints say that the
> > resulting pointer type should be special.
> > 
> > Using a char[] decl shouldn't be a problem IMHO.
> > 
> > And obviously you lose the optimization we arrange with inserting
> > __builtin_stack_save/restore pairs that way - stack space will no
> > longer be shared for subsequent VLAs.  Which means that you'd
> > better limit the size you allow this promotion.
> >
> 
>  Right, I could introduce a parameter for this.
> >>>
> >>> I would think you could use PARAM_LARGE_STACK_FRAME for now and say,
> >>> allow a size of PARAM_LARGE_STACK_FRAME / 10?
> >>>
> >>
> >> That unfortunately is too small for the example from bug report. The 
> >> default
> >> value of the param is 250, so that would be a threshold of 25, and the 
> >> alloca
> >> size of the example is 40.  Perhaps we can try a threshold of
> >> PARAM_LARGE_STACK_FRAME - estimated_stack_size or some such?
> > 
> > Hm.  estimated_stack_size is not O(1), so no.  I think we need to
> > find a sensible way of allowing stack sharing.  Eventually Michas
> > patch for introducing points-of-death would help here, if we'd
> > go for folding this during stack-save/resto

Re: Unreviewed libgcc patches

2011-07-28 Thread Rainer Orth
NightStrike  writes:

> On Mon, Jul 18, 2011 at 8:21 AM, Rainer Orth
>  wrote:
>> The following two libgcc patches have seen almost no comments, and
>> certainly neither testing or review in a week:
>>
>>        CFT: [build] Move fp-bit support to toplevel libgcc
>>        http://gcc.gnu.org/ml/gcc-patches/2011-07/msg00927.html
>>
>>        CFT: [build] Move soft-fp support to toplevel libgcc
>>        http://gcc.gnu.org/ml/gcc-patches/2011-07/msg00931.html
>>
>> This patch will need to be updated for the recent addition of the c6x
>> port.
>>
>> Both will probably need build and libgcc maintainers and either a bunch
>> of target maintainers or a global reviewer.  I wonder how to proceed
>> here: I've got a bunch of further libgcc patches in the works or
>> planned, but if I can't get them reviewed, there's no point in
>> continuing that work.
>
> Do you still need support?

No: I've since received review comments for both and am working to
incorporate them and complete the remaining libgcc build-related
patches.

What would help is target maintainers actually reviewing or trying the
patches.  I'll probably wait until I've (re-)posted the full set and
start a new CFT then.

Rainer

-- 
-
Rainer Orth, Center for Biotechnology, Bielefeld University


Re: [PATCH] Fix PR49876: Continue code generation with integer_zero_node on gloog_error

2011-07-28 Thread Sebastian Pop
On Thu, Jul 28, 2011 at 09:49, Richard Guenther  wrote:
> And it's always integers or pointers only?  Otherwise you'd probably
> want build_zero_cst (type) instead.

Ok, I started regstrapping again with build_zero_cst.

Thanks,
Sebastian


Re: PATCH: PR target/47715: [x32] Use SImode for thread pointer

2011-07-28 Thread Uros Bizjak
On Thu, Jul 28, 2011 at 4:47 PM, H.J. Lu  wrote:

>> In x32, thread pointer is 32bit and choice of segment register for the
>> thread base ptr load should be based on TARGET_64BIT.  This patch
>> implements it.  OK for trunk?
>
> -ENOTESTCASE.
>

 There is no standalone testcase.  The symptom is in glibc build, I
 got

 CPP='/export/build/gnu/gcc-x32/release/usr/gcc-4.7.0-x32/bin/gcc -mx32
  -E -x c-header'
 /export/build/gnu/glibc-x32/build-x86_64-linux/elf/ld-linux-x32.so.2
 --library-path 
 /export/build/gnu/glibc-x32/build-x86_64-linux:/export/build/gnu/glibc-x32/build-x86_64-linux/math:/export/build/gnu/glibc-x32/build-x86_64-linux/elf:/export/build/gnu/glibc-x32/build-x86_64-linux/dlfcn:/export/build/gnu/glibc-x32/build-x86_64-linux/nss:/export/build/gnu/glibc-x32/build-x86_64-linux/nis:/export/build/gnu/glibc-x32/build-x86_64-linux/rt:/export/build/gnu/glibc-x32/build-x86_64-linux/resolv:/export/build/gnu/glibc-x32/build-x86_64-linux/crypt:/export/build/gnu/glibc-x32/build-x86_64-linux/nptl
 /export/build/gnu/glibc-x32/build-x86_64-linux/sunrpc/rpcgen -Y
 ../scripts -h rpcsvc/yppasswd.x -o
 /export/build/gnu/glibc-x32/build-x86_64-linux/sunrpc/rpcsvc/yppasswd.T
 make[5]: *** 
 [/export/build/gnu/glibc-x32/build-x86_64-linux/sunrpc/xbootparam_prot.stmp]
 Segmentation fault
 make[5]: *** Waiting for unfinished jobs
 make[5]: *** 
 [/export/build/gnu/glibc-x32/build-x86_64-linux/sunrpc/xrstat.stmp]
 Segmentation fault
 make[5]: *** 
 [/export/build/gnu/glibc-x32/build-x86_64-linux/sunrpc/xyppasswd.stmp]
 Segmentation fault

 since thread pointer is 32bit in x32.

>>>
>>> If we load thread pointer (fs segment register) in x32 with 64bit
>>> load, the upper 32bits are garbage.
>>> We must load 32bit
>>
>> So, instead of huge complications with new mode iterator, just
>> introduce two new patterns that will shadow existing ones for
>> TARGET_X32.
>>
>> Like in attached (untested) patch.
>>
>
> I tried the following patch with typos fixed.  It almost worked,
> except for this failure in glibc testsuite:
>
> gen-locale.sh: line 27: 14755 Aborted                 (core dumped)
> I18NPATH=. GCONV_PATH=${common_objpfx}iconvdata ${localedef} --quiet
> -c -f $charmap -i $input ${common_objpfx}localedata/$out
> Charmap: "ISO-8859-1" Inputfile: "nb_NO" Outputdir: "nb_NO.ISO-8859-1" failed
> make[4]: *** 
> [/export/build/gnu/glibc-x32/build-x86_64-linux/localedata/nb_NO.ISO-8859-1/LC_CTYPE]
> Error 1
>
> I will add:
>
> diff --git a/gcc/config/i386/i386.c b/gcc/config/i386/i386.c
> index 8723dc5..d32d64d 100644
> --- a/gcc/config/i386/i386.c
> +++ b/gcc/config/i386/i386.c
> @@ -12120,7 +12120,9 @@ get_thread_pointer (bool to_reg)
>  {
>   rtx tp, reg, insn;
>
> -  tp = gen_rtx_UNSPEC (Pmode, gen_rtvec (1, const0_rtx), UNSPEC_TP);
> +  tp = gen_rtx_UNSPEC (ptr_mode, gen_rtvec (1, const0_rtx), UNSPEC_TP);
> +  if (ptr_mode != Pmode)
> +    tp = convert_to_mode (Pmode, tp, 1);
>   if (!to_reg)
>     return tp;
>
> since TP must be 32bit.

No, this won't have the desired effect. It will change the UNSPEC, so
it won't match patterns in i386.md.

Can you debug the failure a bit more? With my patterns, add{l} and
mov{l} should clear top 32bits.

I'd also argue that there is something wrong with glibc. It should
initialize %fs with a zero-extended value, so original 64bit
load_tp/add_tp patterns could be used.

Uros.


Re: PATCH: PR target/47715: [x32] Use SImode for thread pointer

2011-07-28 Thread H.J. Lu
On Thu, Jul 28, 2011 at 6:40 AM, Uros Bizjak  wrote:
> On Thu, Jul 28, 2011 at 3:24 PM, H.J. Lu  wrote:
>
> In x32, thread pointer is 32bit and choice of segment register for the
> thread base ptr load should be based on TARGET_64BIT.  This patch
> implements it.  OK for trunk?

 -ENOTESTCASE.

>>>
>>> There is no standalone testcase.  The symptom is in glibc build, I
>>> got
>>>
>>> CPP='/export/build/gnu/gcc-x32/release/usr/gcc-4.7.0-x32/bin/gcc -mx32
>>>  -E -x c-header'
>>> /export/build/gnu/glibc-x32/build-x86_64-linux/elf/ld-linux-x32.so.2
>>> --library-path 
>>> /export/build/gnu/glibc-x32/build-x86_64-linux:/export/build/gnu/glibc-x32/build-x86_64-linux/math:/export/build/gnu/glibc-x32/build-x86_64-linux/elf:/export/build/gnu/glibc-x32/build-x86_64-linux/dlfcn:/export/build/gnu/glibc-x32/build-x86_64-linux/nss:/export/build/gnu/glibc-x32/build-x86_64-linux/nis:/export/build/gnu/glibc-x32/build-x86_64-linux/rt:/export/build/gnu/glibc-x32/build-x86_64-linux/resolv:/export/build/gnu/glibc-x32/build-x86_64-linux/crypt:/export/build/gnu/glibc-x32/build-x86_64-linux/nptl
>>> /export/build/gnu/glibc-x32/build-x86_64-linux/sunrpc/rpcgen -Y
>>> ../scripts -h rpcsvc/yppasswd.x -o
>>> /export/build/gnu/glibc-x32/build-x86_64-linux/sunrpc/rpcsvc/yppasswd.T
>>> make[5]: *** 
>>> [/export/build/gnu/glibc-x32/build-x86_64-linux/sunrpc/xbootparam_prot.stmp]
>>> Segmentation fault
>>> make[5]: *** Waiting for unfinished jobs
>>> make[5]: *** 
>>> [/export/build/gnu/glibc-x32/build-x86_64-linux/sunrpc/xrstat.stmp]
>>> Segmentation fault
>>> make[5]: *** 
>>> [/export/build/gnu/glibc-x32/build-x86_64-linux/sunrpc/xyppasswd.stmp]
>>> Segmentation fault
>>>
>>> since thread pointer is 32bit in x32.
>>>
>>
>> If we load thread pointer (fs segment register) in x32 with 64bit
>> load, the upper 32bits are garbage.
>> We must load 32bit
>
> So, instead of huge complications with new mode iterator, just
> introduce two new patterns that will shadow existing ones for
> TARGET_X32.
>
> Like in attached (untested) patch.
>

We can't just shadow them. They have to be disabled for x32.
I am testing this.


-- 
H.J.
---
diff --git a/gcc/config/i386/i386.md b/gcc/config/i386/i386.md
index aaaf53a..9191b98 100644
--- a/gcc/config/i386/i386.md
+++ b/gcc/config/i386/i386.md
@@ -12452,10 +12452,21 @@
 (define_mode_attr tp_seg [(SI "gs") (DI "fs")])

 ;; Load and add the thread base pointer from %:0.
+(define_insn "*load_tp_x32"
+  [(set (match_operand:DI 0 "register_operand" "=r")
+   (unspec:DI [(const_int 0)] UNSPEC_TP))]
+  "TARGET_X32"
+  "mov{l}\t{%%fs:0, %k0|%k0, DWORD PTR fs:0}"
+  [(set_attr "type" "imov")
+   (set_attr "modrm" "0")
+   (set_attr "length" "7")
+   (set_attr "memory" "load")
+   (set_attr "imm_disp" "false")])
+
 (define_insn "*load_tp_"
   [(set (match_operand:P 0 "register_operand" "=r")
(unspec:P [(const_int 0)] UNSPEC_TP))]
-  ""
+  "!TARGET_X32"
   "mov{}\t{%%:0, %0|%0,  PTR :0}"
   [(set_attr "type" "imov")
(set_attr "modrm" "0")
@@ -12463,12 +12474,25 @@
(set_attr "memory" "load")
(set_attr "imm_disp" "false")])

+(define_insn "*add_tp_x32"
+  [(set (match_operand:DI 0 "register_operand" "=r")
+   (plus:DI (unspec:DI [(const_int 0)] UNSPEC_TP)
+(match_operand:DI 1 "register_operand" "0")))
+   (clobber (reg:CC FLAGS_REG))]
+  "TARGET_X32"
+  "add{l}\t{%%fs:0, %k0|%k0, DWORD PTR fs:0}"
+  [(set_attr "type" "alu")
+   (set_attr "modrm" "0")
+   (set_attr "length" "7")
+   (set_attr "memory" "load")
+   (set_attr "imm_disp" "false")])
+
 (define_insn "*add_tp_"
   [(set (match_operand:P 0 "register_operand" "=r")
(plus:P (unspec:P [(const_int 0)] UNSPEC_TP)
(match_operand:P 1 "register_operand" "0")))
(clobber (reg:CC FLAGS_REG))]
-  ""
+  "!TARGET_X32"
   "add{}\t{%%:0, %0|%0,  PTR :0}"
   [(set_attr "type" "alu")
(set_attr "modrm" "0")


[PATCH, ARM] Fix broken testcase, vfp-1.c, for Thumb

2011-07-28 Thread Ian Bolton
This patch makes the vfp-1.c testcase work for Thumb.  It became broken when
we
restricted the negative offsets allowed for Thumb to fix up a Spec2K failure
some months back.  (It was previously possible to generate illegal offsets.)

OK for trunk?


Cheers,
Ian


2011-07-28  Ian Bolton  

testsuite/
* gcc.target/arm/vfp-1.c: large negative offsets not possible on
Thumb2.



Index: gcc/testsuite/gcc.target/arm/vfp-1.c
===
--- gcc/testsuite/gcc.target/arm/vfp-1.c(revision 176838)
+++ gcc/testsuite/gcc.target/arm/vfp-1.c(working copy)
@@ -127,13 +127,13 @@ void test_convert () {
 
 void test_ldst (float f[], double d[]) {
   /* { dg-final { scan-assembler "flds.+ \\\[r0, #1020\\\]" } } */
-  /* { dg-final { scan-assembler "flds.+ \\\[r0, #-1020\\\]" } } */
+  /* { dg-final { scan-assembler "flds.+ \\\[r\[0-9\], #-1020\\\]" { target
{ arm32 && { ! arm_thumb2_ok } } } } } */
   /* { dg-final { scan-assembler "add.+ r0, #1024" } } */
-  /* { dg-final { scan-assembler "fsts.+ \\\[r0, #0\\\]\n" } } */
+  /* { dg-final { scan-assembler "fsts.+ \\\[r\[0-9\], #0\\\]\n" } } */
   f[256] = f[255] + f[-255];
 
   /* { dg-final { scan-assembler "fldd.+ \\\[r1, #1016\\\]" } } */
-  /* { dg-final { scan-assembler "fldd.+ \\\[r1, #-1016\\\]" } } */
+  /* { dg-final { scan-assembler "fldd.+ \\\[r\[1-9\], #-1016\\\]" { target
{ arm32 && { ! arm_thumb2_ok } } } } } */
   /* { dg-final { scan-assembler "fstd.+ \\\[r1, #256\\\]" } } */
   d[32] = d[127] + d[-127];
 }





  1   2   >