Re: empty range in pop_heap

2011-12-04 Thread Markus Trippelsdorf
On 2011.12.03 at 15:35 +, Jonathan Wakely wrote:
> On 12 November 2011 15:14, Jonathan Wakely wrote:
> > On 12 November 2011 15:04, Marc Glisse wrote:
> >>
> >> Debug-mode seems to check that first,last is a valid range, is a heap, but
> >> not that it is not empty. Maybe it could?
> >
> > Good idea, thanks.  I'll change that.
> 
> As promised.
> 
> * include/debug/macros.h (__glibcxx_check_non_empty_range): Define.
> * include/debug/debug.h (__glibcxx_requires_non_empty_range): Define.
> * include/debug/formatter.h (__msg_non_empty_range): Add.
> * src/debug.cc: Message text for __msg_non_empty_range.
> * include/bits/stl_heap.h (pop_heap): Check for non-empty range.
> * testsuite/25_algorithms/pop_heap/empty_neg.cc: New.
> 
> Tested x86_64-linux, committed to trunk.

Thanks Jonathan.

You forgot to change the second one with the comparison functor.
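
For anyone following along, here is a minimal sketch of the precondition that both overloads share. The helper name checked_pop_heap is hypothetical and not part of libstdc++; it only shows that an empty range has to be rejected before std::pop_heap is called, with or without a comparison functor.

```cpp
#include <algorithm>
#include <cassert>
#include <functional>
#include <vector>

// Hypothetical helper (not part of libstdc++): pops the top element only
// when the range is non-empty, which is the precondition the debug-mode
// check is meant to enforce for both std::pop_heap overloads.
template <typename T, typename Compare = std::less<T>>
bool checked_pop_heap(std::vector<T>& heap, Compare comp = Compare())
{
  if (heap.empty())
    return false;  // std::pop_heap on an empty range would be undefined
  assert(std::is_heap(heap.begin(), heap.end(), comp));
  std::pop_heap(heap.begin(), heap.end(), comp);  // moves the top to the back
  heap.pop_back();                                // then drop it
  return true;
}
```

Calling checked_pop_heap(v) uses std::less, while checked_pop_heap(v, std::greater<int>()) exercises the predicate overload that the patch below is about.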

diff --git a/libstdc++-v3/include/bits/stl_heap.h b/libstdc++-v3/include/bits/stl_heap.h
index ed7750c..af62525 100644
--- a/libstdc++-v3/include/bits/stl_heap.h
+++ b/libstdc++-v3/include/bits/stl_heap.h
@@ -360,6 +360,7 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
   // concept requirements
   __glibcxx_function_requires(_Mutable_RandomAccessIteratorConcept<
_RandomAccessIterator>)
+  __glibcxx_requires_non_empty_range(__first, __last);
   __glibcxx_requires_valid_range(__first, __last);
   __glibcxx_requires_heap_pred(__first, __last, __comp);
 

-- 
Markus


Re: Out-of-order update of new_spill_reg_store[]

2011-12-04 Thread Richard Sandiford
Back to this...

Bernd Schmidt  writes:
>> gcc/
>>  * reload1.c (reload_regs_reach_end_p): Replace with...
>>  (reload_reg_rtx_reaches_end_p): ...this function.
>>  (new_spill_reg_store): Update commentary.
>>  (emit_input_reload_insns): Don't clear new_spill_reg_store here.
>>  (emit_output_reload_insns): Check reload_reg_rtx_reaches_end_p
>>  before setting new_spill_reg_store.
>>  (emit_reload_insns): Use a separate loop to clear new_spill_reg_store.
>>  Use reload_reg_rtx_reaches_end_p instead of reload_regs_reach_end_p.
>>  Also use reload_reg_rtx_reaches_end_p when recording inheritance
>>  information for non-spill reload registers.
>
> Just an update to say that based on our discussion I think the general
> approach is OK, but I'm still trying to figure out what exactly this
> piece of code is doing, and whether the changes to it make sense:
>
>> @@ -8329,30 +8329,33 @@ emit_reload_insns (struct insn_chain *ch
>>   the storing insn so that we may delete this insn with
>>   delete_output_reload.  */
>>src_reg = reload_reg_rtx_for_output[r];
>> -
>> -  /* If this is an optional reload, try to find the source reg
>> - from an input reload.  */
>> -  if (! src_reg)
>> +  if (src_reg
>> +  && reload_reg_rtx_reaches_end_p (src_reg, r))
>> +store_insn = new_spill_reg_store[REGNO (src_reg)];
>> +  else
>>  {
>> +  /* If this is an optional reload, try to find the
>> + source reg from an input reload.  */
>>rtx set = single_set (insn);
>>if (set && SET_DEST (set) == rld[r].out)
>>  {
>>int k;
>> +  rtx cand;
>>  
>>src_reg = SET_SRC (set);
>>store_insn = insn;
>>for (k = 0; k < n_reloads; k++)
>> -{
>> -  if (rld[k].in == src_reg)
>> -{
>> -  src_reg = reload_reg_rtx_for_input[k];
>> -  break;
>> -}
>> -}
>> +if (rld[k].in == src_reg)
>> +  {
>> +cand = reload_reg_rtx_for_input[k];
>> +if (reload_reg_rtx_reaches_end_p (cand, k))
>> +  {
>> +src_reg = cand;
>> +break;
>> +  }
>> +  }
>>  }
>>  }
>> -  else
>> -store_insn = new_spill_reg_store[REGNO (src_reg)];
>>if (src_reg && REG_P (src_reg)
>>&& REGNO (src_reg) < FIRST_PSEUDO_REGISTER)
>>  {

Yeah, I was in two minds what to do here.  AIUI, the code:

      if (!HARD_REGISTER_NUM_P (out_regno))
        {
          rtx src_reg, store_insn = NULL_RTX;

          reg_last_reload_reg[out_regno] = 0;

          /* If we can find a hard register that is stored, record
             the storing insn so that we may delete this insn with
             delete_output_reload.  */
          src_reg = reload_reg_rtx_for_output[r];

          /* If this is an optional reload, try to find the source reg
             from an input reload.  */
          if (! src_reg)
            {
              rtx set = single_set (insn);
              if (set && SET_DEST (set) == rld[r].out)
                {
                  int k;

[A]               src_reg = SET_SRC (set);
                  store_insn = insn;
                  for (k = 0; k < n_reloads; k++)
                    {
                      if (rld[k].in == src_reg)
                        {
[B]                       src_reg = reload_reg_rtx_for_input[k];
                          break;
                        }
                    }
                }
            }
          else
[C]         store_insn = new_spill_reg_store[REGNO (src_reg)];
          if (src_reg && REG_P (src_reg)
              && REGNO (src_reg) < FIRST_PSEUDO_REGISTER)
            {
              ...record inheritance for out <- src_reg...

is coping with three cases:

 [C] we have an output reload for a non-spill register.  E.g. the
 pre-reload instruction might be:

(set (reg P1) ( (reg H1) ...))
 REG_DEAD: H1

 (H = hard register, P = pseudo register), where the P1 and H1
 operands have matching constraints.  In this case we might use
 H1 as the reload register for (reg P1).

 This is the case that really does need reload_reg_rtx_reaches_end_p.
 E.g. the SET above could be in parallel with another SET that needs
 a secondary output reload.  It's possible in principle for H1 to be
 used as the s

Ping: Shrink-wrapping vs. EXIT_IGNORE_STACK

2011-12-04 Thread Richard Sandiford
Ping for:

http://gcc.gnu.org/ml/gcc-patches/2011-11/msg02449.html

which fixes a wrong-code bug in reorg.c for MIPS.

Richard


Fix ICE in call to out-of-line __sync_lock_test_and_set

2011-12-04 Thread Richard Sandiford
I think I must have messed up my previous testing of MIPS16 __sync stuff.
This patch fixes one bit of fallout, as seen in ia64-sync-1.c.  If the
result of sync_lock_test_and_set is unused, the optab helper routine
maybe_emit_sync_lock_test_and_set gets called with a "target" of const0_rtx.
If the operation expands to a library call, maybe_emit_sync_lock_test_and_set
in turn passes const0_rtx down as the required target of the call.
We then get an ICE when moving the return register into const0_rtx.

The same problem applies to the atomic_ version.

Everywhere else in optabs.c just passes a null call target, so it seemed
simplest to do the same here.
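
As a rough illustration of the source-level situation, here is a sketch in which the result of the builtin is discarded. It assumes a GCC-compatible compiler that provides the __sync builtins; the variable and function names are made up for the example, and whether the call becomes a library call depends on the target.

```cpp
#include <cassert>

// Illustrative lock word; the name is hypothetical.
int lock_word = 0;

// The old value returned by the builtin is deliberately ignored here,
// which is the "result unused" case described above.
void acquire_ignoring_old_value()
{
  __sync_lock_test_and_set(&lock_word, 1);
}

// Same operation, but the old value is used by the caller.
int acquire_and_report_old_value()
{
  return __sync_lock_test_and_set(&lock_word, 1);
}

void release()
{
  __sync_lock_release(&lock_word);
}

// Single-threaded sanity check for this sketch only.
void exercise_lock()
{
  release();
  acquire_ignoring_old_value();
  assert(lock_word == 1);
  release();
  assert(lock_word == 0);
}
```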

Tested on mips64-linux-gnu.  OK to install?

Richard


gcc/
* optabs.c (maybe_emit_sync_lock_test_and_set): Pass a null target
to emit_library_call_value.
(expand_atomic_compare_and_swap): Likewise.

Index: gcc/optabs.c
===
--- gcc/optabs.c 2011-12-03 16:19:48.0 +
+++ gcc/optabs.c 2011-12-04 08:52:12.0 +
@@ -7400,7 +7400,7 @@ maybe_emit_sync_lock_test_and_set (rtx t
  rtx addr;
 
  addr = convert_memory_address (ptr_mode, XEXP (mem, 0));
- return emit_library_call_value (libfunc, target, LCT_NORMAL,
+ return emit_library_call_value (libfunc, NULL_RTX, LCT_NORMAL,
  mode, 2, addr, ptr_mode,
  val, mode);
}
@@ -7637,7 +7637,7 @@ expand_atomic_compare_and_swap (rtx *pta
   if (libfunc != NULL)
 {
   rtx addr = convert_memory_address (ptr_mode, XEXP (mem, 0));
-  target_oval = emit_library_call_value (libfunc, target_oval, LCT_NORMAL,
+  target_oval = emit_library_call_value (libfunc, NULL_RTX, LCT_NORMAL,
 mode, 3, addr, ptr_mode,
 expected, mode, desired, mode);
 


[committed] PR51351: misnamed ior sync routines

2011-12-04 Thread Richard Sandiford
[Sorry Kaz, there might be some overlapping work here.  I wrote the
 patch before realising there was a PR about it.]

This patch fixes link failures in various atomic tests on MIPS16.
The internal optab names use "ior" to be consistent with the GCC
rtl convention, but the external names use "or" instead.
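
To make the naming concrete, here is a small sketch using the user-visible builtins, which are spelled with "or". It assumes a GCC-compatible compiler; the wrapper function names are hypothetical.

```cpp
#include <cassert>

// The builtin names use "or", not "ior".
unsigned fetch_then_or(unsigned *word, unsigned bits)
{
  return __sync_fetch_and_or(word, bits);  // returns the old value
}

unsigned or_then_fetch(unsigned *word, unsigned bits)
{
  return __sync_or_and_fetch(word, bits);  // returns the new value
}

// Small usage check for the two wrappers above.
void check_or_builtins()
{
  unsigned w = 0x5u;
  assert(fetch_then_or(&w, 0x2u) == 0x5u);
  assert(w == 0x7u);
  assert(or_then_fetch(&w, 0x8u) == 0xFu);
}
```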

Tested on mips64-linux-gnu.  Applied as obvious.

Richard


gcc/
PR middle-end/51351
* optabs.c (init_sync_libfuncs): Use "or" rather than "ior"
in the external names.

Index: gcc/optabs.c
===
--- gcc/optabs.c 2011-12-03 16:24:44.0 +
+++ gcc/optabs.c 2011-12-03 16:25:02.0 +
@@ -6624,14 +6624,14 @@ init_sync_libfuncs (int max)
 
   init_sync_libfuncs_1 (sync_old_add_optab, "__sync_fetch_and_add", max);
   init_sync_libfuncs_1 (sync_old_sub_optab, "__sync_fetch_and_sub", max);
-  init_sync_libfuncs_1 (sync_old_ior_optab, "__sync_fetch_and_ior", max);
+  init_sync_libfuncs_1 (sync_old_ior_optab, "__sync_fetch_and_or", max);
   init_sync_libfuncs_1 (sync_old_and_optab, "__sync_fetch_and_and", max);
   init_sync_libfuncs_1 (sync_old_xor_optab, "__sync_fetch_and_xor", max);
   init_sync_libfuncs_1 (sync_old_nand_optab, "__sync_fetch_and_nand", max);
 
   init_sync_libfuncs_1 (sync_new_add_optab, "__sync_add_and_fetch", max);
   init_sync_libfuncs_1 (sync_new_sub_optab, "__sync_sub_and_fetch", max);
-  init_sync_libfuncs_1 (sync_new_ior_optab, "__sync_ior_and_fetch", max);
+  init_sync_libfuncs_1 (sync_new_ior_optab, "__sync_or_and_fetch", max);
   init_sync_libfuncs_1 (sync_new_and_optab, "__sync_and_and_fetch", max);
   init_sync_libfuncs_1 (sync_new_xor_optab, "__sync_xor_and_fetch", max);
   init_sync_libfuncs_1 (sync_new_nand_optab, "__sync_nand_and_fetch", max);


[testsuite] Adding missing dg-require-profiling directives

2011-12-04 Thread Richard Sandiford
Several profiling tests fail for MIPS16.  The problem is that MIPS has
native TLS support, but the ABI has not "yet" been extended to MIPS16.
MIPS16 is supposed to be link-compatible with non-MIPS16, so we can't
use emultls, and must simply say sorry().

This patch adds dg-require-profiling to the affected tests.  The reason
I haven't just applied it as obvious is that dg-require-profiling really
seems to be a test for link-time and runtime support.  There are presumably
targets that can't link profiling code but that are nevertheless happily
compiling the tests below.  So do we want to split the directive into two?
I ask the question while hoping the answer is "no". :-)

Tested on mips64-linux-gnu.  The tests still run for normal MIPS,
but are skipped for MIPS16.

Richard


gcc/testsuite/
* g++.dg/debug/pr46338.C: Add dg-require-profiling.
* g++.dg/torture/pr39732.C: Likewise.
* g++.dg/torture/pr40642.C: Likewise.
* gcc.c-torture/compile/pr44686.c: Likewise.
* gcc.dg/20050309-1.c: Likewise.
* gcc.dg/20050330-2.c: Likewise.
* gcc.dg/20051201-1.c: Likewise.
* gcc.dg/gomp/pr27573.c: Likewise.
* gcc.dg/pr46255.c: Likewise.
* gcc.dg/profile-dir-1.c: Likewise.
* gcc.dg/profile-dir-2.c: Likewise.
* gcc.dg/profile-dir-3.c: Likewise.
* gcc.dg/profile-generate-1.c: Likewise.
* gfortran.dg/gomp/pr27573.f90: Likewise.
* gcc.dg/profile-generate-3.c: Be specific about the type of
profiling required.

Index: gcc/testsuite/g++.dg/debug/pr46338.C
===
--- gcc/testsuite/g++.dg/debug/pr46338.C 2011-12-04 08:52:27.0 +
+++ gcc/testsuite/g++.dg/debug/pr46338.C 2011-12-04 11:24:50.0 +
@@ -1,5 +1,6 @@
 // PR debug/46338
 // { dg-do compile }
+// { dg-require-profiling "-fprofile-generate" }
 // { dg-options "-O -fprofile-generate -fcompare-debug" }
 
 void bar ();
Index: gcc/testsuite/g++.dg/torture/pr39732.C
===
--- gcc/testsuite/g++.dg/torture/pr39732.C  2011-12-04 08:52:27.0 +
+++ gcc/testsuite/g++.dg/torture/pr39732.C  2011-12-04 11:24:50.0 +
@@ -1,4 +1,5 @@
 /* { dg-do compile } */
+/* { dg-require-profiling "-fprofile-generate" } */
 /* { dg-options "-fprofile-generate" } */
 
 template struct char_traits;
Index: gcc/testsuite/g++.dg/torture/pr40642.C
===
--- gcc/testsuite/g++.dg/torture/pr40642.C  2011-12-04 08:52:27.0 +
+++ gcc/testsuite/g++.dg/torture/pr40642.C  2011-12-04 11:24:50.0 +
@@ -1,4 +1,5 @@
 // { dg-do compile }
+/* { dg-require-profiling "-fprofile-generate" } */
 // { dg-options "-fprofile-generate" }
 
 // GCC used to ICE with some EH edge missing.
Index: gcc/testsuite/gcc.c-torture/compile/pr44686.c
===
--- gcc/testsuite/gcc.c-torture/compile/pr44686.c   2011-12-04 08:52:27.0 +
+++ gcc/testsuite/gcc.c-torture/compile/pr44686.c   2011-12-04 11:24:50.0 +
@@ -1,3 +1,4 @@
+/* { dg-require-profiling "-fprofile-generate" } */
 /* { dg-options "-fipa-pta -fprofile-generate" } */
 void *
 memcpy (void *a, const void *b, __SIZE_TYPE__ len)
Index: gcc/testsuite/gcc.dg/20050309-1.c
===
--- gcc/testsuite/gcc.dg/20050309-1.c   2011-12-04 08:52:27.0 +
+++ gcc/testsuite/gcc.dg/20050309-1.c   2011-12-04 11:24:50.0 +
@@ -2,6 +2,7 @@
output reloads.  */
 
 /* { dg-do compile } */
+/* { dg-require-profiling "-fprofile-generate" } */
 /* { dg-options "-O2 -fprofile-generate" } */
 
 char *
Index: gcc/testsuite/gcc.dg/20050330-2.c
===
--- gcc/testsuite/gcc.dg/20050330-2.c   2011-12-04 08:52:27.0 +
+++ gcc/testsuite/gcc.dg/20050330-2.c   2011-12-04 11:24:50.0 +
@@ -1,4 +1,5 @@
 /* { dg-do compile } */
+/* { dg-require-profiling "-fprofile-generate" } */
 /* { dg-options "-O2 -fprofile-generate" } */
 
 struct S
Index: gcc/testsuite/gcc.dg/20051201-1.c
===
--- gcc/testsuite/gcc.dg/20051201-1.c   2011-12-04 08:52:27.0 +
+++ gcc/testsuite/gcc.dg/20051201-1.c   2011-12-04 11:24:50.0 +
@@ -2,6 +2,7 @@
tree_flow_call_edges_add.  */
 
 /* { dg-do compile } */
+/* { dg-require-profiling "-fprofile-generate" } */
 /* { dg-options "-O1 -fprofile-generate -Wno-attributes" } */
 
 static __attribute__ ((always_inline)) void 
Index: gcc/testsuite/gcc.dg/gomp/pr27573.c
===
--- gcc/testsuite/gcc.dg/gomp/pr27573.c 2011-12-04 08:52:27.0 +
+++ gcc/testsuite/gcc.dg/gomp/pr27573.c 2011-

Re: [Patch] Increase array sizes in vect-tests to enable 256-bit vectorization

2011-12-04 Thread Richard Guenther
On Sat, Dec 3, 2011 at 5:54 PM, Michael Zolotukhin
 wrote:
>> I mean, that, when 256-bit vectorization is enabled we still use 128bit
>> vectorization if the arrays are too short for 256bit vectorization.  You'll
>> lose this test coverage when you change the array sizes.
> That's true, but do we need all these tests both with short and long
> arrays? We could have the tests with increased sizes and compile them
> with and without AVX, thus testing both 256- and 128-bit
> vectorization. Additionally, we might want to add several tests with
> short arrays to check what happens if 256-bit is available but the arrays
> are too short for it. I mean we don't need to duplicate all of the
> tests to check this situation.

Well, initially those tests served as a way to prove that dual-size
vectorization works.  You should not remove this testing functionality.
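
A deliberately simplified model of the coverage concern, not the vectorizer's actual cost logic: pick the widest vector size whose lanes the array can fill, and fall back to the narrower one otherwise. The function name and the two hard-coded widths are assumptions made for the sketch.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical model: return the widest vector size in bits that an array
// of n_elems elements (each elem_bits wide) can fill at least once.
// Returns 0 when even a 128-bit vector does not fit (scalar code).
unsigned pick_vector_bits(std::size_t n_elems, unsigned elem_bits, bool have_256)
{
  const unsigned widths[] = {256u, 128u};
  for (unsigned w : widths)
    {
      if (w == 256u && !have_256)
        continue;                      // wider vectors not enabled
      if (n_elems * elem_bits >= w)
        return w;
    }
  return 0;
}
```

In this model a 4-element int array stays on the 128-bit path even when 256-bit vectors are enabled, which is the case that goes untested once every array is enlarged.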

Richard.

> On 3 December 2011 18:31, Richard Guenther  wrote:
>> On Fri, Dec 2, 2011 at 6:39 PM, Michael Zolotukhin
>>  wrote:

>>>> Shouldn't we add a variant for each testcase so that we still
>>>> exercise both 128-bit and 256-bit vectorization paths?
>>>
>>> These tests are still good to test 128-bit vectorization, the changes
>>> were made just to make sure that 256-bit vectorization is possible on
>>> the tests.
>>>
>>> Actually, it's just the first step in enabling these tests for 256 bits -
>>> for now many of them are failing if '-mavx' or '-mavx2' is specified
>>> (mostly due to different diagnostic messages produced by the vectorizer),
>>> but with the original (small) array sizes we couldn't even check that.
>>> When they are enabled, it'll be possible to use them for testing both
>>> 128- and 256-bit vectorization.
>>
>> I mean, that, when 256-bit vectorization is enabled we still use 128bit
>> vectorization if the arrays are too short for 256bit vectorization.  You'll
>> lose this test coverage when you change the array sizes.
>>
>> Richard.
>>
>>> Michael
>>>
>>>
>>> 2011/12/2 Richard Guenther :
 2011/12/2 Michael Zolotukhin :
> Hi,
>
> This patch increases array sizes in tests from vect.exp suite, thus
> enabling 256-bit vectorization where it's available.
>
> Ok for trunk?

 Shouldn't we add a variant for each testcase so that we still
 exercise both 128-bit and 256-bit vectorization paths?

> Changelog:
> 2011-12-02  Michael Zolotukhin  
>
>        * gcc.dg/vect/slp-13.c: Increase array size, add initialization.
>        * gcc.dg/vect/slp-24.c: Ditto.
>        * gcc.dg/vect/slp-3.c: Likewise and fix scans.
>        * gcc.dg/vect/slp-34.c: Ditto.
>        * gcc.dg/vect/slp-4.c: Ditto.
>        * gcc.dg/vect/slp-cond-2.c: Ditto.
>        * gcc.dg/vect/slp-multitypes-11.c: Ditto.
>        * gcc.dg/vect/vect-1.c: Ditto.
>        * gcc.dg/vect/vect-10.c: Ditto.
>        * gcc.dg/vect/vect-105.c: Ditto.
>        * gcc.dg/vect/vect-112.c: Ditto.
>        * gcc.dg/vect/vect-15.c: Ditto.
>        * gcc.dg/vect/vect-2.c: Ditto.
>        * gcc.dg/vect/vect-31.c: Ditto.
>        * gcc.dg/vect/vect-32.c: Ditto.
>        * gcc.dg/vect/vect-33.c: Ditto.
>        * gcc.dg/vect/vect-34.c: Ditto.
>        * gcc.dg/vect/vect-35.c: Ditto.
>        * gcc.dg/vect/vect-36.c: Ditto.
>        * gcc.dg/vect/vect-6.c: Ditto.
>        * gcc.dg/vect/vect-73.c: Ditto.
>        * gcc.dg/vect/vect-74.c: Ditto.
>        * gcc.dg/vect/vect-75.c: Ditto.
>        * gcc.dg/vect/vect-76.c: Ditto.
>        * gcc.dg/vect/vect-80.c: Ditto.
>        * gcc.dg/vect/vect-85.c: Ditto.
>        * gcc.dg/vect/vect-89.c: Ditto.
>        * gcc.dg/vect/vect-97.c: Ditto.
>        * gcc.dg/vect/vect-98.c: Ditto.
>        * gcc.dg/vect/vect-all.c: Ditto.
>        * gcc.dg/vect/vect-double-reduc-6.c: Ditto.
>        * gcc.dg/vect/vect-iv-8.c: Ditto.
>        * gcc.dg/vect/vect-iv-8a.c: Ditto.
>        * gcc.dg/vect/vect-outer-1.c: Ditto.
>        * gcc.dg/vect/vect-outer-1a.c: Ditto.
>        * gcc.dg/vect/vect-outer-1b.c: Ditto.
>        * gcc.dg/vect/vect-outer-2.c: Ditto.
>        * gcc.dg/vect/vect-outer-2a.c: Ditto.
>        * gcc.dg/vect/vect-outer-2c.c: Ditto.
>        * gcc.dg/vect/vect-outer-3.c: Ditto.
>        * gcc.dg/vect/vect-outer-3a.c: Ditto.
>        * gcc.dg/vect/vect-outer-4a.c: Ditto.
>        * gcc.dg/vect/vect-outer-4b.c: Ditto.
>        * gcc.dg/vect/vect-outer-4c.c: Ditto.
>        * gcc.dg/vect/vect-outer-4d.c: Ditto.
>        * gcc.dg/vect/vect-outer-4m.c: Ditto.
>        * gcc.dg/vect/vect-outer-fir-lb.c: Ditto.
>        * gcc.dg/vect/vect-outer-fir.c: Ditto.
>        * gcc.dg/vect/vect-over-widen-1.c: Ditto.
>        * gcc.dg/vect/vect-over-widen-2.c: Ditto.
>        * gcc.dg/vect/vect-over-widen-3.c: Ditto.
>        * gcc.dg/vect/vect-over-widen-4.c: Ditto.
>        * gcc.dg/vect/vect-reduc-1char.c: Dit

Re: Wrong parameter type for _mm256_insert_epi64 in avxintrin.h

2011-12-04 Thread Uros Bizjak
Hello!

> Attached is a patch which fixes bug target/51393:
>
>   http://gcc.gnu.org/bugzilla/show_bug.cgi?id=51393
>
> Also attached, avx_bug.c is a minimal example to reproduce the bug
> (requires an AVX-capable CPU):
>
>   $ gcc -O3 -mavx avx_bug.c
>   $ ./a.out 0x8000
>   in  = 0x8000
>   out = 0x8000
>
> The correct output should be:
>
>   $ ./a.out 0x8000
>   in  = 0x8000
>   out = 0x8000
>
> As explained in the bug report, it's just a matter of the second
> parameter of _mm256_insert_epi64 being declared as int where it should
> be long long (avxintrin.h:762). A simple typo, trivially fixed by the
> attached patch.

OK.

Attached patch (with a testcase) was committed to mainline SVN with
following ChangeLog entry:

2011-12-04  Jérémie Detrey  

PR target/51393
* config/i386/avxintrin.h (_mm256_insert_epi64): Declare second
parameter as long long.

testsuite/ChangeLog:

2011-12-04  Uros Bizjak  
Jérémie Detrey  

PR target/51393
* gcc.target/i386/pr51393.c: New test.

Patch was tested on x86_64-pc-linux-gnu {,-m32}. Patch was committed
to mainline, will be committed to release branches.
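
For readers without AVX hardware, the effect of the wrong parameter type can be sketched with plain functions. The two function names are stand-ins, not the intrinsics, and the input value is chosen only for illustration.

```cpp
#include <cassert>

// Stand-in for the old declaration: the 64-bit argument is narrowed to int.
long long insert_via_int(int d) { return d; }

// Stand-in for the fixed declaration: all 64 bits are preserved.
long long insert_via_long_long(long long d) { return d; }

void check_parameter_width()
{
  const long long in = 0x100000001LL;  // illustrative; needs more than 32 bits
  assert(insert_via_long_long(in) == in);
  // No int can represent this value, so the round trip cannot match.
  assert(insert_via_int(static_cast<int>(in)) != in);
}
```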

Thanks,
Uros.
Index: config/i386/avxintrin.h
===
--- config/i386/avxintrin.h (revision 181984)
+++ config/i386/avxintrin.h (working copy)
@@ -759,7 +759,7 @@
 
 #ifdef __x86_64__
 extern __inline __m256i __attribute__((__gnu_inline__, __always_inline__, __artificial__))
-_mm256_insert_epi64 (__m256i __X, int __D, int const __N)
+_mm256_insert_epi64 (__m256i __X, long long __D, int const __N)
 {
   __m128i __Y = _mm256_extractf128_si256 (__X, __N >> 1);
   __Y = _mm_insert_epi64 (__Y, __D, __N % 2);
Index: testsuite/gcc.target/i386/pr51393.c
===
--- testsuite/gcc.target/i386/pr51393.c (revision 0)
+++ testsuite/gcc.target/i386/pr51393.c (revision 0)
@@ -0,0 +1,21 @@
+/* { dg-do run { target { ! { ia32 } } } } */
+/* { dg-require-effective-target avx } */
+/* { dg-options "-O -mavx" } */
+
+#include "avx-check.h"
+#include 
+
+static void
+__attribute__((noinline))
+avx_test (void)
+{
+  long long in = 0x8ll;
+  long long out;
+
+  __m256i zero = _mm256_setzero_si256();
+  __m256i tmp  = _mm256_insert_epi64 (zero, in, 0);
+  out = _mm256_extract_epi64(tmp, 0);
+
+  if (in != out)
+abort ();
+}


Add a prepare_pch_save target hook

2011-12-04 Thread Richard Sandiford
A while back I added the target_globals structure, to allow a backend
to switch between two very different ISA modes without paying the full
target_reinit penalty each time.  This made a huge difference to compile
time, but had a drawback: the target_globals structure contained both
GGC and non-GGC data.  This meant that secondary target_globals structures
like mips16_globals couldn't be saved correctly in PCH files.

I'm afraid I was aware of this at the time, but decided to go ahead
anyway on the basis that compiling MIPS16 code in reasonable time was
more important than creating MIPS16 PCH files.  But after um-ing and
ah-ing about the best fix, I've finally settled on the patch below.

Tested on mips64-linux-gnu, where it fixes many PCH failures for MIPS16.
OK to install?
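
The shape of the hook can be sketched as follows. This is a toy model, not GCC code: the hook table, the mode flag and the example override are all hypothetical, and the real MIPS implementation is in the patch below.

```cpp
#include <cassert>

// Default hook: do nothing, as most targets would.
void hook_void_void() {}

// Toy hook table with a single entry.
struct target_hooks
{
  void (*prepare_pch_save)();
};

// Stand-in for target state that must be in a known configuration on load.
bool in_alternate_mode = true;

// Example override: force the state that PCH loads should see.
void example_prepare_pch_save() { in_alternate_mode = false; }

target_hooks targetm = { example_prepare_pch_save };

// Stand-in for the PCH writer: call the hook first, then snapshot the state.
bool write_pch_snapshot()
{
  targetm.prepare_pch_save();
  return in_alternate_mode;  // the value that would be saved
}
```

The point of running the hook before anything is written is that the saved state no longer depends on which mode happened to be active last.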

Richard


gcc/
* doc/tm.texi.in (TARGET_PREPARE_PCH_SAVE): New hook.
* doc/tm.texi: Regenerate.
* target.def (prepare_pch_save): New hook.
* c-family/c-pch.c (c_common_write_pch): Call it.
* config/mips/mips.c (was_mips16_pch_p): Delete.
(mips_set_mips16_mode): Don't refer to was_mips16_pch_p.
(mips_prepare_pch_save): New function.
(TARGET_PREPARE_PCH_SAVE): Define.

Index: gcc/doc/tm.texi.in
===
--- gcc/doc/tm.texi.in  2011-12-04 08:52:33.0 +
+++ gcc/doc/tm.texi.in  2011-12-04 11:30:12.0 +
@@ -9974,6 +9974,8 @@ of @code{target_flags}.  @var{pch_flags}
 value is the same as for @code{TARGET_PCH_VALID_P}.
 @end deftypefn
 
+@hook TARGET_PREPARE_PCH_SAVE
+
 @node C++ ABI
 @section C++ ABI parameters
 @cindex parameters, c++ abi
Index: gcc/doc/tm.texi
===
--- gcc/doc/tm.texi 2011-12-04 08:52:33.0 +
+++ gcc/doc/tm.texi 2011-12-04 11:30:12.0 +
@@ -10079,6 +10079,13 @@ of @code{target_flags}.  @var{pch_flags}
 value is the same as for @code{TARGET_PCH_VALID_P}.
 @end deftypefn
 
+@deftypefn {Target Hook} void TARGET_PREPARE_PCH_SAVE (void)
+Called before writing out a PCH file.  If the target has some
+garbage-collected data that needs to be in a particular state on PCH loads,
+it can use this hook to enforce that state.  Very few targets need
+to do anything here.
+@end deftypefn
+
 @node C++ ABI
 @section C++ ABI parameters
 @cindex parameters, c++ abi
Index: gcc/target.def
===
--- gcc/target.def  2011-12-04 08:52:33.0 +
+++ gcc/target.def  2011-12-04 11:30:12.0 +
@@ -1819,6 +1819,15 @@ DEFHOOK
  const char *, (const void *data, size_t sz),
  default_pch_valid_p)
 
+DEFHOOK
+(prepare_pch_save,
+ "Called before writing out a PCH file.  If the target has some\n\
+garbage-collected data that needs to be in a particular state on PCH loads,\n\
+it can use this hook to enforce that state.  Very few targets need\n\
+to do anything here.",
+ void, (void),
+ hook_void_void)
+
 /* If nonnull, this function checks whether a PCH file with the
given set of target flags can be used.  It returns NULL if so,
otherwise it returns an error message.  */
Index: gcc/c-family/c-pch.c
===
--- gcc/c-family/c-pch.c 2011-12-04 08:52:33.0 +
+++ gcc/c-family/c-pch.c 2011-12-04 11:30:12.0 +
@@ -180,6 +180,8 @@ c_common_write_pch (void)
 
   timevar_push (TV_PCH_SAVE);
 
+  targetm.prepare_pch_save ();
+
   (*debug_hooks->handle_pch) (1);
 
   cpp_write_pch_deps (parse_in, pch_outfile);
Index: gcc/config/mips/mips.c
===
--- gcc/config/mips/mips.c  2011-12-04 08:52:32.0 +
+++ gcc/config/mips/mips.c  2011-12-04 12:01:44.0 +
@@ -15197,14 +15197,8 @@ mips_output_mi_thunk (FILE *file, tree t
 }
 
 /* The last argument passed to mips_set_mips16_mode, or negative if the
-   function hasn't been called yet.
-
-   There are two copies of this information.  One is saved and restored
-   by the PCH process while the other is specific to this compiler
-   invocation.  The information calculated by mips_set_mips16_mode
-   is invalid unless the two variables are the same.  */
+   function hasn't been called yet.  */
 static int was_mips16_p = -1;
-static GTY(()) int was_mips16_pch_p = -1;
 
 /* Set up the target-dependent global state so that it matches the
current function's ISA mode.  */
@@ -15212,8 +15206,7 @@ static GTY(()) int was_mips16_pch_p = -1
 static void
 mips_set_mips16_mode (int mips16_p)
 {
-  if (mips16_p == was_mips16_p
-  && mips16_p == was_mips16_pch_p)
+  if (mips16_p == was_mips16_p)
 return;
 
   /* Restore base settings of various flags.  */
@@ -15310,7 +15303,6 @@ mips_set_mips16_mode (int mips16_p)
 restore_target_globals (&default_target_globals);
 
   was_mips16_p = mips16_p;
-  was_mips16_pch_p = 

Re: Find more shrink-wrapping opportunities

2011-12-04 Thread Richard Sandiford
Richard Sandiford  writes:
> Richard Sandiford  writes:
>> Bernd Schmidt  writes:
>>>> The reason I'm suddenly "reviewing" the code now is that it
>>>> doesn't prevent shrink-wrapping, because nothing adds register 2
>>>> to the liveness info of the affected blocks.  The temporary prologue
>>>> value of register 2 is then moved into register 15.
>>>
>>> Hmm. Are we just missing another df_analyze call?
>>
>> Well, if we do the kind of backwards walk I was thinking about (so that
>> we can handle chains), it might be better to update the liveness sets
>> we care about as we go.  A full df_analyze after each move would be
>> pretty expensive.
>
> FWIW, I've got a patch that I'll try to test this weekend.

OK, here it is.  As well as switching to the backward scan and incremental
liveness updates, I added a test for another case that I stumbled over:

  /* Reject targets of abnormal edges.  This is needed for correctness
 on ports like Alpha and MIPS, whose pic_offset_table_rtx can die on
 exception edges even though it is generally treated as call-saved
 for the majority of the compilation.  Moving across abnormal edges
 isn't going to be interesting for shrink-wrap usage anyway.  */
  if (live_edge->flags & EDGE_ABNORMAL)
return NULL;

The point is that when Alpha and MIPS have call-clobbered global pointers,
they say that the global pointer is call-saved and ensure that the call
patterns restore the gp where necessary.  These combined "call and load"
patterns get split after reload (or in MIPS's case, after prologue/
epilogue generation).  The ports then make sure that exception_receiver
also restores the gp.  The problem is that, if we insert a move from
pic_offset_table_rtx before exception_receiver, we end up moving the
wrong value.

Now all this is really a bit of a hack.  The information ought to be
made more explicit to the target-independent parts of the compiler.
That's going to be quite hard to fix though, and probably isn't stage 3
material.  So while I admit I don't like the test above, it feels like
the best fix for now.
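
In isolation, the abnormal-edge test amounts to the filter below. The types and the flag value are miniature stand-ins invented for the sketch; they are not GCC's edge or basic_block definitions.

```cpp
#include <cassert>

// Hypothetical flag bit marking an abnormal edge.
const unsigned EDGE_ABNORMAL_FLAG = 1u << 1;

struct block { int index; };
struct edge_def { unsigned flags; block *dest; };

// Return the block a move could be sunk into, or nullptr when the only
// live edge is missing or abnormal (for example an exception edge).
block *candidate_dest(const edge_def *live_edge)
{
  if (live_edge == nullptr)
    return nullptr;
  if (live_edge->flags & EDGE_ABNORMAL_FLAG)
    return nullptr;
  return live_edge->dest;
}
```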

Also, it seemed easiest to drop the single-register restriction at
the same time.  Hopefully this will help for things like DF moves
on 32-bit MIPS FPUs.

While moving bb_note to cfgrtl.c, I was tempted to make rtl_split_block
use it rather than first_insn_after_basic_block_note.  It isn't exactly
a one-for-one transformation though, at least not as far as null checks
go, so I'll leave it for a possible stage 1 cleanup.

Tested on mips64-linux-gnu.  OK to install?

Richard


gcc/
* sched-int.h (bb_note): Move to...
* basic-block.h: ...here.
* haifa-sched.c (bb_note): Move to...
* cfgrtl.c: ...here.
* function.c (next_block_for_reg): New function.
(move_insn_for_shrink_wrap): Likewise.
(prepare_shrink_wrap): Rewrite to use the above.

Index: gcc/sched-int.h
===
*** gcc/sched-int.h 2011-12-04 08:52:37.0 +
--- gcc/sched-int.h 2011-12-04 12:03:51.0 +
*** extern void sched_insns_init (rtx);
*** 130,136 
  extern void sched_insns_finish (void);
  
  extern void *xrecalloc (void *, size_t, size_t, size_t);
- extern rtx bb_note (basic_block);
  
  extern void reemit_notes (rtx);
  
--- 130,135 
Index: gcc/basic-block.h
===
*** gcc/basic-block.h   2011-12-04 08:52:37.0 +
--- gcc/basic-block.h   2011-12-04 12:03:51.0 +
*** extern void flow_edge_list_print (const
*** 801,806 
--- 801,807 
  
  /* In cfgrtl.c  */
  extern rtx block_label (basic_block);
+ extern rtx bb_note (basic_block);
  extern bool purge_all_dead_edges (void);
  extern bool purge_dead_edges (basic_block);
  extern bool fixup_abnormal_edges (void);
Index: gcc/haifa-sched.c
===
*** gcc/haifa-sched.c   2011-12-04 08:52:37.0 +
--- gcc/haifa-sched.c   2011-12-04 12:03:51.0 +
*** add_jump_dependencies (rtx insn, rtx jum
*** 6489,6508 
gcc_assert (!sd_lists_empty_p (jump, SD_LIST_BACK));
  }
  
- /* Return the NOTE_INSN_BASIC_BLOCK of BB.  */
- rtx
- bb_note (basic_block bb)
- {
-   rtx note;
- 
-   note = BB_HEAD (bb);
-   if (LABEL_P (note))
- note = NEXT_INSN (note);
- 
-   gcc_assert (NOTE_INSN_BASIC_BLOCK_P (note));
-   return note;
- }
- 
  /* Extend data structures for logical insn UID.  */
  void
  sched_extend_luids (void)
--- 6489,6494 
Index: gcc/cfgrtl.c
===
*** gcc/cfgrtl.c 2011-12-04 08:52:37.0 +
--- gcc/cfgrtl.c 2011-12-04 12:03:51.0 +
*** update_bb_for_insn (basic_block bb)
*** 500,505 
--- 500,519 
  }
  
  
+ /* Return the NOTE_INSN_BASIC_BLOCK of BB.  */
+ rtx
+ bb_note (basic_b

Re: [PATCH] Remove dead labels to increase superblock scope

2011-12-04 Thread Richard Sandiford
Tom de Vries  writes:
> OK, factored out delete_label now.
>
> Bootstrapped and reg-tested on x86_64.
>
> Ok for next stage1?

Looks good codewise.  I'm just a bit worried about the name "delete_label".
"delete_insn (label)" should always do the right thing for a pure deletion;
the point of the new routine is that it also moves instructions.
I'd prefer a name that differentiated it from delete_insn.  E.g.
"hide_label" or "decommission_label", although as you can tell
I'm useless at naming things...

Thanks,
Richard


Re: empty range in pop_heap

2011-12-04 Thread Jonathan Wakely
On 4 December 2011 09:37, Markus Trippelsdorf wrote:
>
> You forgot to change the second one with the comparison functor.

Doh! Thanks, fixed now.

* include/bits/stl_heap.h (pop_heap): Check for non-empty range in
overload taking a predicate.
* testsuite/25_algorithms/pop_heap/empty2_neg.cc: New.

Tested x86_64-linux, committed to trunk.
Index: include/bits/stl_heap.h
===
--- include/bits/stl_heap.h (revision 181986)
+++ include/bits/stl_heap.h (revision 181987)
@@ -361,6 +361,7 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
   __glibcxx_function_requires(_Mutable_RandomAccessIteratorConcept<
_RandomAccessIterator>)
   __glibcxx_requires_valid_range(__first, __last);
+  __glibcxx_requires_non_empty_range(__first, __last);
   __glibcxx_requires_heap_pred(__first, __last, __comp);
 
   --__last;
Index: testsuite/25_algorithms/pop_heap/empty2_neg.cc
===
--- testsuite/25_algorithms/pop_heap/empty2_neg.cc  (revision 0)
+++ testsuite/25_algorithms/pop_heap/empty2_neg.cc  (revision 181987)
@@ -0,0 +1,38 @@
+// Copyright (C) 2011 Free Software Foundation, Inc.
+//
+// This file is part of the GNU ISO C++ Library.  This library is free
+// software; you can redistribute it and/or modify it under the
+// terms of the GNU General Public License as published by the
+// Free Software Foundation; either version 3, or (at your option)
+// any later version.
+
+// This library is distributed in the hope that it will be useful,
+// but WITHOUT ANY WARRANTY; without even the implied warranty of
+// MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+// GNU General Public License for more details.
+
+// You should have received a copy of the GNU General Public License along
+// with this library; see the file COPYING3.  If not see
+// .
+
+// 25.3.6 Heap operations [lib.alg.heap.operations]
+
+// { dg-require-debug-mode "" }
+// { dg-do run { xfail *-*-* } }
+
+#include 
+#include 
+
+void
+test01()
+{
+  int i = 0;
+  std::pop_heap(&i, &i, std::less());
+}
+
+int
+main()
+{
+  test01();
+  return 0;
+}


[patch] ARM: Fix miscompilation in arm.md:*minmax_arithsi. (PR target/51408)

2011-12-04 Thread Kazu_Hirata
Hi,

Attached is a patch to fix miscompilation in
arm.md:*minmax_arithsi.

The following testcase, reduced from efgcvt_r.c:fcvt_r in glibc, gets
miscompiled:

extern void abort (void);

int __attribute__((noinline))
foo (int a, int b)
{
  int max = (b > 0) ? b : 0;
  return max - a;
}

int
main (void)
{
  if (foo (3, -1) != -3)
    abort ();
  return 0;
}

arm-none-eabi-gcc -O1 generates:

foo:
@ Function supports interworking.
@ args = 0, pretend = 0, frame = 0
@ frame_needed = 0, uses_anonymous_args = 0
@ link register save eliminated.
cmp r1, #0
rsbge   r0, r0, r1
bx  lr

This would be equivalent to:

  return b >= 0 ? b - a : a;

which is different from:

  return b >= 0 ? b - a : -a;

That is, in assembly code, we should have an "else" clause like so:

cmp r1, #0
rsbge   r0, r0, r1  <- then clause
rsblt   r0, r0, #0  <- else clause
bx  lr

The problem comes from the fact that arm.md:*minmax_arithsi does not
add the "else" clause even though MINUS is not commutative.

The patch fixes the problem by always requiring the "else" clause in
the MINUS case.
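
To make the difference concrete, here is a small model of the two instruction
sequences, written as plain C++ rather than as anything taken from arm.md.
The function names are made up for illustration; r0 and r1 stand for the
argument registers holding a and b.

```cpp
#include <cassert>

// Source-level semantics of the test case: max (b, 0) - a.
inline int foo_expected(int a, int b)
{
  int max = (b > 0) ? b : 0;
  return max - a;
}

// Models the miscompiled sequence: cmp r1, #0; rsbge r0, r0, r1.
// With no else clause, r0 simply keeps the value of a when b < 0.
inline int foo_without_else(int a, int b)
{
  int r0 = a, r1 = b;
  if (r1 >= 0)
    r0 = r1 - r0;   // rsbge r0, r0, r1
  return r0;
}

// Models the fixed sequence, which adds: rsblt r0, r0, #0.
inline int foo_with_else(int a, int b)
{
  int r0 = a, r1 = b;
  if (r1 >= 0)
    r0 = r1 - r0;   // rsbge r0, r0, r1
  else
    r0 = 0 - r0;    // rsblt r0, r0, #0
  return r0;
}
```

Leaving r0 untouched is harmless for PLUS, IOR and XOR, because combining a
with 0 gives a again; for MINUS the result must be 0 - a, which is why the
else clause cannot be skipped there.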

Tested by running gcc testsuite on various ARM subarchitectures.  OK
to apply?

Kazu Hirata

gcc/
2011-12-04  Kazu Hirata  

PR target/51408
* config/arm/arm.md (*minmax_arithsi): Always require the else
clause in the MINUS case.

gcc/testsuite/
2011-12-04  Kazu Hirata  

PR target/51408
* gcc.dg/pr51408.c: New.

Index: gcc/config/arm/arm.md
===
--- gcc/config/arm/arm.md   (revision 181985)
+++ gcc/config/arm/arm.md   (working copy)
@@ -3413,7 +3413,7 @@
 bool need_else;
 
 if (which_alternative != 0 || operands[3] != const0_rtx
-|| (code != PLUS && code != MINUS && code != IOR && code != XOR))
+|| (code != PLUS && code != IOR && code != XOR))
   need_else = true;
 else
   need_else = false;
Index: gcc/testsuite/gcc.dg/pr51408.c
===
--- gcc/testsuite/gcc.dg/pr51408.c  (revision 0)
+++ gcc/testsuite/gcc.dg/pr51408.c  (revision 0)
@@ -0,0 +1,22 @@
+/* This testcase used to fail because of a bug in 
+   arm.md:*minmax_arithsi.  */
+
+/* { dg-do run } */
+/* { dg-options "-O1" } */
+
+extern void abort (void);
+
+int __attribute__((noinline))
+foo (int a, int b)
+{
+  int max = (b > 0) ? b : 0;
+  return max - a;
+}
+
+int
+main (void)
+{
+  if (foo (3, -1) != -3)
+    abort ();
+  return 0;
+}


[patch] Fix exit phi nodes creation for double reduction - PR 51285

2011-12-04 Thread Ira Rosen
Hi,

This patch adds a missing exit phi node for the outer loop in the
vectorization of a double reduction.

Bootstrapped and tested on powerpc64-suse-linux.
Committed.

Ira

ChangeLog:

PR middle-end/51285
* tree-vect-loop.c (vect_create_epilog_for_reduction): Create exit
phi nodes for outer loop in case of double reduction.

testsuite/ChangeLog:

PR middle-end/51285
* gfortran.dg/vect/pr51285.f90: New test.

Index: tree-vect-loop.c
===
--- tree-vect-loop.c    (revision 181984)
+++ tree-vect-loop.c    (working copy)
@@ -3462,6 +3462,7 @@ vect_create_epilog_for_reduction (VEC (tree, heap)
   gimple use_stmt, orig_stmt, reduction_phi = NULL;
   bool nested_in_vect_loop = false;
   VEC (gimple, heap) *new_phis = NULL;
+  VEC (gimple, heap) *inner_phis = NULL;
   enum vect_def_type dt = vect_unknown_def_type;
   int j, i;
   VEC (tree, heap) *scalar_results = NULL;
@@ -3470,6 +3471,7 @@ vect_create_epilog_for_reduction (VEC (tree, heap)
   VEC (gimple, heap) *phis;
   bool slp_reduc = false;
   tree new_phi_result;
+  gimple inner_phi = NULL;

   if (slp_node)
 group_size = VEC_length (gimple, SLP_TREE_SCALAR_STMTS (slp_node));
@@ -3626,11 +3628,36 @@ vect_create_epilog_for_reduction (VEC (tree, heap)
 }

   /* The epilogue is created for the outer-loop, i.e., for the loop being
- vectorized.  */
+ vectorized.  Create exit phis for the outer loop.  */
   if (double_reduc)
 {
   loop = outer_loop;
   exit_bb = single_exit (loop)->dest;
+  inner_phis = VEC_alloc (gimple, heap, VEC_length (tree, vect_defs));
+  FOR_EACH_VEC_ELT (gimple, new_phis, i, phi)
+   {
+ gimple outer_phi = create_phi_node (SSA_NAME_VAR (PHI_RESULT (phi)),
+ exit_bb);
+ SET_PHI_ARG_DEF (outer_phi, single_exit (loop)->dest_idx,
+  PHI_RESULT (phi));
+ set_vinfo_for_stmt (outer_phi, new_stmt_vec_info (outer_phi,
+   loop_vinfo, NULL));
+ VEC_quick_push (gimple, inner_phis, phi);
+ VEC_replace (gimple, new_phis, i, outer_phi);
+ prev_phi_info = vinfo_for_stmt (outer_phi);
+  while (STMT_VINFO_RELATED_STMT (vinfo_for_stmt (phi)))
+{
+ phi = STMT_VINFO_RELATED_STMT (vinfo_for_stmt (phi));
+ outer_phi = create_phi_node (SSA_NAME_VAR (PHI_RESULT (phi)),
+  exit_bb);
+ SET_PHI_ARG_DEF (outer_phi, single_exit (loop)->dest_idx,
+  PHI_RESULT (phi));
+ set_vinfo_for_stmt (outer_phi, new_stmt_vec_info (outer_phi,
+   loop_vinfo, NULL));
+ STMT_VINFO_RELATED_STMT (prev_phi_info) = outer_phi;
+ prev_phi_info = vinfo_for_stmt (outer_phi);
+   }
+   }
 }

   exit_gsi = gsi_after_labels (exit_bb);
@@ -4040,6 +4067,8 @@ vect_finalize_reduction:
 {
   epilog_stmt = VEC_index (gimple, new_phis, k / ratio);
   reduction_phi = VEC_index (gimple, reduction_phis, k / ratio);
+ if (double_reduc)
+   inner_phi = VEC_index (gimple, inner_phis, k / ratio);
 }

   if (slp_reduc)
@@ -4123,7 +4152,7 @@ vect_finalize_reduction:
  vs1 was created previously in this function by a call to
vect_get_vec_def_for_operand and is stored in
vec_initial_def;
- vs2 is defined by EPILOG_STMT, the vectorized EXIT_PHI;
+ vs2 is defined by INNER_PHI, the vectorized EXIT_PHI;
  vs0 is created here.  */

   /* Create vector phi node.  */
@@ -4144,7 +4173,7 @@ vect_finalize_reduction:
   add_phi_arg (vect_phi, vect_phi_init,
loop_preheader_edge (outer_loop),
UNKNOWN_LOCATION);
-  add_phi_arg (vect_phi, PHI_RESULT (epilog_stmt),
+  add_phi_arg (vect_phi, PHI_RESULT (inner_phi),
loop_latch_edge (outer_loop), UNKNOWN_LOCATION);
   if (vect_print_dump_info (REPORT_DETAILS))
 {
Index: testsuite/gfortran.dg/vect/pr51285.f90
===
--- testsuite/gfortran.dg/vect/pr51285.f90  (revision 0)
+++ testsuite/gfortran.dg/vect/pr51285.f90  (revision 0)
@@ -0,0 +1,36 @@
+! { dg-do compile }
+
+   SUBROUTINE smm_dnn_4_10_10_1_1_2_1(A,B,C)
+  REAL   :: C(4,10), B(10,10), A(4,10)
+  DO j=   1 ,  10 ,   2
+  DO i=   1 ,   4 ,   1
+  DO l=   1 ,  10 ,   1
+C(i+0,j+0)=C(i+0,j+0)+A(i+0,l+0)*B(l+0,j+0)
+C(i+0,j+1)=C(i+0,j+1)+A(i+0,l+0)*B(l+0,j+1)
+   

Re: [Patch, fortran] Fix PR 51338 - ICE with front-end optimization and assumed character lengths

2011-12-04 Thread Tobias Burnus

Thomas Koenig wrote:

> Regression-tested. OK for trunk?
>
> 2011-11-29  Thomas Koenig
>
> PR fortran/51338
> * dependency.c (are_identical_variables):  Handle case where
> end fields of substring references are NULL.
>
> 2011-11-29  Thomas Koenig
>
> PR fortran/51338
> * gfortran.dg/assumed_charlen_substring_1.f90:  New test.
> + /* This can only happen for assumed-length character arguments.
> +If both are NULL, the end length compares equal, because we
> +are looking at the same variable.  */
> + if (r1->u.ss.end == NULL && r2->u.ss.end == NULL)
> +   break;


Well, it can also happen for deferred-length arguments; how about:

+ /* If both are NULL, the end length compares equal, because we
+are looking at the same variable. This can only happen for
+assumed- or deferred-length character arguments.  */


OK with that change.

Thanks for the patch and sorry for the slow review.

Tobias

PS: Patches which still need to be reviewed:
- http://gcc.gnu.org/ml/fortran/2011-11/msg00250.html - no -fcheck=bounds
  for character(LEN=:) to avoid ICE
- http://gcc.gnu.org/ml/fortran/2011-11/msg00253.html - (Re)enable warning
  if a function result variable is not set [4.4-4.7 diagnostics regression]
- http://gcc.gnu.org/ml/fortran/2011-12/msg00018.html - fix ASSOCIATE with
  extended types


Re: [PATCH] Remove dead labels to increase superblock scope

2011-12-04 Thread Eric Botcazou
> Looks good codewise.

Seconded, modulo the file: the function should be in cfgrtl.c instead.

> I'm just a bit worried about the name "delete_label". 
> "delete_insn (label)" should always do the right thing for a pure deletion;
> the point of the new routine is that it also moves instructions.

It only fixes things up though, so that the RTL stream is valid again.  Hence 
the question: why not retrofit it into delete_insn directly?

-- 
Eric Botcazou


Re: [PATCH, PR 50744] Prevent overflows in IPA-CP

2011-12-04 Thread Eric Botcazou
> This introduces host-dependent code generation differences, right?
> You can simply use int64_t for code that is run on the host only.

Well, no, there is an entire file dedicated to this business (hwint.h).

-- 
Eric Botcazou


Re: [PATCH, PR 50744] Prevent overflows in IPA-CP

2011-12-04 Thread Eric Botcazou
> I don't think that is related to C++ switch, because C++03 doesn't have
> long long, only C++11 and C99 has it.  We apparently are using int64_t or
> uint64_t in a couple of places already though:

IMHO they should be audited and fixed if need be.

> ada/tb-gcc.c:  uwx_get_reg ((struct uwx_env *) uw_context, 
> UWX_REG_IP, (uint64_t *) &pc);

This one is for the target only, so it's OK.

-- 
Eric Botcazou


Re: [PATCH, PR 50744] Prevent overflows in IPA-CP

2011-12-04 Thread Richard Guenther
On Sun, Dec 4, 2011 at 4:31 PM, Eric Botcazou  wrote:
>> This introduces host-dependent code generation differences, right?
>> You can simply use int64_t for code that is run on the host only.
>
> Well, no, there is an entire file dedicated to this business (hwint.h).

No to what?  To the fact that HOST_WIDEST_INT is host-dependent
and thus should not be used to drive code generation?  Or no to the
fact that we can (and do) use int64_t as host integer type?

Richard.

> --
> Eric Botcazou


Re: [PATCH, PR 50744] Prevent overflows in IPA-CP

2011-12-04 Thread Richard Guenther
On Sun, Dec 4, 2011 at 4:37 PM, Richard Guenther
 wrote:
> On Sun, Dec 4, 2011 at 4:31 PM, Eric Botcazou  wrote:
>>> This introduces host-dependent code generation differences, right?
>>> You can simply use int64_t for code that is run on the host only.
>>
>> Well, no, there is an entire file dedicated to this business (hwint.h).
>
> No to what?  To the fact that HOST_WIDEST_INT is host-dependent
> and thus should not be used to drive code generation?  Or no to the
> fact that we can (and do) use int64_t as host integer type?

Btw, I'd say we should document that hosting GCC on a platform that
does not support a 64bit integer type is no longer supported (that
includes a host compiler that exposes such type in some way).

Richard.

> Richard.
>
>> --
>> Eric Botcazou


Re: [Patch,AVR] Fix PR51409: error linking lto1 for target avr

2011-12-04 Thread Georg-Johann Lay

Georg-Johann Lay schrieb:

Denis Chertykov wrote:


Georg-Johann Lay:


I attached a patch but I fail to find the right configure options for
gcc/binutils as the testsuite complains

./avr/bin/ld: bad -plugin option

Maybe the patch can be pre-approved so that the others can proceed with their 
work?


Better to complete this work.

Denis.


http://gcc.gnu.org/ml/gcc-patches/2011-11/msg02574.html

As this is a blocker and I am blocked myself by a collect2 issue:

Eric, Denis, could one of you test the patch and apply it if it is okay?
It is PR51409.

Applying is no issue for me, but running the tests with LTO enabled
breaks anything that involves -flto because collect2 calls the wrong linker,
see link to gcc-help@ below.


As it appears I will have to debug/fix collect2 by myself which will 
take quite some time because I am not familiar with LTO/collect2.


Johann


I now switched back to --disable-lto as I could not resolve the problems that
appear to be a collect2 issue, see

http://gcc.gnu.org/ml/gcc-help/2011-12/msg00016.html

What I can do is:

* build the compiler with the patch and with LTO enabled and without
  getting a linker error for c_addr_space_name.

* I cannot get usable results from testsuite because of collect2 breakage

* Testsuite passes fine with the patch and --disable-lto [...]

Ok for trunk?

Johann

* config/avr/avr.h (ADDR_SPACE_PGM, ADDR_SPACE_PGM1,
ADDR_SPACE_PGM2, ADDR_SPACE_PGM3, ADDR_SPACE_PGM4,
ADDR_SPACE_PGM5, ADDR_SPACE_PGMX): Write as enum.
(avr_addrspace_t): New typedef.
(avr_addrspace): New declaration.
* config/avr/avr-c.c (avr_toupper): New static function.
(avr_register_target_pragmas, avr_cpu_cpp_builtins): Use
avr_addrspace to get address space information.
* config/avr/avr.c (avr_addrspace): New variable.
(avr_out_lpm, avr_pgm_check_var_decl, avr_insert_attributes,
avr_asm_named_section, avr_section_type_flags,
avr_asm_select_section, avr_addr_space_address_mode,
avr_addr_space_convert, avr_emit_movmemhi): Use it.
(avr_addr_space_pointer_mode): Forward to avr_addr_space_address_mode.
(avr_pgm_segment): Remove.


Re: [PATCH, PR 50744] Prevent overflows in IPA-CP

2011-12-04 Thread Eric Botcazou
> No to what?  To the fact that HOST_WIDEST_INT is host-dependent
> and thus should not be used to drive code generation?  Or no to the
> fact that we can (and do) use int64_t as host integer type?

No to the fact that int64_t should be used (and the occurrences in the LTO code 
are OK).  hwint.h is precisely supposed to insulate the compiler from the host 
(of course we all know that this isn't 100% true) and HOST_WIDEST_INT is the 
proper type to be used in this case, see existing examples.

-- 
Eric Botcazou


Re: [Patch,AVR] Fix PR51409: error linking lto1 for target avr

2011-12-04 Thread Georg-Johann Lay

Georg-Johann Lay wrote:


http://gcc.gnu.org/ml/gcc-patches/2011-11/msg02574.html

As this is a blocker and I am blocked myself by a collect2 issue:

Eric, Denis, could one of you test the patch and apply it if it is okay?
It is PR51409.


In addition, please add PR49868 to the ChangeLog. Thanks.

Johann

Applying is no issue for me, but running the tests with LTO enabled
breaks anything that involves -flto because collect2 calls the wrong linker,
see link to gcc-help@ below.


As it appears I will have to debug/fix collect2 by myself which will 
take quite some time because I am not familiar with LTO/collect2.


I now switched back to --disable-lto as I could not resolve the problems that
appear to be a collect2 issue, see

http://gcc.gnu.org/ml/gcc-help/2011-12/msg00016.html

What I can do is:

* build the compiler with the patch and with LTO enabled and without
  getting a linker error for c_addr_space_name.

* I cannot get usable results from testsuite because of collect2 breakage

* Testsuite passes fine with the patch and --disable-lto [...]

Ok for trunk?

Johann

* config/avr/avr.h (ADDR_SPACE_PGM, ADDR_SPACE_PGM1,
ADDR_SPACE_PGM2, ADDR_SPACE_PGM3, ADDR_SPACE_PGM4,
ADDR_SPACE_PGM5, ADDR_SPACE_PGMX): Write as enum.
(avr_addrspace_t): New typedef.
(avr_addrspace): New declaration.
* config/avr/avr-c.c (avr_toupper): New static function.
(avr_register_target_pragmas, avr_cpu_cpp_builtins): Use
avr_addrspace to get address space information.
* config/avr/avr.c (avr_addrspace): New variable.
(avr_out_lpm, avr_pgm_check_var_decl, avr_insert_attributes,
avr_asm_named_section, avr_section_type_flags,
avr_asm_select_section, avr_addr_space_address_mode,
avr_addr_space_convert, avr_emit_movmemhi): Use it.
(avr_addr_space_pointer_mode): Forward to avr_addr_space_address_mode.
(avr_pgm_segment): Remove.




Re: [PATCH, PR 50744] Prevent overflows in IPA-CP

2011-12-04 Thread Jan Hubicka
> > No to what?  To the fact that HOST_WIDEST_INT is host-dependent
> > and thus should not be used to drive code generation?  Or no to the
> > fact that we can (and do) use int64_t as host integer type?
> 
> No to the fact that int64_t should be used (and the occurrences in the LTO 
> code 
> are OK).  hwint.h is precisely supposed to insulate the compiler from the 
> host 
> (of course we all know that this isn't 100% true) and HOST_WIDEST_INT is the 
> proper type to be used in this case, see existing examples.

Yep, most of the profiling code (where gcov_type is HOST_WIDEST_INT) is not
safe when compiled on a host with 32-bit ints only.  This was always considered
acceptable given that the bootstrapped compiler will have long long and the
stage1 compiler won't exercise the limits.  This case seems the same to me.  So
I would prefer the existing practice of using HOST_WIDEST_INT.  There is no
need to have some code using int64_t and other code using HOST_WIDEST_INT for
the same reason.

The actual patch is OK with either HOST_WIDEST_INT or int64_t, depending on
where this discussion leads.

Honza
> 
> -- 
> Eric Botcazou


Re: [Patch,AVR] Fix PR51409: error linking lto1 for target avr

2011-12-04 Thread Denis Chertykov
2011/12/4 Georg-Johann Lay :
> Georg-Johann Lay wrote:
>>
>>
>> http://gcc.gnu.org/ml/gcc-patches/2011-11/msg02574.html
>>
>> As this is a blocker and I am blocked myself by a collect2 issue:
>>
>> Eric, Denis, could one of you test the patch and apply it if it is okay?
>> It is PR51409.


I'm sorry, I can't, because right now I'm debugging PR50925.
I am spending all my free time on that bug.

Denis.


Re: [Patch, Fortran] PR 51383 - fix ASSOCIATE with extended types

2011-12-04 Thread Mikael Morin
On Saturday 03 December 2011 20:12:50 Tobias Burnus wrote:
> Another OOP-related patch: If one uses type extension, the first
> REF_COMPONENT does not necessarily refer directly to a component in the
> linked list starting at sym->ts.u.derived->components.
> 
> Using simply ref->u.c.component directly seems to work fine, thus, I do
> this with this patch.
> 
> Built and regtested on x86-64-linux.
> OK for the trunk?
> 
OK, thanks

Mikael


Re: Shrink-wrapping vs. EXIT_IGNORE_STACK

2011-12-04 Thread Eric Botcazou
>   * resource.c (init_resource_info): Only consider EXIT_IGNORE_STACK
>   if there is an epilogue.

OK, but keep the original order of the tests, i.e. EXIT_IGNORE_STACK is tested 
only after we know there is a frame pointer:

  if (!(frame_pointer_needed
        && EXIT_IGNORE_STACK
        && epilogue_insn
        && !current_function_sp_is_unchanging))
    SET_HARD_REG_BIT (end_of_function_needs.regs, STACK_POINTER_REGNUM);

-- 
Eric Botcazou


Re: [testsuite] Adding missing dg-require-profiling directives

2011-12-04 Thread Mike Stump
On Dec 4, 2011, at 3:29 AM, Richard Sandiford wrote:
> The problem is that MIPS has
> native TLS support, but the ABI has not "yet" been extended to MIPS16.
> MIPS16 is supposed to be link-compatible with non-MIPS16, so we can't
> use emultls, and must simply say sorry().
> 
> This patch adds dg-require-profiling to the affected tests.  The reason
> I haven't just applied it as obvious is that dg-require-profiling really
> seems to be a test for link-time and runtime support.  There are presumably
> targets that can't link profiling code but that are nevertheless happily
> compiling the tests below.  So do we want to split the directive into two?
> I ask the question while hoping the answer is "no". :-)

Hum...  I'd rather TLS support be defined and added for MIPS16...  I think we 
have enough targets with profiling and TLS that coverage won't be lost with 
your change.  I like simple.  If someone feels strongly about splitting, I'll 
pre-approve their change.  I think your patch is fine.  Ok.


Re: Add a prepare_pch_save target hook

2011-12-04 Thread Mike Stump
On Dec 4, 2011, at 4:02 AM, Richard Sandiford wrote:
> A while back I added the target_globals structure, to allow a backend
> to switch between two very different ISA modes without paying the full
> target_reinit penalty each time.  This made a huge difference to compile
> time, but had a drawback: the target_globals structure contained both
> GGC and non-GGC data.  This meant that secondary target_globals structures
> like mips16_globals couldn't be saved correctly in PCH files.

> +  /* We are called in a context where the current MIPS16 vs. non-MIPS16
> + setting should be irrelevant.  The question then is: which setting
> + makes most sense at load time?
> +
> + The PCH is loaded before the first token is read.  We should never
> + have switched into MIPS16 mode by that point,

If there is any way to say:

  #pragma mips16

to globally switch state into mips16 mode, then I believe the patch is wrong.  
The pch file can have that directive in it to put the compiler into that mode, 
with all the resulting state that mode entails.  Now, if you store that one bit 
into the pch file, say, via a GTY int flag, then on reload, you can examine 
that flag, and enter that state.  Bear in mind, this 'slows' pch load time.


[v3] Doxygen improvements for Type Traits and Utilities

2011-12-04 Thread Jonathan Wakely
This patch declares the Type Traits doxygen group in  not
in  *and* . It also makes sure
doxygen comments are attached to the actual trait such as
is_member_pointer, not a helper such as __is_member_pointer_helper.  I
added brief descriptions for some of the less self-explanatory traits
(enable_if, conditional, declval) as well as for std::forward and
altered the std::move comment to say it converts to an rvalue, since
it's not really true to say it moves a value (that depends what the
caller does with the result.)
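
For readers who have not used these facilities, here is a small sketch of what
the new brief descriptions refer to. It uses only standard C++11 names from
<type_traits> and <utility>; is_even, at_least_32, NoDefault and
move_without_consuming are made up for illustration and do not appear in the
patch.

```cpp
#include <cassert>
#include <string>
#include <type_traits>
#include <utility>

// enable_if: the member typedef `type` exists only when the condition is
// true, so this function template is usable only for integral T.
template<typename T>
typename std::enable_if<std::is_integral<T>::value, bool>::type
is_even(T v)
{ return v % 2 == 0; }

// conditional: selects one of two argument types at compile time.
typedef std::conditional<(sizeof(int) >= 4), int, long>::type at_least_32;
static_assert(sizeof(at_least_32) >= 4, "selected type is wide enough");

// declval: names a value of a type inside an unevaluated operand, even when
// the type has no default constructor.  The members are declared only.
struct NoDefault { NoDefault(int); int get() const; };
static_assert(std::is_same<decltype(std::declval<NoDefault>().get()),
                           int>::value,
              "declval needs no constructor call");

// std::move only converts its argument to an rvalue ...
static_assert(std::is_same<decltype(std::move(std::declval<std::string&>())),
                           std::string&&>::value,
              "move yields an rvalue reference");

// ... so nothing is moved unless the caller does something with the result.
inline std::string
move_without_consuming()
{
  std::string s = "heap";
  std::string&& r = std::move(s);  // just a cast; s still holds "heap"
  (void) r;
  return s;
}
```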

* include/std/type_traits: Doxygen improvements.
* include/bits/move.h: Likewise.
* include/tr1/type_traits:  Likewise.
* include/tr2/type_traits:  Likewise.
* testsuite/20_util/declval/requirements/1_neg.cc: Adjust dg-error
line numbers
* testsuite/20_util/forward/c_neg.cc: Likewise.
* testsuite/20_util/forward/f_neg.cc: Likewise.
* testsuite/20_util/make_signed/requirements/typedefs_neg.cc:
Likewise.
* testsuite/20_util/make_unsigned/requirements/typedefs_neg.cc:
Likewise.

Tested x86-64-linux with "make check" and "make doc-html" - committed to trunk.
Index: include/std/type_traits
===
--- include/std/type_traits (revision 181992)
+++ include/std/type_traits (revision 181993)
@@ -1,4 +1,4 @@
-// C++0x type_traits -*- C++ -*-
+// C++11 type_traits -*- C++ -*-
 
 // Copyright (C) 2007, 2008, 2009, 2010, 2011 Free Software Foundation, Inc.
 //
@@ -42,7 +42,13 @@ namespace std _GLIBCXX_VISIBILITY(defaul
 _GLIBCXX_BEGIN_NAMESPACE_VERSION
 
   /**
-   * @addtogroup metaprogramming
+   * @defgroup metaprogramming Metaprogramming and type traits
+   * @ingroup utilities
+   *
+   * Template utilities for compile-time introspection and modification,
+   * including type classification traits, type property inspection traits
+   * and type transformation traits.
+   *
* @{
*/
 
@@ -56,10 +62,10 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
   constexpr operator value_type() { return value; }
 };
   
-  /// typedef for true_type
+  /// The type used as a compile-time boolean with true value.
   typedef integral_constant true_type;
 
-  /// typedef for false_type
+  /// The type used as a compile-time boolean with false value.
   typedef integral_constantfalse_type;
 
   template
@@ -451,7 +457,6 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
 struct is_compound
 : public integral_constant::value> { };
 
-  /// is_member_pointer
   template
 struct __is_member_pointer_helper
 : public false_type { };
@@ -460,6 +465,7 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
 struct __is_member_pointer_helper<_Tp _Cp::*>
 : public true_type { };
 
+  /// is_member_pointer
   template
 struct is_member_pointer
 : public integral_constant
 { };
 
-  /// is_trivially_copyable (still unimplemented)
+  // is_trivially_copyable (still unimplemented)
 
   /// is_standard_layout
   template
@@ -564,6 +570,10 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
   template
 struct add_rvalue_reference;
 
+  /**
+   *  @brief  Utility to simplify expressions used in unevaluated operands
+   *  @ingroup utilities
+   */
   template
 typename add_rvalue_reference<_Tp>::type declval() noexcept;
 
@@ -1702,9 +1712,8 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
 };
 
 
-  // Define a nested type if some predicate holds.
   // Primary template.
-  /// enable_if
+  /// Define a member typedef @c type only if a boolean constant is true.
   template
 struct enable_if 
 { };
@@ -1715,9 +1724,8 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
 { typedef _Tp type; };
 
 
-  // A conditional expression, but for types. If true, first, if false, second.
   // Primary template.
-  /// conditional
+  /// Define a member typedef @c type to one of two argument types.
   template
 struct conditional
 { typedef _Iftrue type; };
@@ -1747,14 +1755,13 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
 common_type::type, _Vp...>::type type;
 };
 
-  /// underlying_type
+  /// The underlying type of an enum.
   template
 struct underlying_type
 {
   typedef __underlying_type(_Tp) type;
 };
 
-  /// declval
   template
 struct __declval_protector
 {
@@ -1892,7 +1899,7 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
::type>::value>  \
 { };
 
-  // @} group metaprogramming
+  /// @} group metaprogramming
 _GLIBCXX_END_NAMESPACE_VERSION
 } // namespace
 
Index: include/bits/move.h
===
--- include/bits/move.h (revision 181992)
+++ include/bits/move.h (revision 181993)
@@ -38,6 +38,10 @@ namespace std _GLIBCXX_VISIBILITY(defaul
 _GLIBCXX_BEGIN_NAMESPACE_VERSION
 
   // Used, in C++03 mode too, by allocators, etc.
+  /**
+   *  @brief Same as C++11 std::addressof
+   *  @ingroup utilities
+   */
   template
 inline _Tp*
 __addressof(_Tp& __r) _GLIBCXX_NOEXCEPT
@@

Re: [PATCH] pass -no_pie to LINK_GCC_C_SEQUENCE_SPEC on darwin

2011-12-04 Thread Jack Howarth
On Sat, Dec 03, 2011 at 10:45:18PM -0800, Mike Stump wrote:
> On Dec 3, 2011, at 7:25 AM, Jack Howarth wrote:
> > FSF gcc currently doesn't handle -fno-pie and friends properly under Lion.
> > The darwin11 linker now defaults to -pie
> 
> > Okay for gcc trunk and backports to gcc-4_5-branch/gcc-4_6-branch?
> 
> Ok.

Mike,
   Thanks for the commit. This leaves us with the boehm-gc testsuite failures...

FAIL: boehm-gc.c/gctest.c -O2 execution test
FAIL: boehm-gc.c/leak_test.c -O2 execution test
FAIL: boehm-gc.c/thread_leak_test.c -O2 execution test
FAIL: boehm-gc.lib/staticrootstest.c -O2 execution test

at -m32/-m64 on x86_64-apple-darwin11 due to the -pie linker default. Iain had
wanted to leave these in place to encourage boehm-gc to be fixed but I doubt
that is a realistic goal in the near/middle term. Perhaps we could patch
boehm-gc/testsuite/lib/boehm-gc.exp to pass -fno-pie on darwin (now that it is
functional)?
 Jack


Re: [PATCH] pass -no_pie to LINK_GCC_C_SEQUENCE_SPEC on darwin

2011-12-04 Thread Mike Stump
On Dec 4, 2011, at 9:09 AM, Jack Howarth  wrote:
> On Sat, Dec 03, 2011 at 10:45:18PM -0800, Mike Stump wrote:
>> On Dec 3, 2011, at 7:25 AM, Jack Howarth wrote:
>>> FSF gcc currently doesn't handle -fno-pie and friends properly under Lion.
>>> The darwin11 linker now defaults to -pie
>> 
>>> Okay for gcc trunk and backports to gcc-4_5-branch/gcc-4_6-branch?
>> 
>> Ok.
> 
> Mike,
>   Thanks for the commit. This leaves us with the boehm-gc testsuite failures...
> 
> FAIL: boehm-gc.c/gctest.c -O2 execution test
> FAIL: boehm-gc.c/leak_test.c -O2 execution test
> FAIL: boehm-gc.c/thread_leak_test.c -O2 execution test
> FAIL: boehm-gc.lib/staticrootstest.c -O2 execution test
> 
> at -m32/-m64 on x86_64-apple-darwin11 due to the -pie linker default. Iain had
> wanted to leave these in place to encourage boehm-gc to be fixed but I doubt
> that is a realistic goal in the near/middle term. Perhaps we could patch
> boehm-gc/testsuite/lib/boehm-gc.exp to pass -fno-pie on darwin (now that it is
> functional)?

I think we should just find a way to add -fno-pie...  Are there any flags that 
are added because we are doing gc that we can key off of?


Re: Yet another issue with gcc current trunk with ada on cygwin: s-tpoaal.adb:60:13: "Specific" is undefined (more references follow)

2011-12-04 Thread Eric Botcazou
>   Actually, the real problem is that the Cygwin-targeted version of gnat
> shouldn't need those definitions in the first place.  Cygwin provides a
> fairly complete Linux/Posix feature-set, and doing an end-run around it by
> using the underlying winsock API isn't usually a good idea, so I think that
> the better solution is to switch over to the standard berksock
> implementation.

This sounds fine.

>   * gcc-interface/Makefile.in (WIN_TARG_SUFFIX): New variable, used by
>   windows targets only to differentiate between MinGW and Cygwin.
>   (LIBGNAT_TARGET_PAIRS [windows targets]): Correctly detect cygwin,
>   which no longer has the '32' suffix, and use WIN_TARG_SUFFIX to choose
>   appropriate implementations of the sockets and memory packages.

Ouch. :-)  Please do something like this:

  # On Cygwin, we use the default version of s-memory and g-socthi because
  # . 
  ifeq ($(strip $(filter-out cygwin%,$(osys))),)
LIBGNAT_TARGET_PAIRS = \
  s-memory.adb [...]

>   * sysdep.c (WIN_SETMODE): New define to choose the correct spelling of
>   setmode/_setmode for MinGW and Cygwin, respectively.
>   (__gnat_set_binary_mode [windows targets]): Use the above, and enable
>   the windows version for Cygwin as well as MinGW.
>   (__gnat_set_text_mode [windows targets]): Likewise.
>   (__gnat_ttyname [windows targets]): Provide a Cygwin implementation
>   in addition to the MinGW version.
>   (__gnat_is_windows_xp): Make available to Cygwin as well as MinGW.
>   (__gnat_get_stack_bounds): Likewise.

OK with the above change, thanks for fixing this.

-- 
Eric Botcazou


[Patch, Fortran] PR51407 - allow BOZ edit descriptors for REAL/COMPLEX

2011-12-04 Thread Tobias Burnus

Hi all,

as Dominique has found, Fortran 2008 now also allows the BOZ edit descriptors
with REAL and COMPLEX arguments. (See the PR for quotes from the standard.)
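
As a rough illustration of the run-time check this adds, here is a
self-contained sketch of the two tests the patch places in front of each
B/O/Z transfer. It is a model only: the enum values and the names is_numeric
and boz_item_accepted are hypothetical stand-ins, and the real code in
libgfortran reports a format error instead of returning a flag.

```cpp
#include <cassert>

// Hypothetical stand-ins for the libgfortran names; the values are
// illustrative and do not match the real headers.
enum bt { BT_INTEGER, BT_LOGICAL, BT_CHARACTER, BT_REAL, BT_COMPLEX };
enum { STD_F2008 = 1 << 0, STD_GNU = 1 << 1 };

// Same type test as the new require_numeric_type function.
inline bool is_numeric(bt t)
{ return t == BT_INTEGER || t == BT_REAL || t == BT_COMPLEX; }

// Mirrors the two checks made before read_radix, write_b and friends.
inline bool boz_item_accepted(int allow_std, bt actual)
{
  // Without GNU extensions, the item must at least be numeric.
  if (!(allow_std & STD_GNU) && !is_numeric(actual))
    return false;
  // Without Fortran 2008, only INTEGER items are accepted.
  if (!(allow_std & STD_F2008) && actual != BT_INTEGER)
    return false;
  return true;
}
```

With only the F2008 bit set, a REAL item is accepted and a CHARACTER item is
rejected; with neither bit set, only INTEGER passes.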


Built and regtested on x86-64-linux.
OK for the trunk?

Tobias

PS: Thank you, Mikael, for reviewing my ASSOCIATE patch!
2011-12-04  Tobias Burnus  

	PR fortran/51407
	* io/transfer.c (require_numeric_type): New function.
	(formatted_transfer_scalar_read, formatted_transfer_scalar_write):
	Use it, allow BOZ edit descriptors with F2008.

2011-12-04  Tobias Burnus  

	PR fortran/51407
	* gfortran.dg/io_real_boz_3.f90: New.
	* gfortran.dg/io_real_boz_4.f90: New.
	* gfortran.dg/io_real_boz_5.f90: New.

diff --git a/libgfortran/io/transfer.c b/libgfortran/io/transfer.c
index 976102f..f71e96f 100644
--- a/libgfortran/io/transfer.c
+++ b/libgfortran/io/transfer.c
@@ -1063,6 +1063,25 @@ require_type (st_parameter_dt *dtp, bt expected, bt actual, const fnode *f)
 }
 
 
+static int
+require_numeric_type (st_parameter_dt *dtp, bt actual, const fnode *f)
+{
+#define BUFLEN 100
+  char buffer[BUFLEN];
+
+  if (actual == BT_INTEGER || actual == BT_REAL || actual == BT_COMPLEX)
+return 0;
+
+  /* Adjust item_count before emitting error message.  */
+  snprintf (buffer, BUFLEN, 
+	"Expected numeric type for item %d in formatted transfer, got %s",
+	dtp->u.p.item_count - 1, type_name (actual));
+
+  format_error (dtp, f, buffer);
+  return 1;
+}
+
+
 /* This function is in the main loop for a formatted data transfer
statement.  It would be natural to implement this as a coroutine
with the user program, but C makes that awkward.  We loop,
@@ -1147,6 +1166,9 @@ formatted_transfer_scalar_read (st_parameter_dt *dtp, bt type, void *p, int kind
 	  if (n == 0)
 	goto need_read_data;
 	  if (!(compile_options.allow_std & GFC_STD_GNU)
+	  && require_numeric_type (dtp, type, f))
+	return;
+	  if (!(compile_options.allow_std & GFC_STD_F2008)
   && require_type (dtp, BT_INTEGER, type, f))
 	return;
 	  read_radix (dtp, f, p, kind, 2);
@@ -1156,6 +1178,9 @@ formatted_transfer_scalar_read (st_parameter_dt *dtp, bt type, void *p, int kind
 	  if (n == 0)
 	goto need_read_data; 
 	  if (!(compile_options.allow_std & GFC_STD_GNU)
+	  && require_numeric_type (dtp, type, f))
+	return;
+	  if (!(compile_options.allow_std & GFC_STD_F2008)
   && require_type (dtp, BT_INTEGER, type, f))
 	return;
 	  read_radix (dtp, f, p, kind, 8);
@@ -1165,6 +1190,9 @@ formatted_transfer_scalar_read (st_parameter_dt *dtp, bt type, void *p, int kind
 	  if (n == 0)
 	goto need_read_data;
 	  if (!(compile_options.allow_std & GFC_STD_GNU)
+	  && require_numeric_type (dtp, type, f))
+	return;
+	  if (!(compile_options.allow_std & GFC_STD_F2008)
   && require_type (dtp, BT_INTEGER, type, f))
 	return;
 	  read_radix (dtp, f, p, kind, 16);
@@ -1548,6 +1576,9 @@ formatted_transfer_scalar_write (st_parameter_dt *dtp, bt type, void *p, int kin
 	  if (n == 0)
 	goto need_data;
 	  if (!(compile_options.allow_std & GFC_STD_GNU)
+	  && require_numeric_type (dtp, type, f))
+	return;
+	  if (!(compile_options.allow_std & GFC_STD_F2008)
   && require_type (dtp, BT_INTEGER, type, f))
 	return;
 	  write_b (dtp, f, p, kind);
@@ -1557,6 +1588,9 @@ formatted_transfer_scalar_write (st_parameter_dt *dtp, bt type, void *p, int kin
 	  if (n == 0)
 	goto need_data; 
 	  if (!(compile_options.allow_std & GFC_STD_GNU)
+	  && require_numeric_type (dtp, type, f))
+	return;
+	  if (!(compile_options.allow_std & GFC_STD_F2008)
   && require_type (dtp, BT_INTEGER, type, f))
 	return;
 	  write_o (dtp, f, p, kind);
@@ -1566,6 +1600,9 @@ formatted_transfer_scalar_write (st_parameter_dt *dtp, bt type, void *p, int kin
 	  if (n == 0)
 	goto need_data;
 	  if (!(compile_options.allow_std & GFC_STD_GNU)
+	  && require_numeric_type (dtp, type, f))
+	return;
+	  if (!(compile_options.allow_std & GFC_STD_F2008)
   && require_type (dtp, BT_INTEGER, type, f))
 	return;
 	  write_z (dtp, f, p, kind);
--- /dev/null	2011-12-04 08:20:24.719594993 +0100
+++ gcc/gcc/testsuite/gfortran.dg/io_real_boz_3.f90	2011-12-04 17:18:46.0 +0100
@@ -0,0 +1,34 @@
+! { dg-do  run }
+! { dg-options "-std=f2008" }
+!
+! PR fortran/51407
+!
+! Fortran 2008 allows BOZ edit descriptors for real/complex.
+!
+   real(kind=4) :: x
+   complex(kind=4) :: z
+   character(len=64) :: str1
+
+   x = 1.0_16 + 2.0_16**(-105)
+   z = cmplx (1.0, 2.0)
+
+   write (str1,'(b32)') x
+   read (str1,'(b32)') x
+   write (str1,'(o32)') x
+   read (str1,'(o32)') x
+   write (str1,'(z32)') x
+   read (str1,'(z32)') x
+   write (str1,'(b0)') x
+   write (str1,'(o0)') x
+   write (str1,'(z0)') x
+
+   write (str1,'(2b32)') z
+   read (str1,'(2b32)') z
+   write (str1,'(2o32)') z
+   read (str1,'(2o32)') z
+   write 

Re: Fix ICE in call to out-of-line __sync_lock_test_and_set

2011-12-04 Thread Richard Henderson
On 12/04/2011 03:07 AM, Richard Sandiford wrote:
>   * optabs.c (maybe_emit_sync_lock_test_and_set): Pass a null target
>   to emit_library_call_value.
>   (expand_atomic_compare_and_swap): Likewise.

Ok.


r~


[C++ Patch] PR 51404

2011-12-04 Thread Paolo Carlini

Hi,

for this ICE on invalid (a 4.7 regression), the idea is simply to return
error_mark_node early from build_functional_cast after emitting the error,
as is done for all the other error conditions explicitly handled there,
instead of setting type = error_mark_node.


The catch is, for testcases like auto25.C:

template<int> struct A
{
  int a[auto(1)]; // { dg-error "invalid use of" }
};

we don't want to add a redundant error message saying that the array
bound is not an integer constant. Hence the tweak to
cp_parser_direct_declarator. The patch as-is was tested on x86_64-linux
without regressions.
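
To illustrate the pattern outside the compiler, here is a minimal,
self-contained C analogue. It is a sketch, not GCC code: `struct node`,
`ERROR_MARK`, `report`, `build_cast` and `check_bound` are hypothetical
names standing in for tree nodes, error_mark_node, error (),
build_functional_cast and the array-bound check in
cp_parser_direct_declarator. It shows why returning the sentinel early, and
testing for it before issuing the second diagnostic, leaves exactly one
error message.

```c
#include <stddef.h>
#include <stdio.h>

/* Hypothetical stand-in for error_mark_node: a unique sentinel meaning
   "an error has already been reported for this expression".  */
struct node { int value; };
static struct node error_mark = { 0 };
#define ERROR_MARK (&error_mark)

static int diagnostics_emitted = 0;

static void
report (const char *msg)
{
  fprintf (stderr, "error: %s\n", msg);
  diagnostics_emitted++;
}

/* Analogue of build_functional_cast: diagnose the invalid use and return
   the sentinel at once instead of carrying on with a poisoned type.  */
static struct node *
build_cast (struct node *type, int type_is_auto)
{
  if (type_is_auto)
    {
      report ("invalid use of 'auto'");
      return ERROR_MARK;
    }
  return type;
}

/* Analogue of the array-bound check: complain about a non-constant bound
   only if the bound is not already the error sentinel.  */
static struct node *
check_bound (struct node *bound, int is_constant)
{
  if (is_constant)
    return bound;
  if (bound != ERROR_MARK)
    {
      report ("array bound is not an integer constant");
      return ERROR_MARK;
    }
  return bound;
}
```

Feeding the result of a failed build_cast into check_bound leaves
diagnostics_emitted at 1, which is the behaviour wanted for auto25.C.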


Thanks,
Paolo.


/cp
2011-12-04  Paolo Carlini  

PR c++/51404
* typeck2.c (build_functional_cast): Early return error_mark_node
for invalid uses of 'auto'.
* parser.c (cp_parser_direct_declarator): When non_constant_p
and cp_parser_constant_expression returns error do not produce
further diagnostic for the bound.

/testsuite
2011-12-04  Paolo Carlini  

PR c++/51404
* g++.dg/cpp0x/auto28.C: New.

Index: testsuite/g++.dg/cpp0x/auto28.C
===
--- testsuite/g++.dg/cpp0x/auto28.C (revision 0)
+++ testsuite/g++.dg/cpp0x/auto28.C (revision 0)
@@ -0,0 +1,4 @@
+// PR c++/51404
+// { dg-options -std=c++0x }
+
+int i = auto().x;  // { dg-error "invalid use of" }
Index: cp/typeck2.c
===
--- cp/typeck2.c(revision 181984)
+++ cp/typeck2.c(working copy)
@@ -1653,7 +1653,7 @@ build_functional_cast (tree exp, tree parms, tsubs
 {
   if (complain & tf_error)
error ("invalid use of %<auto%>");
-  type = error_mark_node;
+  return error_mark_node;
 }
 
   if (processing_template_decl)
Index: cp/parser.c
===
--- cp/parser.c (revision 181984)
+++ cp/parser.c (working copy)
@@ -16055,23 +16055,26 @@ cp_parser_direct_declarator (cp_parser* parser,
 &non_constant_p);
  if (!non_constant_p)
/* OK */;
- /* Normally, the array bound must be an integral constant
-expression.  However, as an extension, we allow VLAs
-in function scopes as long as they aren't part of a
-parameter declaration.  */
- else if (!parser->in_function_body
-  || current_binding_level->kind == sk_function_parms)
+ else if (!error_operand_p (bounds))
{
- cp_parser_error (parser,
-  "array bound is not an integer constant");
- bounds = error_mark_node;
+ /* Normally, the array bound must be an integral constant
+expression.  However, as an extension, we allow VLAs
+in function scopes as long as they aren't part of a
+parameter declaration.  */
+ if (!parser->in_function_body
+ || current_binding_level->kind == sk_function_parms)
+   {
+ cp_parser_error (parser, "array bound is not an "
+  "integer constant");
+ bounds = error_mark_node;
+   }
+ else if (processing_template_decl)
+   {
+ /* Remember this wasn't a constant-expression.  */
+ bounds = build_nop (TREE_TYPE (bounds), bounds);
+ TREE_SIDE_EFFECTS (bounds) = 1;
+   }
}
- else if (processing_template_decl && !error_operand_p (bounds))
-   {
- /* Remember this wasn't a constant-expression.  */
- bounds = build_nop (TREE_TYPE (bounds), bounds);
- TREE_SIDE_EFFECTS (bounds) = 1;
-   }
}
  else
bounds = NULL_TREE;


[coverage] Some restructuring

2011-12-04 Thread Nathan Sidwell
I've committed this patch to break apart the gcov finalization routines. I
believe this will make it easier to fix the problem shown up by bug 51113,
although this patch does not fix it.  Notable changes:


* rename coverage_begin_output to coverage_begin_function for consistency with 
coverage_end_function


* Only call it once from the profile code -- I suspect in the past it was called 
from several locations where it wasn't possible to statically determine which 
would be the first call.  Now it's obvious.


* Move the opening of the notes file to the coverage initialization.  This does
mean that we'll always create a notes file when generating coverage data, even
if there are no functions in the resultant object file.  I don't think that's
surprising, and it will stop gcov itself complaining about a missing notes file
in this case.


* Replace the trailing array of the gcov_info object with a pointer to an array.
This doesn't require changes to libgcov.c's source, but it will of course change
the object file it compiles to, and so constitutes an ABI change.  This
change allows the creation of the gcov_info data type without knowing the number
of functions being instrumented.
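
A rough sketch of what the last change means for the data layout, using
simplified stand-in types (`fn_info`, `info_trailing`, `info_pointer` and
the helper functions are hypothetical; the real struct gcov_info has more
fields, and the old code used a GNU zero-length array where this sketch
uses a C99 flexible array member). With the trailing array, the object's
size depends on the function count; with the pointer, the header has a
fixed size and the array can be built separately.

```c
#include <stddef.h>

/* Simplified stand-in for struct gcov_fn_info.  */
struct fn_info { unsigned ident; };

/* Old layout: the function pointers trail the header, so the size of
   the object is only known once n_functions is known.  */
struct info_trailing
{
  unsigned n_functions;
  const struct fn_info *functions[];   /* flexible array member */
};

/* New layout: fixed-size header pointing at a separately built array.  */
struct info_pointer
{
  unsigned n_functions;
  const struct fn_info *const *functions;
};

/* Bytes needed for the old layout with N functions.  */
static size_t
trailing_size (unsigned n)
{
  return offsetof (struct info_trailing, functions)
	 + n * sizeof (const struct fn_info *);
}

/* Bytes needed for the new layout: independent of N.  */
static size_t
pointer_size (unsigned n)
{
  (void) n;
  return sizeof (struct info_pointer);
}

/* Walk the functions through the pointer-based layout.  */
static unsigned
sum_idents (const struct info_pointer *info)
{
  unsigned sum = 0;
  for (unsigned i = 0; i < info->n_functions; i++)
    sum += info->functions[i]->ident;
  return sum;
}
```

Code that indexes `functions[i]` reads the same in both layouts, which is
consistent with libgcov.c's source not needing changes even though the
binary layout differs.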


tested on i686-pc-linux-gnu with profiledbootstrap.

nathan
2011-12-04  Nathan Sidwell  

* gcov-io.h (struct gcov_info): Replace trailing array with
pointer to array.
* profile.c (branch_prob): Only call renamed
coverage_begin_function once.
* coverage.h (coverage_begin_output): Rename to ...
(coverage_begin_function): ... here.
* coverage.c (struct function_list): Rename to ...
(struct coverage_data): ... this.  Update all uses.
(gcov_info_var, gcov_fn_info_type, gcov_fn_info_ptr_type): New
globals.
(bbg_file_opened, bbg_function_announced): Remove.
(get_coverage_counts): Adjust message.
(coverage_begin_output): Rename to ...
(coverage_begin_function): ... here.  Move file opening to
coverage_init.  Adjust for being called only once.
(coverage_end_function): Remove bbg file and inhibit further
output here on error.
(build_info_type): Adjust for change to pointer to array.
(build_info): Receive array of function pointers and adjust.
(create_coverage): Break into ...
(coverage_obj_init, coverage_obj_fn, coverage_obj_finish):
... these, and adjust.
(coverage_init): Open the notes file here.  Tidy.
(coverage_finish): Call coverage_obj_init etc.

Index: gcov-io.h
===
--- gcov-io.h   (revision 181984)
+++ gcov-io.h   (working copy)
@@ -448,8 +448,8 @@ struct gcov_info
  unused) */
   
   unsigned n_functions;/* number of functions */
-  const struct gcov_fn_info *functions[0]; /* pointers to function
- information  */
+  const struct gcov_fn_info *const *functions; /* pointer to pointers
+ to function information  */
 };
 
 /* Register a new object file module.  */
Index: profile.c
===
--- profile.c   (revision 181984)
+++ profile.c   (working copy)
@@ -1110,30 +1110,25 @@ branch_prob (void)
   lineno_checksum = coverage_compute_lineno_checksum ();
 
   /* Write the data from which gcov can reconstruct the basic block
- graph.  */
+ graph and function line numbers  */
 
-  /* Basic block flags */
-  if (coverage_begin_output (lineno_checksum, cfg_checksum))
+  if (coverage_begin_function (lineno_checksum, cfg_checksum))
 {
   gcov_position_t offset;
 
+  /* Basic block flags */
   offset = gcov_write_tag (GCOV_TAG_BLOCKS);
   for (i = 0; i != (unsigned) (n_basic_blocks); i++)
gcov_write_unsigned (0);
   gcov_write_length (offset);
-}
-
-   /* Keep all basic block indexes nonnegative in the gcov output.
-  Index 0 is used for entry block, last index is for exit block.
-  */
-  ENTRY_BLOCK_PTR->index = 1;
-  EXIT_BLOCK_PTR->index = last_basic_block;
 
-  /* Arcs */
-  if (coverage_begin_output (lineno_checksum, cfg_checksum))
-{
-  gcov_position_t offset;
+  /* Keep all basic block indexes nonnegative in the gcov output.
+Index 0 is used for entry block, last index is for exit
+block.*/
+  ENTRY_BLOCK_PTR->index = 1;
+  EXIT_BLOCK_PTR->index = last_basic_block;
 
+  /* Arcs */
   FOR_BB_BETWEEN (bb, ENTRY_BLOCK_PTR, EXIT_BLOCK_PTR, next_bb)
{
  edge e;
@@ -1168,11 +1163,11 @@ branch_prob (void)
 
  gcov_write_length (offset);
}
-}
 
-  /* Line numbers.  */
-  if (coverage_begin_output (lineno_checksum, cfg_checksum))
-{
+  ENTRY_BLOCK_PTR->index = ENTRY_BLOCK;
+  EXIT_BLOCK_PTR->index = EXIT_BLOCK;
+
+  /* Line numbers

Re: [PATCH] pass -no_pie to LINK_GCC_C_SEQUENCE_SPEC on darwin

2011-12-04 Thread Jack Howarth
On Sun, Dec 04, 2011 at 09:28:55AM -0800, Mike Stump wrote:
> On Dec 4, 2011, at 9:09 AM, Jack Howarth  wrote:
> > On Sat, Dec 03, 2011 at 10:45:18PM -0800, Mike Stump wrote:
> >> On Dec 3, 2011, at 7:25 AM, Jack Howarth wrote:
> >>> FSF gcc currently doesn't handle -fno-pie and friends properly under Lion.
> >>> The darwin11 linker now defaults to -pie
> >> 
> >>> Okay for gcc trunk and backports to gcc-4_5-branch/gcc-4_6-branch?
> >> 
> >> Ok.
> > 
> > Mike,
> >   Thanks for the commit. This leaves us with the boehm-gc testsuite 
> > failures...
> > 
> > FAIL: boehm-gc.c/gctest.c -O2 execution test
> > FAIL: boehm-gc.c/leak_test.c -O2 execution test
> > FAIL: boehm-gc.c/thread_leak_test.c -O2 execution test
> > FAIL: boehm-gc.lib/staticrootstest.c -O2 execution test
> > 
> > at -m32/-m64 on x86_64-apple-darwin11 due to the -pie linker default. Iain 
> > had wanted
> > to leave these in place to encourage boehm-gc to be fixed but I doubt that 
> > is a realistic
> > goal in the near/middle term. Perhaps we could patch 
> > boehm-gc/testsuite/lib/boehm-gc.exp
> > to pass -fno-pie on darwin (now that it is functional)?
> 
> I think we should just find a way to add -fno-pie...  Are there any flags 
> that are added because we are doing gc that we can key off of?

Mike,
   The simple approach would be...

Index: boehm-gc/testsuite/boehm-gc.c/c.exp
===
--- boehm-gc/testsuite/boehm-gc.c/c.exp (revision 181993)
+++ boehm-gc/testsuite/boehm-gc.c/c.exp (working copy)
@@ -17,6 +17,6 @@
 dg-init
 boehm-gc-init
 
-boehm-gc-dg-runtest [lsort [glob -nocomplain $srcdir/$subdir/*.c]] "-O2" ""
+boehm-gc-dg-runtest [lsort [glob -nocomplain $srcdir/$subdir/*.c]] "-O2 -fno-pie" ""
 
 dg-finish
Index: boehm-gc/testsuite/boehm-gc.lib/lib.exp
===
--- boehm-gc/testsuite/boehm-gc.lib/lib.exp (revision 181993)
+++ boehm-gc/testsuite/boehm-gc.lib/lib.exp (working copy)
@@ -21,6 +21,6 @@ boehm-gc-init
 set tests [lsort [glob -nocomplain $srcdir/$subdir/*.c]]
 set tests [prune $tests $srcdir/$subdir/*lib.c]
 
-boehm-gc-dg-runtest $tests "-O2" ""
+boehm-gc-dg-runtest $tests "-O2 -fno-pie" ""
 
 dg-finish

which yields...

=== boehm-gc tests ===

Schedule of variations:
unix/-m32
unix/-m64

Running target unix/-m32
Using /sw/share/dejagnu/baseboards/unix.exp as board description file for target.
Using /sw/share/dejagnu/config/unix.exp as generic interface file for target.
Using /sw/src/fink.build/gcc47-4.7.0-1/gcc-4.7-20111202/boehm-gc/testsuite/config/default.exp as tool-and-target-specific interface file.
Running /sw/src/fink.build/gcc47-4.7.0-1/gcc-4.7-20111202/boehm-gc/testsuite/boehm-gc.c/c.exp ...
Running /sw/src/fink.build/gcc47-4.7.0-1/gcc-4.7-20111202/boehm-gc/testsuite/boehm-gc.lib/lib.exp ...

=== boehm-gc Summary for unix/-m32 ===

# of expected passes            12
# of unsupported tests          1
Running target unix/-m64
Using /sw/share/dejagnu/baseboards/unix.exp as board description file for target.
Using /sw/share/dejagnu/config/unix.exp as generic interface file for target.
Using /sw/src/fink.build/gcc47-4.7.0-1/gcc-4.7-20111202/boehm-gc/testsuite/config/default.exp as tool-and-target-specific interface file.
Running /sw/src/fink.build/gcc47-4.7.0-1/gcc-4.7-20111202/boehm-gc/testsuite/boehm-gc.c/c.exp ...
Running /sw/src/fink.build/gcc47-4.7.0-1/gcc-4.7-20111202/boehm-gc/testsuite/boehm-gc.lib/lib.exp ...

=== boehm-gc Summary for unix/-m64 ===

# of expected passes            12
# of unsupported tests          1

=== boehm-gc Summary ===

# of expected passes            24
# of unsupported tests          2

I would argue that this is useful in that it reminds the developers that
boehm-gc isn't PIE friendly and needs to be fixed.
 Jack


Re: [PATCH] pass -no_pie to LINK_GCC_C_SEQUENCE_SPEC on darwin

2011-12-04 Thread Jakub Jelinek
On Sun, Dec 04, 2011 at 02:00:20PM -0500, Jack Howarth wrote:
> > > at -m32/-m64 on x86_64-apple-darwin11 due to the -pie linker default. 
> > > Iain had wanted
> > > to leave these in place to encourage boehm-gc to be fixed but I doubt 
> > > that is a realistic
> > > goal in the near/middle term. Perhaps we could patch 
> > > boehm-gc/testsuite/lib/boehm-gc.exp
> > > to pass -fno-pie on darwin (now that it is functional)?
> > 
> > I think we should just find a way to add -fno-pie...  Are there any
> > flags that are added because we are doing gc that we can key off of?

-f{pic,PIC,pie,PIE,no-pic} aren't options that should have any effect on how
binaries/shared libraries are linked; these options control solely
compilation.  -shared, -pie or the lack of these options determines how
things are linked.  So, either you should pass -no-pie or whatever linker option
you need to generate position dependent binaries by default, unless -shared
or -pie is specified, or you should add -no-pie or something similar, but
IMHO it shouldn't be -fno-pie, which is a compilation option (or too similar
to one).

Jakub


Re: [PATCH] pass -no_pie to LINK_GCC_C_SEQUENCE_SPEC on darwin

2011-12-04 Thread Jack Howarth
On Sun, Dec 04, 2011 at 08:18:32PM +0100, Jakub Jelinek wrote:
> On Sun, Dec 04, 2011 at 02:00:20PM -0500, Jack Howarth wrote:
> > > > at -m32/-m64 on x86_64-apple-darwin11 due to the -pie linker default. 
> > > > Iain had wanted
> > > > to leave these in place to encourage boehm-gc to be fixed but I doubt 
> > > > that is a realistic
> > > > goal in the near/middle term. Perhaps we could patch 
> > > > boehm-gc/testsuite/lib/boehm-gc.exp
> > > > to pass -fno-pie on darwin (now that it is functional)?
> > > 
> > > I think we should just find a way to add -fno-pie...  Are there any
> > > flags that are added because we are doing gc that we can key off of?
> 
> -f{pic,PIC,pie,PIE,no-pic} aren't option that should have any effect on how
> are binaries/shared libraries linked, these options control solely
> compilation.  -shared, -pie or lack of these options determines how are
> things linked.  So, either you should pass -no-pie or whatever linker option
> you need to generate position dependent binaries by default, unless -shared
> or -pie is specified, or you should add -no-pie or something similar, but
> IMHO it shouldn't be -fno-pie, that is a compilation option/too similar to
> them.

Jakub,
   This isn't really an option on darwin11 and later since the linker defaults
to -pie and this results in warnings from the linker of the form...

ld: warning: PIE disabled. Absolute addressing (perhaps -mdynamic-no-pic) not 
allowed in code signed PIE, but used in _f from /var/tmp//ccVNy9V9.o. To fix 
this warning, don't compile with -mdynamic-no-pic or link with -Wl,-no_pie

Author: mrs
Date: Sun Dec  4 07:09:56 2011
New Revision: 181982

URL: http://gcc.gnu.org/viewcvs?root=gcc&view=rev&rev=181982
Log:
2011-12-03  Jack Howarth  

   * config/darwin10.h (LINK_GCC_C_SEQUENCE_SPEC):
 Pass -no_pie for non-PIC code when targeting 10.7 or later.

Modified:
trunk/gcc/ChangeLog
trunk/gcc/config/darwin10.h

That commit is designed to inhibit this linker noise by explicitly disabling
PIE linkage. Also, FSF gcc previously wasn't honoring -fno-pie, since it
didn't stop the darwin11 linker from creating a PIE executable.
   Unfortunately the linker option -no_pie isn't recognized by earlier darwin
linkers, which makes -fno-pie preferable since it only passes -no_pie to the
linker when targeting 10.7 or later. So we may have to make the change
darwin-specific.
Jack


> 
>   Jakub


Re: [Patch PPC/Darwin] some tidy-ups for save_world (and a prelude to splitting it out of the rs6000 code).

2011-12-04 Thread Iain Sandoe


On 4 Dec 2011, at 06:35, Mike Stump wrote:


> On Nov 30, 2011, at 6:28 AM, Iain Sandoe wrote:
>> While trying to track down the vector unwind problems on ppc-darwin,
>> I made some tidy-ups for "save_world()".
>> In the end, that was not where the main problem lay - but I did
>> find a few things wrong there on the way - they should be fixed,
>> even if there's no specific bug filed at present.
>>
>> I'm also attaching a second patch which is purely cosmetic
>> white-space/comment tidies I'd also like to apply.
>>
>> checked on powerpc-darwin{8-G4,9-G5} and crosses to powerpc-eabisim
>> and powerpc-ibm-aix6.1.3.0
>
>> -  /* On Darwin, the unwind routines are compiled without
>> - TARGET_ALTIVEC, and use save_world to save/restore the
>> - altivec registers when necessary.  */
>> +  /* When generating code for earlier versions of Darwin, which might run on
>> + hardware with or without Altivec, we use out-of-line save/restores in
>> + function prologues/epilogues that require it.  These routines determine
>> + whether to save/restore Altivec at runtime.  */
>
> So, we need to first settle on how the library is compiled before
> this becomes true...

The patch does not change any functionality here.
This was a ( clearly failed ;-) ) attempt to make the comment more
useful to newcomers to the code ...

(withdrawn, can think again sometime later).

>> +save_world:
>> +   trap
>> +
>> +   .private_extern eh_rest_world_r10
>> +eh_rest_world_r10:
>> +   trap
>
> So, I can't see the necessity of doing this.  I think it is bad
> style to generate code that will die at runtime.  I'd rather have it
> not link.


done - there would now be a link fail if it were specified.

(although I would point out, FTR, that the code would have always  
crashed if it had ever been invoked - since it was only m32 capable).


[[FWIW, I made a working m64 version, which we could resurrect should  
there be some reason we decided that out-of-line world saves were  
useful everywhere]]


OK for trunk?
Iain

Index: gcc/config/rs6000/rs6000.c
===
--- gcc/config/rs6000/rs6000.c  (revision 181990)
+++ gcc/config/rs6000/rs6000.c  (working copy)
@@ -19899,7 +19899,9 @@ rs6000_emit_prologue (void)
  used in this function, and do the corresponding magic in the
  epilogue.  */
 
-  if (TARGET_ALTIVEC && TARGET_ALTIVEC_VRSAVE
+  if (!WORLD_SAVE_P (info)
+  && TARGET_ALTIVEC
+  && TARGET_ALTIVEC_VRSAVE
   && info->vrsave_mask != 0)
 {
   rtx reg, mem, vrsave;
@@ -19915,15 +19917,12 @@ rs6000_emit_prologue (void)
   else
 emit_insn (gen_rtx_SET (VOIDmode, reg, vrsave));
 
-  if (!WORLD_SAVE_P (info))
-{
-  /* Save VRSAVE.  */
-  offset = info->vrsave_save_offset + sp_offset;
-  mem = gen_frame_mem (SImode,
-   gen_rtx_PLUS (Pmode, frame_reg_rtx,
- GEN_INT (offset)));
-  insn = emit_move_insn (mem, reg);
-}
+  /* Save VRSAVE.  */
+  offset = info->vrsave_save_offset + sp_offset;
+  mem = gen_frame_mem (SImode,
+  gen_rtx_PLUS (Pmode, frame_reg_rtx, 
+GEN_INT (offset)));
+  insn = emit_move_insn (mem, reg);
 
   /* Include the registers in the mask.  */
   emit_insn (gen_iorsi3 (reg, reg, GEN_INT ((int) info->vrsave_mask)));
Index: libgcc/config/rs6000/darwin-world.S
===
--- libgcc/config/rs6000/darwin-world.S (revision 181990)
+++ libgcc/config/rs6000/darwin-world.S (working copy)
@@ -24,6 +24,8 @@
  * <http://www.gnu.org/licenses/>.
  */ 
 
+#ifndef __ppc64__
+
.machine ppc7400
 .data
.align 2
@@ -33,12 +35,7 @@
 .non_lazy_symbol_pointer
 L_has_vec$non_lazy_ptr:
.indirect_symbol __cpu_has_altivec
-#ifdef __ppc64__
-   .quad   0
-#else
.long   0
-#endif
-
 #else
 
 /* For static, "pretend" we have a non-lazy-pointer.  */
@@ -57,12 +54,11 @@ L_has_vec$non_lazy_ptr:
provided by the System Framework to determine this.)
 
SAVE_WORLD takes R0 (the caller`s caller`s return address) and R11
-   (the stack frame size) as parameters.  It returns VRsave in R0 if
-   we`re on a CPU with vector regs.
+   (the stack frame size) as parameters.  It returns the updated VRsave
+   in R0 if we`re on a CPU with vector regs.
 
-   With gcc3, we now need to save and restore CR as well, since gcc3's
-   scheduled prologs can cause comparisons to be moved before calls to
-   save_world!
+   For gcc3 onward, we need to save and restore CR as well, since scheduled
+   prologs can cause comparisons to be moved before calls to save_world.
 
USES: R0 R11 R12  */
 
@@ -143,69 +139,62 @@ L$saveVMX:
stvx v30,r11,r12
mfspr r0,VRsave
li r11,-16
-   stvx v31,r11,r12
-   /* VRsave lives at -224(R1)  */
-   stw r0,0(r12)

Re: [PATCH] pass -no_pie to LINK_GCC_C_SEQUENCE_SPEC on darwin

2011-12-04 Thread Iain Sandoe


On 4 Dec 2011, at 20:19, Jack Howarth wrote:

> On Sun, Dec 04, 2011 at 08:18:32PM +0100, Jakub Jelinek wrote:
>> On Sun, Dec 04, 2011 at 02:00:20PM -0500, Jack Howarth wrote:
>>>>> at -m32/-m64 on x86_64-apple-darwin11 due to the -pie linker
>>>>> default. Iain had wanted
>>>>> to leave these in place to encourage boehm-gc to be fixed but I
>>>>> doubt that is a realistic
>>>>> goal in the near/middle term. Perhaps we could patch
>>>>> boehm-gc/testsuite/lib/boehm-gc.exp
>>>>> to pass -fno-pie on darwin (now that it is functional)?
>>>>
>>>> I think we should just find a way to add -fno-pie...  Are there any
>>>> flags that are added because we are doing gc that we can key off of?
>>
>> -f{pic,PIC,pie,PIE,no-pic} aren't option that should have any effect on how
>> are binaries/shared libraries linked, these options control solely
>> compilation.  -shared, -pie or lack of these options determines how are
>> things linked.  So, either you should pass -no-pie or whatever linker option
>> you need to generate position dependent binaries by default, unless -shared
>> or -pie is specified, or you should add -no-pie or something similar, but
>> IMHO it shouldn't be -fno-pie, that is a compilation option/too similar to
>> them.
>
> Jakub,
>   This isn't really an option on darwin11 and later since the linker
> defaults to -pie and this results in warnings from the linker of the form...
>
> ld: warning: PIE disabled. Absolute addressing (perhaps -mdynamic-no-pic)
> not allowed in code signed PIE, but used in _f from /var/tmp//ccVNy9V9.o.
> To fix this warning, don't compile with -mdynamic-no-pic or link with
> -Wl,-no_pie


Perhaps the question should be "why is this particular code being
built with non-default options?".
x86 darwin code should default to -fPIC, so someone must be passing
-mdynamic-no-pic or -fno-PIC etc.

(perhaps derived from the bootstrap usage of the "-mdynamic-no-pic"
option, which suggests that this should be disabled for Darwin >= 11).


Iain



Re: [RFC] Port libitm to powerpc

2011-12-04 Thread Richard Henderson
On 12/03/2011 09:20 AM, Iain Sandoe wrote:
> version 2 is a modification of your original:
> 
> a)  -FRAME+BASE(r1) cannot be guaranteed to be vec-aligned in general (it 
> isn't on m32 darwin)
> 
> ... so I've taken the liberty of rounding the gtm_buffer object and then 
> pointing r4 at  original_sp-rounded_size, which is what we want for the call 
> to GTM_begin_transaction anyway.

I've kept this in the version below, but I cannot see how that can be,
since your version of BASE is 8*WS = 64, a multiple of 16.

> b) I've added the CR etc. wrapped in  __MACH__ ifdefs.

Taken out of the ifdefs to be done everywhere.

> c) ";" is a comment introducer for Darwin's asm .. so I unwrapped those lines 
> ...

Sure.

> d) I put in the logic for handling __USER_LABEL_PREFIX__ .

Merged into the FUNC / END / HIDDEN macros.

> e) The real problem is finding a non-horrible way of dealing with the %r <=> 
> r issue - and I've not done that so far...

Dropped the %r entirely and switched to bare numbers, which is what the
compiler emits by default.  I kept the %[rfv] in the cfi directives though;
I assume that darwin simply doesn't have those and so it won't be an issue.
If worst comes to worst, we can map those to raw dwarf columns, but I thought
this was more readable in case we can keep them.

Give this a go.  The full tree is

  git://repo.or.cz/gcc/rth.git rth/tm-next

which is what I actually tested on ppc64-linux, but I've extracted the middle
three patches from that tree, which ought to apply to mainline.
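
For readers unfamiliar with point (d): the usual way to honour
__USER_LABEL_PREFIX__ is a two-level token-pasting macro, so that the
prefix is expanded before it is glued onto the symbol name. The following
is a small sketch in C rather than assembly; `SYM`, `CONCAT1`, `CONCAT2`,
`STR1`, `STR2`, `assembler_name` and `has_suffix` are names chosen for this
illustration and are not necessarily what the patch's FUNC / END / HIDDEN
macros look like.

```c
#include <string.h>

/* GCC predefines __USER_LABEL_PREFIX__ ("_" on Darwin, empty on most ELF
   targets).  Falling back to an empty prefix elsewhere is an assumption
   made only for this sketch.  */
#ifndef __USER_LABEL_PREFIX__
#define __USER_LABEL_PREFIX__
#endif

/* Two levels are needed: the outer macro expands its arguments, the
   inner one pastes or stringifies the already expanded tokens.  */
#define CONCAT2(a, b) a ## b
#define CONCAT1(a, b) CONCAT2 (a, b)
#define STR2(x) #x
#define STR1(x) STR2 (x)

/* The assembler-level name of a C symbol.  */
#define SYM(name) CONCAT1 (__USER_LABEL_PREFIX__, name)

/* Return the prefixed name as a string, for inspection.  */
static const char *
assembler_name (void)
{
  return STR1 (SYM (GTM_begin_transaction));
}

/* Return nonzero if S ends with SUFFIX.  */
static int
has_suffix (const char *s, const char *suffix)
{
  size_t ls = strlen (s), lx = strlen (suffix);
  return ls >= lx && strcmp (s + ls - lx, suffix) == 0;
}
```

In a preprocessed assembler source, SYM (name) would be written directly
wherever the label or call target appears.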


r~
diff --git a/libitm/config/generic/asmcfi.h b/libitm/config/generic/asmcfi.h
index 4344d6f..0727f41 100644
--- a/libitm/config/generic/asmcfi.h
+++ b/libitm/config/generic/asmcfi.h
@@ -1,4 +1,3 @@
-
 /* Copyright (C) 2011 Free Software Foundation, Inc.
Contributed by Richard Henderson .
 
@@ -32,6 +31,9 @@
 #define cfi_def_cfa_offset(n)  .cfi_def_cfa_offset n
 #define cfi_def_cfa(r,n)   .cfi_def_cfa r, n
 #define cfi_register(o,n)  .cfi_register o, n
+#define cfi_offset(r,o).cfi_offset r, o
+#define cfi_restore(r) .cfi_restore r
+#define cfi_undefined(r)   .cfi_undefined r
 
 #else
 
@@ -40,5 +42,8 @@
 #define cfi_def_cfa_offset(n)
 #define cfi_def_cfa(r,n)
 #define cfi_register(o,n)
+#define cfi_offset(r,o)
+#define cfi_restore(r)
+#define cfi_undefined(r)
 
 #endif /* HAVE_AS_CFI_PSEUDO_OP */
diff --git a/libitm/config/linux/powerpc/futex_bits.h b/libitm/config/linux/powerpc/futex_bits.h
new file mode 100644
index 000..5587fca
--- /dev/null
+++ b/libitm/config/linux/powerpc/futex_bits.h
@@ -0,0 +1,54 @@
+/* Copyright (C) 2011 Free Software Foundation, Inc.
+   Contributed by Richard Henderson .
+
+   This file is part of the GNU Transactional Memory Library (libitm).
+
+   Libitm is free software; you can redistribute it and/or modify it
+   under the terms of the GNU General Public License as published by
+   the Free Software Foundation; either version 3 of the License, or
+   (at your option) any later version.
+
+   Libitm is distributed in the hope that it will be useful, but WITHOUT ANY
+   WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS
+   FOR A PARTICULAR PURPOSE.  See the GNU General Public License for
+   more details.
+
+   Under Section 7 of GPL version 3, you are granted additional
+   permissions described in the GCC Runtime Library Exception, version
+   3.1, as published by the Free Software Foundation.
+
+   You should have received a copy of the GNU General Public License and
+   a copy of the GCC Runtime Library Exception along with this program;
+   see the files COPYING3 and COPYING.RUNTIME respectively.  If not, see
+   <http://www.gnu.org/licenses/>.  */
+
+#include <sys/syscall.h>
+
+static inline long
+sys_futex0 (int *addr, int op, int val)
+{
+  register long int r0  __asm__ ("r0");
+  register long int r3  __asm__ ("r3");
+  register long int r4  __asm__ ("r4");
+  register long int r5  __asm__ ("r5");
+  register long int r6  __asm__ ("r6");
+
+  r0 = SYS_futex;
+  r3 = (long) addr;
+  r4 = op;
+  r5 = val;
+  r6 = 0;
+
+  /* ??? The powerpc64 sysdep.h file clobbers ctr; the powerpc32 sysdep.h
+ doesn't.  It doesn't much matter for us.  In the interest of unity,
+ go ahead and clobber it always.  */
+
+  __asm volatile ("sc; mfcr %0"
+ : "=r"(r0), "=r"(r3), "=r"(r4), "=r"(r5), "=r"(r6)
+ : "r"(r0), "r"(r3), "r"(r4), "r"(r5), "r"(r6)
+ : "r7", "r8", "r9", "r10", "r11", "r12",
+   "cr0", "ctr", "memory");
+  if (__builtin_expect (r0 & (1 << 28), 0))
+return r3;
+  return 0;
+}
diff --git a/libitm/config/powerpc/cacheline.h b/libitm/config/powerpc/cacheline.h
new file mode 100644
index 000..e20cfec
--- /dev/null
+++ b/libitm/config/powerpc/cacheline.h
@@ -0,0 +1,38 @@
+/* Copyright (C) 2009, 2011 Free Software Foundation, Inc.
+   Contributed by Richard Henderson .
+
+   This file is part of the GNU Transactional Memory Library (libitm).
+
+   Libitm 

Re: [PATCH] pass -no_pie to LINK_GCC_C_SEQUENCE_SPEC on darwin

2011-12-04 Thread Jack Howarth
On Sun, Dec 04, 2011 at 08:35:07PM +, Iain Sandoe wrote:
>
> On 4 Dec 2011, at 20:19, Jack Howarth wrote:
>
>> On Sun, Dec 04, 2011 at 08:18:32PM +0100, Jakub Jelinek wrote:
>>> On Sun, Dec 04, 2011 at 02:00:20PM -0500, Jack Howarth wrote:
>> at -m32/-m64 on x86_64-apple-darwin11 due to the -pie linker  
>> default. Iain had wanted
>> to leave these in place to encourage boehm-gc to be fixed but I 
>> doubt that is a realistic
>> goal in the near/middle term. Perhaps we could patch boehm-gc/ 
>> testsuite/lib/boehm-gc.exp
>> to pass -fno-pie on darwin (now that it is functional)?
>
> I think we should just find a way to add -fno-pie...  Are there any
> flags that are added because we are doing gc that we can key off  
> of?
>>>
>>> -f{pic,PIC,pie,PIE,no-pic} aren't option that should have any effect 
>>> on how
>>> are binaries/shared libraries linked, these options control solely
>>> compilation.  -shared, -pie or lack of these options determines how  
>>> are
>>> things linked.  So, either you should pass -no-pie or whatever  
>>> linker option
>>> you need to generate position dependent binaries by default, unless  
>>> -shared
>>> or -pie is specified, or you should add -no-pie or something  
>>> similar, but
>>> IMHO it shouldn't be -fno-pie, that is a compilation option/too  
>>> similar to
>>> them.
>>
>> Jakub,
>>   This isn't really an option on darwin11 and later since the linker  
>> defaults to
>> -pie and this results in warnings from the linker of the form...
>>
>> ld: warning: PIE disabled. Absolute addressing (perhaps -mdynamic-no- 
>> pic) not allowed in code signed PIE, but used in _f from /var/tmp// 
>> ccVNy9V9.o. To fix this warning, don't compile with -mdynamic-no-pic  
>> or link with -Wl,-no_pie
>
> perhaps the question should be "why is this particualr code being built 
> with non-default options?".
> x86 darwin code should default to fPIC - so someone must be passing - 
> mdynamic-no-pic or -fno-PIC etc.

Iain,
   I was only pointing out that we are in a unique situation when targeting
darwin11 and later since we are the only target currently creating PIE by
default. However, unlike linux, darwin doesn't require or even encourage the
use of -fpie/-fPIE to do this but silently relies on the default -fPIC. Also,
the warning cited above was from gcc.dg/20020312-2.c, which previously failed
its excessive error test since it passed -fno-pic, which triggered the warning.
Jack

>
> (perhaps derived from the bootstrap usage of the "-mdynamic-no-pic"  
> option - which suggests that this should be disabled for Darwin >= 11).
>
> Iain


Re: [PATCH] Make sibcall argument overlap check less pessimistic (PR middle-end/50074, take 2)

2011-12-04 Thread Eric Botcazou
> What about this way?  I've groupped the two variables into a structure
> to make it clear it is internal internal_arg_pointer_based_exp* state,
> scanning is done in a separate function and the SCAN argument is gone,
> instead the internal_arg_pointer_based_exp_scan function disables scanning
> during recursion by tweaking the internal state.

Thanks.  I think this isn't exactly equivalent to the previous version though, 
as the recursive call made through internal_arg_pointer_based_exp_1 will now 
scan as well, won't it?  OK, my fault, the argument was probably better then.
But I think that we can keep the structure and the 2 functions for clarity.

Adjusted patch attached, same ChangeLog.  Please double-check, make the changes 
you deem necessary and install.

-- 
Eric Botcazou
Index: calls.c
===
--- calls.c	(revision 181902)
+++ calls.c	(working copy)
@@ -1658,6 +1658,129 @@ rtx_for_function_call (tree fndecl, tree
   return funexp;
 }
 
+/* Internal state for internal_arg_pointer_based_exp and its helpers.  */
+static struct
+{
+  /* Last insn that has been scanned by internal_arg_pointer_based_exp_scan,
+ or NULL_RTX if none has been scanned yet.  */
+  rtx scan_start;
+  /* Vector indexed by REGNO - FIRST_PSEUDO_REGISTER, recording if a pseudo is
+ based on crtl->args.internal_arg_pointer.  The element is NULL_RTX if the
+ pseudo isn't based on it, a CONST_INT offset if the pseudo is based on it
+ with fixed offset, or PC if this is with variable or unknown offset.  */
+  VEC(rtx, heap) *cache;
+} internal_arg_pointer_exp_state;
+
+static rtx internal_arg_pointer_based_exp (rtx, bool);
+
+/* Helper function for internal_arg_pointer_based_exp.  Scan insns in
+   the tail call sequence, starting with first insn that hasn't been
+   scanned yet, and note for each pseudo on the LHS whether it is based
+   on crtl->args.internal_arg_pointer or not, and what offset from
+   that pointer it has.  */
+
+static void
+internal_arg_pointer_based_exp_scan (void)
+{
+  rtx insn, scan_start = internal_arg_pointer_exp_state.scan_start;
+
+  if (scan_start == NULL_RTX)
+insn = get_insns ();
+  else
+insn = NEXT_INSN (scan_start);
+
+  while (insn)
+{
+  rtx set = single_set (insn);
+  if (set && REG_P (SET_DEST (set)) && !HARD_REGISTER_P (SET_DEST (set)))
+	{
+	  rtx val = NULL_RTX;
+	  unsigned int idx = REGNO (SET_DEST (set)) - FIRST_PSEUDO_REGISTER;
+	  /* Punt on pseudos set multiple times.  */
+	  if (idx < VEC_length (rtx, internal_arg_pointer_exp_state.cache)
+	  && (VEC_index (rtx, internal_arg_pointer_exp_state.cache, idx)
+		  != NULL_RTX))
+	val = pc_rtx;
+	  else
+	val = internal_arg_pointer_based_exp (SET_SRC (set), false);
+	  if (val != NULL_RTX)
+	{
+	  VEC_safe_grow_cleared (rtx, heap,
+ internal_arg_pointer_exp_state.cache,
+ idx + 1);
+	  VEC_replace (rtx, internal_arg_pointer_exp_state.cache,
+			   idx, val);
+	}
+	}
+  if (NEXT_INSN (insn) == NULL_RTX)
+	scan_start = insn;
+  insn = NEXT_INSN (insn);
+}
+
+  internal_arg_pointer_exp_state.scan_start = scan_start;
+}
+
+/* Helper function for internal_arg_pointer_based_exp, called through
+   for_each_rtx.  Return 1 if *LOC is a register based on
+   crtl->args.internal_arg_pointer.  Return -1 if *LOC is not based on it
+   and the subexpressions need not be examined.  Otherwise return 0.  */
+
+static int
+internal_arg_pointer_based_exp_1 (rtx *loc, void *data ATTRIBUTE_UNUSED)
+{
+  if (REG_P (*loc) && internal_arg_pointer_based_exp (*loc, false) != NULL_RTX)
+return 1;
+  if (MEM_P (*loc))
+return -1;
+  return 0;
+}
+
+/* Compute whether RTL is based on crtl->args.internal_arg_pointer.  Return
+   NULL_RTX if RTL isn't based on it, a CONST_INT offset if RTL is based on
+   it with fixed offset, or PC if this is with variable or unknown offset.
+   TOPLEVEL is true if the function is invoked at the topmost level.  */
+
+static rtx
+internal_arg_pointer_based_exp (rtx rtl, bool toplevel)
+{
+  if (CONSTANT_P (rtl))
+return NULL_RTX;
+
+  if (rtl == crtl->args.internal_arg_pointer)
+return const0_rtx;
+
+  if (REG_P (rtl) && HARD_REGISTER_P (rtl))
+return NULL_RTX;
+
+  if (GET_CODE (rtl) == PLUS && CONST_INT_P (XEXP (rtl, 1)))
+{
+  rtx val = internal_arg_pointer_based_exp (XEXP (rtl, 0), toplevel);
+  if (val == NULL_RTX || val == pc_rtx)
+	return val;
+  return plus_constant (val, INTVAL (XEXP (rtl, 1)));
+}
+
+  /* When called at the topmost level, scan pseudo assignments in between the
+ last scanned instruction in the tail call sequence and the latest insn
+ in that sequence.  */
+  if (toplevel)
+internal_arg_pointer_based_exp_scan ();
+
+  if (REG_P (rtl))
+{
+  unsigned int idx = REGNO (rtl) - FIRST_PSEUDO_REGISTER;
+  if (idx < VEC_length (rtx, internal_arg_pointer_exp_state.cache))
+	return VEC_index (rtx, inte

Re: [libstdc++] doc/xml/manual/abi.xml -- fix references to GCC as well as GNU/Linux

2011-12-04 Thread Gerald Pfeifer
Hi Jonathan,

On Sat, 3 Dec 2011, Jonathan Wakely wrote:
> How's this?  I think I got all the versions and dates correct, but I
> must say I find keeping some of this info in the manual to be tedious
> and unnecessary.

I agree, there is (too) much detailed and extra contents there
which does not actually strike me as helpful.

> To deal with the tedious parts, I changed a few repetitive instances
> of 4.1.0, 4.1.1, 4.2.0, 4.2.1, 4.3.0 etc. etc. to just 4.x.x which
> will be accurate in future and can be changed if it needs to be,
> rather than having to keep adding new entries that say the headers for
> GCC 4.6.1 are in include/c++/4.6.1 and, guess what, the headers for
> GCC 4.6.2 are in include/c++/4.6.2
> 
> Would 4.*.* or 4.?.? be better than 4.x.x?

How about 4.x.y, to indicate that the second and third components
can be different?

> I'm not sure why we need to explicitly state the libgcc soname for
> every release when it's always the same.

Good point.  In fact, looking at your patch and the document, could
you just remove the third component in all cases (or nearly all)?  It
occurs to me that GCC 4.x, for fixed value of x, should be compatible,
if not identical in terms of characteristics, shouldn't it?

That would strike me as even more of a simplification.

And in those cases in your patch where it refers to GCC 3.3.x, for
example, it can just be GCC 3.3, so this also applies to regular
text, not just the tables.

> If no one objects to this approach I'll regenerate the HTML pages and
> check this in at some point in the next few days.
> 
> If anyone objects, please find a volunteer to keep the tedious version
> up to date ;-)

I think with my proposal it'll become less tedious? :-)

Gerald


Re: [PATCH] Make sibcall argument overlap check less pessimistic (PR middle-end/50074, take 2)

2011-12-04 Thread Jakub Jelinek
On Sun, Dec 04, 2011 at 09:53:42PM +0100, Eric Botcazou wrote:
> > What about this way?  I've grouped the two variables into a structure
> > to make it clear it is internal internal_arg_pointer_based_exp* state,
> > scanning is done in a separate function and the SCAN argument is gone,
> > instead the internal_arg_pointer_based_exp_scan function disables scanning
> > during recursion by tweaking the internal state.
> 
> Thanks.  I think this isn't exactly equivalent to the previous version though,
> as the recursive call made through internal_arg_pointer_based_exp_1 will now
> scan as well, won't it?  OK, my fault, the argument was probably better then.

I think it is.  Those called during internal_arg_pointer_based_exp_scan
will see scan_start equal to pc_rtx and won't scan, and for the calls after
it, while scan_start won't be pc_rtx, as it is after scan, it is either
NULL_RTX with no insns in the sequence, or some insn whose NEXT_INSN is
NULL, therefore it will attempt to scan, but won't scan a single insn.
But surely, if you prefer the explicit argument, I can test that version
too.
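
To illustrate the behaviour described above, here is a minimal sketch of an
incremental scanner.  It is not the code from the patch: struct insn,
struct scan_state and scan_new_insns are hypothetical stand-ins, and a
separate "scanning" flag replaces the pc_rtx sentinel stored in scan_start.
It shows that a nested call during the scan does nothing, and that a call
made after a complete scan enters the function but visits no insn.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a node of the insn chain; not GCC's rtx.  */
struct insn
{
  struct insn *next;
  int visited;   /* Number of times the scanner looked at this insn.  */
};

/* Hypothetical scanner state, loosely modelled on scan_start.  */
struct scan_state
{
  struct insn *scan_start;   /* Last insn scanned, or NULL if none yet.  */
  int scanning;              /* Nonzero while a scan is in progress.  */
};

/* Scan every insn after STATE->scan_start, starting from FIRST if nothing
   has been scanned yet.  Return the number of insns visited.  A nested
   call made while a scan is running returns 0 immediately.  */
static int
scan_new_insns (struct scan_state *state, struct insn *first)
{
  struct insn *insn;
  int count = 0;

  if (state->scanning)
    return 0;
  state->scanning = 1;

  insn = state->scan_start ? state->scan_start->next : first;
  while (insn)
    {
      insn->visited++;
      count++;
      /* Model a recursive request made while scanning: it is a no-op.  */
      count += scan_new_insns (state, first);
      state->scan_start = insn;
      insn = insn->next;
    }

  state->scanning = 0;
  assert (count >= 0);
  return count;
}
```

With three insns in the chain, the first call visits three of them and an
immediate second call visits none; only insns appended later are picked up
by a subsequent call.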

Jakub


Re: [RFC] Port libitm to powerpc

2011-12-04 Thread Iain Sandoe

Hi Richard,

On 4 Dec 2011, at 20:45, Richard Henderson wrote:


On 12/03/2011 09:20 AM, Iain Sandoe wrote:

version 2 is a modification of your original:

a)  -FRAME+BASE(r1) cannot be guaranteed to be vec-aligned in  
general (it isn't on m32 darwin)


... so I've taken the liberty of rounding the gtm_buffer object and  
then pointing r4 at  original_sp-rounded_size, which is what we  
want for the call to GTM_begin_transaction anyway.


I've kept this in the version below, but I cannot see how that can be,
since your version of BASE is 8*WS = 64, a multiple of 16.


BASE is 2* WS for sysv, 6 for AIX and is only 8 on Darwin by luck  
because we happen to have 2 params to the called routine...



b) I've added the CR etc. wrapped in  __MACH__ ifdefs.


Taken out of the ifdefs to be done everywhere.


where is "_CALL_DARWIN" supposed to come from? (it is not defined by  
the preprocessor AFAICT).


I can produce a patch to add it if that's an oversight in the Darwin  
port.


e) The real problem is finding a non-horrible way of dealing with  
the %r <=> r issue - and I've not done that so far...


Dropped the %r entirely and using bare numbers, which is what the compiler
emits by default.


not for the darwin version - it needs the 'r,v, and f' :-(
 (sorry, .. working on binutils but it's gonna take some more time)


 I kept the %[rfv] in the cfi directives though;
I assume that darwin simply doesn't have those and so it won't be an  
issue.


no - which is why I also need to cook up an eh_frame section by hand  
sometime... (likewise on Darwin x86)



Give this a go.


with edited in  'r,v,f'  _CALL_DARWIN defined on __MACH__

Works OK modulo  " bl name" needs the __USER_LABEL_PREFIX__

how about?

__ELF__
.macro  CALL name
bl \name
.endm
__MACH__
.macro  CALL name
bl _$0
.endmacro


cheers
Iain



Re: [PATCH] Make sibcall argument overlap check less pessimistic (PR middle-end/50074, take 2)

2011-12-04 Thread Eric Botcazou
> I think it is.  Those called during internal_arg_pointer_based_exp_scan
> will see scan_start equal to pc_rtx and won't scan, and for the calls after
> it, while scan_start won't be pc_rtx, as it is after scan, it is either
> NULL_RTX with no insns in the sequence, or some insn whose NEXT_INSN is
> NULL, therefore it will attempt to scan, but won't scan a single insn.

Right, internal_arg_pointer_based_exp_scan will be invoked for nothing.

> But surely, if you prefer the explicit argument, I can test that version
> too.

Yes, I think it is better in the end. :-)

-- 
Eric Botcazou


[C++ Patch] for c++/51319

2011-12-04 Thread Fabien Chêne
Hi,

The problem here seems to be that we don't perform the enumeration
constant resolving in finish_id_expression when the DECL is a
USING_DECL. Consequently, I think we shall strip the USING_DECL before
checking for a CONST_DECL.

Tested x86_64-unknown-linux-gnu without regressions. OK to commit ?

gcc/testsuite/ChangeLog

2011-12-02  Fabien Chêne  

PR c++/51319
* g++.dg/lookup/using50.C: New.
* g++.dg/lookup/using51.C: New.

gcc/cp/ChangeLog

2011-12-02  Fabien Chêne  

PR c++/51319
* semantics.c (finish_id_expression): Strip using declaration when
resolving enumeration constants to their underlying values.

-- 
Fabien


51319.patch
Description: Binary data


[patch committed SH] Add atomic patterns

2011-12-04 Thread Kaz Kojima
Hi,

The attached patch adds atomic patterns with software atomic
sequences.  They are enabled when a new option -msoft-atomic
is specified and the option is default for sh-linux.
Regtested on sh4-unknown-linux-gnu with no new failures and
the doc patch is tested with "make info dvi pdf".
Applied on trunk.
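
For readers who want to see what the new patterns serve at the source level,
here is a small sketch using GCC's __atomic builtins.  The operands of the
atomic_compare_and_swap expander in the patch (success flag, old value,
memory, expected value, new value, is_weak, success and failure models) line
up with the arguments of __atomic_compare_exchange_n.  The helper names
try_swap and fetch_add_cas are made up for illustration, and whether a given
call really goes through the software atomic sequence depends on the target
and options in effect; the sketch itself compiles on any target that
supports the builtins.

```c
#include <assert.h>
#include <stdbool.h>

/* Try to replace *COUNTER with NEWVAL if it still holds *EXPECTED.
   On failure *EXPECTED is updated with the value actually found.  */
static bool
try_swap (int *counter, int *expected, int newval)
{
  return __atomic_compare_exchange_n (counter, expected, newval,
                                      false /* strong variant */,
                                      __ATOMIC_SEQ_CST, __ATOMIC_SEQ_CST);
}

/* Atomically add DELTA to *COUNTER with a compare-and-swap loop and
   return the value seen before the addition.  */
static int
fetch_add_cas (int *counter, int delta)
{
  int old = __atomic_load_n (counter, __ATOMIC_SEQ_CST);

  while (!try_swap (counter, &old, old + delta))
    ;  /* OLD was refreshed by the failed exchange; retry.  */

  assert (old + delta == old + delta);  /* Placeholder sanity check.  */
  return old;
}
```

A failed try_swap leaves the memory untouched and reports the current value
through its second argument, which is what lets the loop in fetch_add_cas
retry without reloading.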

Regards,
kaz
--
2011-12-04  Kaz Kojima  

* config/sh/linux.h (TARGET_DEFAULT): Add MASK_SOFT_ATOMIC.
* config/sh/sync.md: New file.
* config/sh/sh.md: Include sync.md.
* config/sh/sh.opt (msoft-atomic): New option.
* doc/invoke.texi (SH Options): Document it.

diff -uprN ORIG/trunk/gcc/config/sh/linux.h trunk/gcc/config/sh/linux.h
--- ORIG/trunk/gcc/config/sh/linux.h2011-11-13 09:19:44.0 +0900
+++ trunk/gcc/config/sh/linux.h 2011-12-04 08:05:43.0 +0900
@@ -41,7 +41,7 @@ along with GCC; see the file COPYING3.  
 #undef TARGET_DEFAULT
 #define TARGET_DEFAULT \
   (TARGET_CPU_DEFAULT | MASK_USERMODE | TARGET_ENDIAN_DEFAULT \
-   | TARGET_OPT_DEFAULT)
+   | TARGET_OPT_DEFAULT | MASK_SOFT_ATOMIC)
 
 #define TARGET_ASM_FILE_END file_end_indicate_exec_stack
 
diff -uprN ORIG/trunk/gcc/config/sh/sh.md trunk/gcc/config/sh/sh.md
--- ORIG/trunk/gcc/config/sh/sh.md  2011-12-03 10:03:41.0 +0900
+++ trunk/gcc/config/sh/sh.md   2011-12-04 08:05:43.0 +0900
@@ -13667,3 +13667,5 @@ mov.l\\t1f,r0\\n\\
   "ld%M1.q\t%m1, %0\;ld%M2.q\t%m2, %3\;cmpeq\t%0, %3, %0\;movi\t0, %3"
   [(set_attr "type" "other")
(set_attr "length" "16")])
+
+(include "sync.md")
diff -uprN ORIG/trunk/gcc/config/sh/sh.opt trunk/gcc/config/sh/sh.opt
--- ORIG/trunk/gcc/config/sh/sh.opt 2010-10-23 09:51:12.0 +0900
+++ trunk/gcc/config/sh/sh.opt  2011-12-05 07:56:25.0 +0900
@@ -1,6 +1,6 @@
 ; Options for the SH port of the compiler.
 
-; Copyright (C) 2005, 2006, 2007, 2008, 2009, 2010
+; Copyright (C) 2005, 2006, 2007, 2008, 2009, 2010, 2011
 ; Free Software Foundation, Inc.
 ;
 ; This file is part of GCC.
@@ -319,6 +319,10 @@ mrenesas
 Target Mask(HITACHI) MaskExists
 Follow Renesas (formerly Hitachi) / SuperH calling conventions
 
+msoft-atomic
+Target Report Mask(SOFT_ATOMIC)
+Use software atomic sequences supported by kernel
+
 mspace
 Target RejectNegative Alias(Os)
 Deprecated.  Use -Os instead
diff -uprN ORIG/trunk/gcc/config/sh/sync.md trunk/gcc/config/sh/sync.md
--- ORIG/trunk/gcc/config/sh/sync.md1970-01-01 09:00:00.0 +0900
+++ trunk/gcc/config/sh/sync.md 2011-12-04 08:14:20.0 +0900
@@ -0,0 +1,312 @@
+;; GCC machine description for SH synchronization instructions.
+;; Copyright (C) 2011
+;; Free Software Foundation, Inc.
+;;
+;; This file is part of GCC.
+;;
+;; GCC is free software; you can redistribute it and/or modify
+;; it under the terms of the GNU General Public License as published by
+;; the Free Software Foundation; either version 3, or (at your option)
+;; any later version.
+;;
+;; GCC is distributed in the hope that it will be useful,
+;; but WITHOUT ANY WARRANTY; without even the implied warranty of
+;; MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+;; GNU General Public License for more details.
+;;
+;; You should have received a copy of the GNU General Public License
+;; along with GCC; see the file COPYING3.  If not see
+;; <http://www.gnu.org/licenses/>.
+
+(define_c_enum "unspec" [
+  UNSPEC_ATOMIC
+])
+ 
+(define_c_enum "unspecv" [
+  UNSPECV_CMPXCHG_1
+  UNSPECV_CMPXCHG_2
+  UNSPECV_CMPXCHG_3
+])
+
+(define_mode_iterator I124 [QI HI SI])
+
+(define_mode_attr i124suffix [(QI "b") (HI "w") (SI "l")])
+(define_mode_attr i124extend_insn [(QI "exts.b") (HI "exts.w") (SI "mov")])
+
+(define_code_iterator FETCHOP [plus minus ior xor and])
+(define_code_attr fetchop_name
+  [(plus "add") (minus "sub") (ior "ior") (xor "xor") (and "and")])
+(define_code_attr fetchop_insn
+  [(plus "add") (minus "sub") (ior "or") (xor "xor") (and "and")])
+
+;; Linux specific atomic patterns for the Renesas / SuperH SH CPUs.
+;; Linux kernel for SH3/4 has implemented the support for software
+;; atomic sequences.
+
+(define_expand "atomic_compare_and_swap"
+  [(match_operand:QI 0 "register_operand" "")  ;; bool success output
+   (match_operand:I124 1 "register_operand" "");; oldval output
+   (match_operand:I124 2 "memory_operand" "")  ;; memory
+   (match_operand:I124 3 "register_operand" "");; expected input
+   (match_operand:I124 4 "register_operand" "");; newval input
+   (match_operand:SI 5 "const_int_operand" "") ;; is_weak
+   (match_operand:SI 6 "const_int_operand" "") ;; success model
+   (match_operand:SI 7 "const_int_operand" "")];; failure model
+  "TARGET_SOFT_ATOMIC && !TARGET_SHMEDIA"
+{
+  rtx addr;
+
+  addr = force_reg (Pmode, XEXP (operands[2], 0));
+  emit_insn (gen_atomic_compare_and_swap_soft
+(gen_lowpart (SImode, operands[1]), addr, operands[3],
+ operands[4])

Re: [libstdc++] doc/xml/manual/abi.xml -- fix references to GCC as well as GNU/Linux

2011-12-04 Thread Jonathan Wakely
On 4 December 2011 21:08, Gerald Pfeifer wrote:
> Hi Jonathan,
>
> On Sat, 3 Dec 2011, Jonathan Wakely wrote:
>> How's this?  I think I got all the versions and dates correct, but I
>> must say I find keeping some of this info in the manual to be tedious
>> and unnecessary.
>
> I agree, there is (too) much detailed and extra contents there
> which does not actually strike me as helpful.
>
>> To deal with the tedious parts, I changed a few repetitive instances
>> of 4.1.0, 4.1.1, 4.2.0, 4.2.1, 4.3.0 etc. etc. to just 4.x.x which
>> will be accurate in future and can be changed if it needs to be,
>> rather than having to keep adding new entries that say the headers for
>> GCC 4.6.1 are in include/c++/4.6.1 and, guess what, the headers for
>> GCC 4.6.2 are in include/c++/4.6.2
>>
>> Would 4.*.* or 4.?.? be better than 4.x.x?
>
> How about 4.x.y, to indicate that the second and third components
> can be different?
>
>> I'm not sure why we need to explicitly state the libgcc soname for
>> every release when it's always the same.
>
> Good point.  In fact, looking at your patch and the document, could
> you just remove the third component in all cases (or nearly all)?  It
> occurs to me that GCC 4.x, for fixed value of x, should be compatible,
> if not identical in terms of characteristics, shouldn't it?
>
> That would strike me as even more of a simplification.
>
> And in those cases in your patch where it refers to GCC 3.3.x, for
> example, it can just be GCC 3.3, so this also applies to regular
> text, not just the tables.
>
>> If no one objects to this approach I'll regenerate the HTML pages and
>> check this in at some point in the next few days.
>>
>> If anyone objects, please find a volunteer to keep the tedious version
>> up to date ;-)
>
> I think with my proposal it'll become less tedious? :-)

Yep, here's another patch with some more duplication removed.  With
this, the document only needs to be updated when a new symbol version
is added or a library filename changes, not for every point release
with identical library versions.  I think I'm quite happy with this
and will commit in a couple of days if no one objects.
Index: doc/xml/manual/abi.xml
===
--- doc/xml/manual/abi.xml  (revision 181993)
+++ doc/xml/manual/abi.xml  (working copy)
@@ -164,28 +164,14 @@ compatible.
 
 
 
-gcc-3.0.0: libgcc_s.so.1
-gcc-3.0.1: libgcc_s.so.1
-gcc-3.0.2: libgcc_s.so.1
-gcc-3.0.3: libgcc_s.so.1
-gcc-3.0.4: libgcc_s.so.1
-gcc-3.1.0: libgcc_s.so.1
-gcc-3.1.1: libgcc_s.so.1
-gcc-3.2.0: libgcc_s.so.1
-gcc-3.2.1: libgcc_s.so.1
-gcc-3.2.2: libgcc_s.so.1
-gcc-3.2.3: libgcc_s.so.1
-gcc-3.3.0: libgcc_s.so.1
-gcc-3.3.1: libgcc_s.so.1
-gcc-3.3.2: libgcc_s.so.1
-gcc-3.3.3: libgcc_s.so.1
-gcc-3.4.x, gcc-4.[0-5].x: libgcc_s.so.1
+GCC 3.x: libgcc_s.so.1
+GCC 4.x: libgcc_s.so.1
 
 
 For m68k-linux the versions differ as follows: 
 
 
-gcc-3.4.x, gcc-4.[0-5].x: libgcc_s.so.1
+GCC 3.4, GCC 4.x: libgcc_s.so.1
 when configuring --with-sjlj-exceptions, or
 libgcc_s.so.2  
 
@@ -193,10 +179,10 @@ compatible.
 For hppa-linux the versions differ as follows: 
 
 
-gcc-3.4.x, gcc-4.[0-1].x: either libgcc_s.so.1
+GCC 3.4, GCC 4.[0-1]: either libgcc_s.so.1
 when configuring --with-sjlj-exceptions, or
 libgcc_s.so.2  
-gcc-4.[2-5].x: either libgcc_s.so.3 when configuring
+GCC 4.[2-7]: either libgcc_s.so.3 when configuring
 --with-sjlj-exceptions) or libgcc_s.so.4
  
 
@@ -213,19 +199,22 @@ compatible.
 
 This corresponds to the mapfile: gcc/libgcc-std.ver
 
-gcc-3.0.0: GCC_3.0
-gcc-3.3.0: GCC_3.3
-gcc-3.3.1: GCC_3.3.1
-gcc-3.3.2: GCC_3.3.2
-gcc-3.3.4: GCC_3.3.4
-gcc-3.4.0: GCC_3.4
-gcc-3.4.2: GCC_3.4.2
-gcc-3.4.4: GCC_3.4.4
-gcc-4.0.0: GCC_4.0.0
-gcc-4.1.0: GCC_4.1.0
-gcc-4.2.0: GCC_4.2.0
-gcc-4.3.0: GCC_4.3.0
-gcc-4.4.0: GCC_4.4.0
+GCC 3.0.0: GCC_3.0
+GCC 3.3.0: GCC_3.3
+GCC 3.3.1: GCC_3.3.1
+GCC 3.3.2: GCC_3.3.2
+GCC 3.3.4: GCC_3.3.4
+GCC 3.4.0: GCC_3.4
+GCC 3.4.2: GCC_3.4.2
+GCC 3.4.4: GCC_3.4.4
+GCC 4.0.0: GCC_4.0.0
+GCC 4.1.0: GCC_4.1.0
+GCC 4.2.0: GCC_4.2.0
+GCC 4.3.0: GCC_4.3.0
+GCC 4.4.0: GCC_4.4.0
+GCC 4.5.0: GCC_4.5.0
+GCC 4.6.0: GCC_4.6.0
+GCC 4.7.0: GCC_4.7.0
 
 
 
@@ -241,54 +230,47 @@ compatible.
DT_SONAMEs are forward-compatibile: in
the table below, releases incompatible with the previous
one are explicitly noted.
+   If a particular release is not listed, its libstdc++.so binary
+   has the same filename and DT_SONAME as the
+   preceding release.
   
 
 It is versioned as follows:
 
 
-gcc-3.0.0: libstdc++.so.3.0.0
-gcc-3.0.1: libstdc++.so.3.0.1
-gcc-3.0.2: libstdc++.so.3.0.2
-gcc-3.0.3: libstdc++.so.3.0.2 (See 

Re: [C++ Patch] for c++/51319

2011-12-04 Thread Jason Merrill

Is there a reason not to just do

  decl = strip_using_decl (decl);

early in finish_id_expression?

Jason


Re: [testsuite] Adding missing dg-require-profiling directives

2011-12-04 Thread Chung-Lin Tang
On 2011/12/5 12:39 AM, Mike Stump wrote:
> On Dec 4, 2011, at 3:29 AM, Richard Sandiford wrote:
>> The problem is that MIPS has
>> native TLS support, but the ABI has not "yet" been extended to MIPS16.
>> MIPS16 is supposed to be link-compatible with non-MIPS16, so we can't
>> use emultls, and must simply say sorry().
>>
>> This patch adds dg-require-profiling to the affected tests.  The reason
>> I haven't just applied it as obvious is that dg-require-profiling really
>> seems to be a test for link-time and runtime support.  There are presumably
>> targets that can't link profiling code but that are nevertheless happily
>> compiling the tests below.  So do we want to split the directive into two?
>> I ask the question while hoping the answer is "no". :-)
> 
> Hum...  I'd rather TLS support be defined and added for MIPS16...  I think we 
> have enough targets with profiling and TLS that coverage won't be lost with 
> your change.  I like simple.  If someone feels strongly about splitting, I'll 
> pre-approve their change.  I think your patch is fine.  Ok.

We already have a MIPS16 TLS implementation internally, I'll get it
ready to post here soon, though I'm afraid it's a next-stage1 kind of
modification (unless Richard has the rights to approve it at this stage?).

Thanks,
Chung-Lin


Re: [Patch] Increase array sizes in vect-tests to enable 256-bit vectorization

2011-12-04 Thread Michael Zolotukhin
Ok, will several tests with short arrays be enough for that or should
we keep all the original tests plus new ones with longer arrays?

Michael

On 4 December 2011 15:44, Richard Guenther  wrote:
> On Sat, Dec 3, 2011 at 5:54 PM, Michael Zolotukhin
>  wrote:
>>> I mean, that, when 256-bit vectorization is enabled we still use 128bit
>>> vectorization if the arrays are too short for 256bit vectorization.  You'll
>>> lose this test coverage when you change the array sizes.
>> That's true, but do we need all these test both with short and long
>> arrays? We could have the tests with increased sizes and compile them
>> with/without use of avx, thus testing both 256- and 128- bit
>> vectorization. Additionally, we might want to add several tests with
>> short arrays to check what happens if 256-bit is available, but arrays
>> are too short for it. I mean we don't need to duplicate all of the
>> tests to check this situation.
>
> Well, initially those tests served as a way to prove that dual-size
> vectorization works.  You should not remove this testing functionality.
>
> Richard.
>
>> On 3 December 2011 18:31, Richard Guenther wrote:
>>> On Fri, Dec 2, 2011 at 6:39 PM, Michael Zolotukhin
>>>  wrote:
>
> Shouldn't we add a variant for each testcase so that we still
> exercise both 128-bit and 256-bit vectorization paths?

 These tests are still good to test 128-bit vectorization, the changes
 was made just to make sure that 256-bit vectorization is possible on
 the tests.

 Actually, It's just first step in enabling these tests for 256 bits -
 for now many of them are failing if '-mavx' or '-mavx2' is specified
 (mostly due to different diagnostics messages produced by vectorizer),
 but with original (small) sizes of arrays we couldn't even check that.
 When they are enabled, it'll be possible to use them for testing both
 128- and 256- bit vectorization.
>>>
>>> I mean, that, when 256-bit vectorization is enabled we still use 128bit
>>> vectorization if the arrays are too short for 256bit vectorization.  You'll
>>> lose this test coverage when you change the array sizes.
>>>
>>> Richard.
>>>
 Michael


 2011/12/2 Richard Guenther :
> 2011/12/2 Michael Zolotukhin :
>> Hi,
>>
>> This patch increases array sizes in tests from vect.exp suite, thus
>> enabling 256-bit vectorization where it's available.
>>
>> Ok for trunk?
>
> Shouldn't we add a variant for each testcase so that we still
> exercise both 128-bit and 256-bit vectorization paths?
>
>> Changelog:
>> 2011-12-02  Michael Zolotukhin  
>>
>>        * gcc.dg/vect/slp-13.c: Increase array size, add initialization.
>>        * gcc.dg/vect/slp-24.c: Ditto.
>>        * gcc.dg/vect/slp-3.c: Likewise and fix scans.
>>        * gcc.dg/vect/slp-34.c: Ditto.
>>        * gcc.dg/vect/slp-4.c: Ditto.
>>        * gcc.dg/vect/slp-cond-2.c: Ditto.
>>        * gcc.dg/vect/slp-multitypes-11.c: Ditto.
>>        * gcc.dg/vect/vect-1.c: Ditto.
>>        * gcc.dg/vect/vect-10.c: Ditto.
>>        * gcc.dg/vect/vect-105.c: Ditto.
>>        * gcc.dg/vect/vect-112.c: Ditto.
>>        * gcc.dg/vect/vect-15.c: Ditto.
>>        * gcc.dg/vect/vect-2.c: Ditto.
>>        * gcc.dg/vect/vect-31.c: Ditto.
>>        * gcc.dg/vect/vect-32.c: Ditto.
>>        * gcc.dg/vect/vect-33.c: Ditto.
>>        * gcc.dg/vect/vect-34.c: Ditto.
>>        * gcc.dg/vect/vect-35.c: Ditto.
>>        * gcc.dg/vect/vect-36.c: Ditto.
>>        * gcc.dg/vect/vect-6.c: Ditto.
>>        * gcc.dg/vect/vect-73.c: Ditto.
>>        * gcc.dg/vect/vect-74.c: Ditto.
>>        * gcc.dg/vect/vect-75.c: Ditto.
>>        * gcc.dg/vect/vect-76.c: Ditto.
>>        * gcc.dg/vect/vect-80.c: Ditto.
>>        * gcc.dg/vect/vect-85.c: Ditto.
>>        * gcc.dg/vect/vect-89.c: Ditto.
>>        * gcc.dg/vect/vect-97.c: Ditto.
>>        * gcc.dg/vect/vect-98.c: Ditto.
>>        * gcc.dg/vect/vect-all.c: Ditto.
>>        * gcc.dg/vect/vect-double-reduc-6.c: Ditto.
>>        * gcc.dg/vect/vect-iv-8.c: Ditto.
>>        * gcc.dg/vect/vect-iv-8a.c: Ditto.
>>        * gcc.dg/vect/vect-outer-1.c: Ditto.
>>        * gcc.dg/vect/vect-outer-1a.c: Ditto.
>>        * gcc.dg/vect/vect-outer-1b.c: Ditto.
>>        * gcc.dg/vect/vect-outer-2.c: Ditto.
>>        * gcc.dg/vect/vect-outer-2a.c: Ditto.
>>        * gcc.dg/vect/vect-outer-2c.c: Ditto.
>>        * gcc.dg/vect/vect-outer-3.c: Ditto.
>>        * gcc.dg/vect/vect-outer-3a.c: Ditto.
>>        * gcc.dg/vect/vect-outer-4a.c: Ditto.
>>        * gcc.dg/vect/vect-outer-4b.c: Ditto.
>>        * gcc.dg/vect/vect-outer-4c.c: Ditto.
>>        * gcc.dg/vect/vect-outer-4d.c: Ditto.
>>        * gcc.dg/vect/vect-outer-4m.c: Ditto.
>>        * gcc.dg/vect/vect-outer-fir-lb.c: Ditto.
>>  

Re: [Patch, i386] Limit unroll factor for certain loops on Corei7

2011-12-04 Thread Teresa Johnson
On Fri, Dec 2, 2011 at 11:59 AM, Xinliang David Li  wrote:
> ;
>>
>> +/* Determine whether LOOP contains floating-point computation. */
>> +bool
>> +loop_has_FP_comp(struct loop *loop)
>> +{
>> +  rtx set, dest;
>
> This probably should be extended to detect other long latency
> operations in the future.
>
>
>> +
>> +  if (ix86_tune != PROCESSOR_COREI7_64 &&
>> +      ix86_tune != PROCESSOR_COREI7_32)
>> +    return nunroll;
>
> Is it better to generalize it and model the LSD and LSD size in the
> target model description? -- probably a different patch for that.

Yes, I thought it made sense to keep the check here for now, but it
could be generalized that way to handle the limits in different
implementations.

>
>
>> +
>> +  /* Look for instructions that store a constant into HImode (16-bit)
>> +     memory. These require a length-changing prefix and on corei7 are
>> +     prone to LCP stalls. These stalls can be avoided if the loop
>> +     is streamed from the loop stream detector. */
>> +  body = get_loop_body (loop);
>> +  for (i = 0; i < loop->num_nodes && !found; i++)
>> +    {
>> +      bb = body[i];
>> +
>> +      FOR_BB_INSNS (bb, insn)
>> +        {
>> +          rtx set_expr;
>> +          set_expr = single_set (insn);
>> +          if (set_expr != NULL_RTX
>> +              && GET_MODE (SET_DEST (set_expr)) == HImode
>> +              && CONST_INT_P (SET_SRC (set_expr))
>> +              && MEM_P (SET_DEST (set_expr)))
>> +            {
>> +              found = true;
>> +              break;
>> +            }
>> +        }
>> +    }
>> +  free (body);
>
>
> Probably generalize this to handle other long latency FE stalls -- for
> now it only handles LCP stalls.
>
>> +
>> +  if (!found)
>> +    return nunroll;
>> +
>> +  /* Don't reduce unroll factor in loops with floating point
>> +     computation, which tend to benefit more heavily from
>> +     larger unroll factors and are less likely to bottleneck
>> +     at the decoder. */
>> +  has_FP = loop_has_FP_comp(loop);
>> +  if (has_FP)
>> +    return nunroll;
>> +
>> +  if (dump_file)
>> +    {
>> +      fprintf (dump_file,
>> +               ";; Loop contains HImode store of const (possible LCP stalls),\n");
>> +      fprintf (dump_file,
>> +               "   reduce unroll factor to fit into Loop Stream Detector\n");
>> +    }
>> +
>> +  /* On corei7 the loop stream detector can hold about 28 instructions, so
>> +     don't allow unrolling to exceed that. */
>> +  newunroll = 28 / loop->av_ninsns;
>
> Is 28 number of instructions or number of uOps?

It is actually 28 uops, I have updated the comments in the latest
patch, which I am sending out in a follow-on email.
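
The clamping step under discussion can be sketched on its own as follows.
This is only an illustration, not the hook from the patch:
clamp_unroll_to_lsd is a hypothetical name, the capacity is passed in as a
parameter rather than hard-coded, and the guard against an empty loop body
is an addition of the sketch.  As in the patch, the average insn count is
used as an approximation of the uop count.

```c
#include <assert.h>

/* Return the unroll factor to use for a loop whose body averages
   AV_NINSNS insns, given the requested factor NUNROLL and a loop stream
   detector that holds LSD_UOPS entries (28 in the thread above).  The
   result never exceeds NUNROLL.  */
static unsigned
clamp_unroll_to_lsd (unsigned nunroll, unsigned av_ninsns, unsigned lsd_uops)
{
  unsigned newunroll;

  /* Guard added for the sketch: avoid dividing by zero.  */
  if (av_ninsns == 0)
    return nunroll;

  /* Largest number of body copies that still fits in the detector.  */
  newunroll = lsd_uops / av_ninsns;

  assert (lsd_uops >= newunroll * av_ninsns);
  return newunroll < nunroll ? newunroll : nunroll;
}
```

For a body of 7 insns and a requested factor of 8, the result is 4.  A body
larger than the detector yields 0, which the caller's existing
"nunroll <= 1" check then treats as "do not unroll".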

Thanks,
Teresa

>
> thanks,
>
> David
>
>> +  if (newunroll < nunroll)
>> +    return newunroll;
>> +
>> +  return nunroll;
>> +}
>> +
>>  /* Initialize the GCC target structure.  */
>>  #undef TARGET_RETURN_IN_MEMORY
>>  #define TARGET_RETURN_IN_MEMORY ix86_return_in_memory
>> @@ -38685,6 +38755,9 @@ ix86_autovectorize_vector_sizes (void)
>>  #define TARGET_INIT_LIBFUNCS darwin_rename_builtins
>>  #endif
>>
>> +#undef TARGET_LOOP_UNROLL_ADJUST
>> +#define TARGET_LOOP_UNROLL_ADJUST ix86_loop_unroll_adjust
>> +
>>  struct gcc_target targetm = TARGET_INITIALIZER;
>>  ^L
>>  #include "gt-i386.h"
>>
>> --
>> Teresa Johnson | Software Engineer | tejohn...@google.com | 408-460-2413



-- 
Teresa Johnson | Software Engineer | tejohn...@google.com | 408-460-2413


Re: [Patch, i386] Limit unroll factor for certain loops on Corei7

2011-12-04 Thread Teresa Johnson
Latest patch which improves the efficiency as described below is
included here. Bootstrapped and checked again with
x86_64-unknown-linux-gnu. Could someone review?

Thanks,
Teresa

2011-12-04  Teresa Johnson  

* loop-unroll.c (decide_unroll_constant_iterations): Call loop
unroll target hook.
* config/i386/i386.c (ix86_loop_unroll_adjust): New function.
(TARGET_LOOP_UNROLL_ADJUST): Define hook for x86.

===
--- loop-unroll.c   (revision 181902)
+++ loop-unroll.c   (working copy)
@@ -547,6 +547,9 @@ decide_unroll_constant_iterations (struc
   if (nunroll > (unsigned) PARAM_VALUE (PARAM_MAX_UNROLL_TIMES))
 nunroll = PARAM_VALUE (PARAM_MAX_UNROLL_TIMES);

+  if (targetm.loop_unroll_adjust)
+nunroll = targetm.loop_unroll_adjust (nunroll, loop);
+
   /* Skip big loops.  */
   if (nunroll <= 1)
 {
Index: config/i386/i386.c
===
--- config/i386/i386.c  (revision 181902)
+++ config/i386/i386.c  (working copy)
@@ -60,6 +60,7 @@ along with GCC; see the file COPYING3.
 #include "fibheap.h"
 #include "opts.h"
 #include "diagnostic.h"
+#include "cfgloop.h"

 enum upper_128bits_state
 {
@@ -38370,6 +38371,82 @@ ix86_autovectorize_vector_sizes (void)
   return (TARGET_AVX && !TARGET_PREFER_AVX128) ? 32 | 16 : 0;
 }

+/* If LOOP contains a possible LCP stalling instruction on corei7,
+   calculate new number of times to unroll instead of NUNROLL so that
+   the unrolled loop will still likely fit into the loop stream detector. */
+static unsigned
+ix86_loop_unroll_adjust (unsigned nunroll, struct loop *loop)
+{
+  basic_block *body, bb;
+  unsigned i;
+  rtx insn;
+  bool found = false;
+  unsigned newunroll;
+
+  if (ix86_tune != PROCESSOR_COREI7_64 &&
+  ix86_tune != PROCESSOR_COREI7_32)
+return nunroll;
+
+  /* Look for instructions that store a constant into HImode (16-bit)
+ memory. These require a length-changing prefix and on corei7 are
+ prone to LCP stalls. These stalls can be avoided if the loop
+ is streamed from the loop stream detector. */
+  body = get_loop_body (loop);
+  for (i = 0; i < loop->num_nodes; i++)
+{
+  bb = body[i];
+
+  FOR_BB_INSNS (bb, insn)
+{
+  rtx set_expr, dest;
+  set_expr = single_set (insn);
+  if (!set_expr)
+continue;
+
+  dest = SET_DEST (set_expr);
+
+  /* Don't reduce unroll factor in loops with floating point
+ computation, which tend to benefit more heavily from
+ larger unroll factors and are less likely to bottleneck
+ at the decoder. */
+  if (FLOAT_MODE_P (GET_MODE (dest)))
+  {
+free (body);
+return nunroll;
+  }
+
+  if (!found
+  && GET_MODE (dest) == HImode
+  && CONST_INT_P (SET_SRC (set_expr))
+  && MEM_P (dest))
+{
+  found = true;
+  /* Keep walking loop body to look for FP computations above. */
+}
+}
+}
+  free (body);
+
+  if (!found)
+return nunroll;
+
+  if (dump_file)
+{
+  fprintf (dump_file,
+   ";; Loop contains HImode store of const (possible LCP stalls),\n");
+  fprintf (dump_file,
+   "   reduce unroll factor to fit into Loop Stream Detector\n");
+}
+
+  /* On corei7 the loop stream detector can hold 28 uops, so
+ don't allow unrolling to exceed that many instructions. */
+  newunroll = 28 / loop->av_ninsns;
+  if (newunroll < nunroll)
+return newunroll;
+
+  return nunroll;
+}
+
 /* Initialize the GCC target structure.  */
 #undef TARGET_RETURN_IN_MEMORY
 #define TARGET_RETURN_IN_MEMORY ix86_return_in_memory
@@ -38685,6 +38762,9 @@ ix86_autovectorize_vector_sizes (void)
 #define TARGET_INIT_LIBFUNCS darwin_rename_builtins
 #endif

+#undef TARGET_LOOP_UNROLL_ADJUST
+#define TARGET_LOOP_UNROLL_ADJUST ix86_loop_unroll_adjust
+
 struct gcc_target targetm = TARGET_INITIALIZER;


 #include "gt-i386.h"
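
For readers who want to see the decision logic apart from the RTL
walk, here is a minimal standalone sketch of what the hook above
computes. It is not part of the patch: struct toy_insn,
toy_unroll_adjust and LSD_UOPS are hypothetical names invented for
illustration, and the av_ninsns == 0 guard is an assumption added for
the sketch (the patch divides unconditionally).

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the properties the patch reads from each
   single_set insn; not a GCC data structure.  */
struct toy_insn
{
  bool dest_is_float;      /* models FLOAT_MODE_P (GET_MODE (dest)) */
  bool dest_is_himode_mem; /* models HImode destination that is a MEM */
  bool src_is_const_int;   /* models CONST_INT_P (SET_SRC (set_expr)) */
};

/* Loop stream detector capacity in uops, taken from the comment in
   the patch.  */
#define LSD_UOPS 28u

/* Return the unroll factor to use for a loop whose body is INSNS.
   AV_NINSNS models loop->av_ninsns.  */
static unsigned
toy_unroll_adjust (unsigned nunroll, const struct toy_insn *insns,
                   size_t count, unsigned av_ninsns)
{
  bool found = false;
  unsigned newunroll;
  size_t i;

  assert (insns != NULL || count == 0);

  for (i = 0; i < count; i++)
    {
      /* Any floating point computation leaves the factor unchanged.  */
      if (insns[i].dest_is_float)
        return nunroll;

      /* Remember a constant store to 16-bit memory, but keep walking
         so that a later FP insn can still veto the reduction.  */
      if (insns[i].dest_is_himode_mem && insns[i].src_is_const_int)
        found = true;
    }

  if (!found)
    return nunroll;

  /* Guard added for this sketch only (assumption).  */
  if (av_ninsns == 0)
    return nunroll;

  /* Only ever lower the factor, never raise it.  */
  newunroll = LSD_UOPS / av_ninsns;
  return newunroll < nunroll ? newunroll : nunroll;
}
```

With a requested factor of 8 and an average of 7 insns per iteration
the sketch returns 4. A body averaging more than 28 insns yields 0,
which the existing nunroll <= 1 test in
decide_unroll_constant_iterations then treats as "do not unroll".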


On Fri, Dec 2, 2011 at 12:11 PM, Teresa Johnson  wrote:
> On Fri, Dec 2, 2011 at 11:36 AM, Andi Kleen  wrote:
>> Teresa Johnson  writes:
>>
>> Interesting optimization. I would be concerned a little bit
>> about compile time, does it make a measurable difference?
>
> I haven't measured compile time explicitly, but I don't think it should,
> especially after I address your efficiency suggestion (see below),
> since it will just have one pass over the instructions in innermost
> loops.
>
>>
>>> The attached patch detects loops containing instructions that tend to
>>> incur high LCP (length changing prefix) stalls on Core i7, and limits
>>> their unroll factor to try to keep the unrolled loop body small enough
>>> to fit in the Corei7's loop stream detector which can hide LCP stalls
>>> in loops.
>>
>> One more optimization would be to optimize padding for this case,
>>

Re: [C++ Patch] for c++/51319

2011-12-04 Thread Fabien Chêne
2011/12/5 Jason Merrill :
> Is there a reason not to just do
>
>  decl = strip_using_decl (decl);
>
> early in finish_id_expression?

Not really, I've already tried it and it works. I wasn't sure it was
correct not to return a USING_DECL in all cases -- they are numerous
in this huge function. If you think it is more correct, I'm all for
it.

-- 
Fabien