Re: [patch tree-optimization]: Move tree-vrp to use binary instead of truth-expressions

2011-07-26 Thread Kai Tietz
I adjusted the logic in the patch for the integer zero/all-ones case
for bitwise and/or.  By simply copying the variable operand to the
destination, without checking for valid ranges, for an and-expression
with an all-ones operand and an or-expression with a zero operand, the
logic could be simplified considerably.
I adjusted the variable names and removed an unnecessary helper
variable for ranges.
I hadn't noticed that the range-check function doesn't return true in
all cases for a partially ranged variable.  Thanks for the heads-up.
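
The identity cases described above can be illustrated with a minimal
sketch.  It is not the code from the patch: struct range, enum bit_code
and copy_range_for_identity are hypothetical stand-ins for the VRP data
structures, and only singleton constant operands are modelled.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for a value range; not GCC's value_range_t.  */
struct range { int64_t min, max; };

enum bit_code { BIT_AND, BIT_IOR };

/* If CONST_OP is the identity element for CODE (all-ones for AND,
   zero for IOR), the result is just the variable operand, so its
   range VAR is copied to *OUT unchanged and true is returned.
   Otherwise *OUT is left alone and false is returned.  */
static bool
copy_range_for_identity (enum bit_code code, int64_t const_op,
                         struct range var, struct range *out)
{
  bool identity = (code == BIT_AND && const_op == -1)
                  || (code == BIT_IOR && const_op == 0);
  if (identity)
    *out = var;
  return identity;
}
```

Because x & -1 == x and x | 0 == x for every x, no validity check on
the variable operand's range is needed in these two cases.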

Regression tested for all languages and bootstrapped on host
x86_64-pc-linux-gnu.  OK to apply?

Regards,
Kai





Index: gcc-head/gcc/tree-vrp.c
===
--- gcc-head.orig/gcc/tree-vrp.c
+++ gcc-head/gcc/tree-vrp.c
@@ -2187,9 +2187,7 @@ extract_range_from_binary_expr (value_ra
   && code != MIN_EXPR
   && code != MAX_EXPR
   && code != BIT_AND_EXPR
-  && code != BIT_IOR_EXPR
-  && code != TRUTH_AND_EXPR
-  && code != TRUTH_OR_EXPR)
+  && code != BIT_IOR_EXPR)
 {
   /* We can still do constant propagation here.  */
   tree const_op0 = op_with_constant_singleton_value_range (op0);
@@ -2244,8 +2242,7 @@ extract_range_from_binary_expr (value_ra
  divisions.  TODO, we may be able to derive anti-ranges in
  some cases.  */
   if (code != BIT_AND_EXPR
-  && code != TRUTH_AND_EXPR
-  && code != TRUTH_OR_EXPR
+  && code != BIT_IOR_EXPR
   && code != TRUNC_DIV_EXPR
   && code != FLOOR_DIV_EXPR
   && code != CEIL_DIV_EXPR
@@ -2267,7 +2264,12 @@ extract_range_from_binary_expr (value_ra
   || POINTER_TYPE_P (TREE_TYPE (op0))
   || POINTER_TYPE_P (TREE_TYPE (op1)))
 {
-  if (code == MIN_EXPR || code == MAX_EXPR)
+  if (code == BIT_IOR_EXPR)
+{
+ set_value_range_to_varying (vr);
+ return;
+   }
+  else if (code == MIN_EXPR || code == MAX_EXPR)
{
  /* For MIN/MAX expressions with pointers, we only care about
 nullness, if both are non null, then the result is nonnull.
@@ -2312,57 +2314,9 @@ extract_range_from_binary_expr (value_ra

   /* For integer ranges, apply the operation to each end of the
  range and see what we end up with.  */
-  if (code == TRUTH_AND_EXPR
-  || code == TRUTH_OR_EXPR)
-{
-  /* If one of the operands is zero, we know that the whole
-expression evaluates zero.  */
-  if (code == TRUTH_AND_EXPR
- && ((vr0.type == VR_RANGE
-  && integer_zerop (vr0.min)
-  && integer_zerop (vr0.max))
- || (vr1.type == VR_RANGE
- && integer_zerop (vr1.min)
- && integer_zerop (vr1.max
-   {
- type = VR_RANGE;
- min = max = build_int_cst (expr_type, 0);
-   }
-  /* If one of the operands is one, we know that the whole
-expression evaluates one.  */
-  else if (code == TRUTH_OR_EXPR
-  && ((vr0.type == VR_RANGE
-   && integer_onep (vr0.min)
-   && integer_onep (vr0.max))
-  || (vr1.type == VR_RANGE
-  && integer_onep (vr1.min)
-  && integer_onep (vr1.max
-   {
- type = VR_RANGE;
- min = max = build_int_cst (expr_type, 1);
-   }
-  else if (vr0.type != VR_VARYING
-  && vr1.type != VR_VARYING
-  && vr0.type == vr1.type
-  && !symbolic_range_p (&vr0)
-  && !overflow_infinity_range_p (&vr0)
-  && !symbolic_range_p (&vr1)
-  && !overflow_infinity_range_p (&vr1))
-   {
- /* Boolean expressions cannot be folded with int_const_binop.  */
- min = fold_binary (code, expr_type, vr0.min, vr1.min);
- max = fold_binary (code, expr_type, vr0.max, vr1.max);
-   }
-  else
-   {
- /* The result of a TRUTH_*_EXPR is always true or false.  */
- set_value_range_to_truthvalue (vr, expr_type);
- return;
-   }
-}
-  else if (code == PLUS_EXPR
-  || code == MIN_EXPR
-  || code == MAX_EXPR)
+  if (code == PLUS_EXPR
+  || code == MIN_EXPR
+  || code == MAX_EXPR)
 {
   /* If we have a PLUS_EXPR with two VR_ANTI_RANGEs, drop to
 VR_VARYING.  It would take more effort to compute a precise
@@ -2694,6 +2648,8 @@ extract_range_from_binary_expr (value_ra
   bool int_cst_range0, int_cst_range1;
   double_int may_be_nonzero0, may_be_nonzero1;
   double_int must_be_nonzero0, must_be_nonzero1;
+  value_range_t *non_singleton_vr;
+  tree singleton_val;

   vr0_int_cst_singleton_p = range_int_cst_singleton_p (&vr0);
   vr1_int_cst_singleton_p = range_int_cst_singleton_p (&vr1);
@@ -2702,9 +2658,39 @@ extract_range_from_binary_expr (value_ra
   int_cst_range1 = zero_nonzero_bits_from_vr (&vr1, &may_be_nonzero1,
  &must_be_nonzero1);

+

[RS6000] asynch exceptions and unwind info

2011-07-26 Thread Alan Modra
Hi David,
  I've been looking into what we need to do to support unwinding from
async signal handlers.  I've implemented unwind info generation for
.glink in the linker, but to keep the ppc64 .glink unwind info simple
I've assumed that frob_update_context is still used.

We still have some difficulties related to r2 tracking on
ppc64. frob_update_context doesn't quite do the right thing for
async unwinding.  A typical (no-r11) plt call stub looks like

 addis r12,2,off@ha
 std 2,40(1)
 ld 11,off@l(12)
 mtctr 11
 ld 2,off+8@l(12)
 bctr

or, when the offset from r2 to the function descriptor is small

 std 2,40(1)
 ld 11,off(2)
 mtctr 11
 ld 2,off+8(2)
 bctr

Now if we're stopped before the save of r2 we obviously don't want the
unwinder to restore r2 from 40(1), but that's exactly what the current
unwinder does.

Also, there is a one insn window where frob_update_context may do the
wrong thing for gcc generated calls via function pointer, which
typically looks like

 ld 0,0(r)
 std 2,40(1)
 mtctr 0
 ld 2,8(r)
 bctrl
 ld 2,40(1)

Here, if we are stopped after the "ld 2,8(r)" then r2 needs to be
restored from 40(1).
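
For readers matching the assembly above against the hex constants in
the patch below, here is a small sketch that derives the two r2
save/restore encodings from the DS-form layout.  The helper names are
made up for illustration; the field layout (6-bit primary opcode, two
5-bit register fields, 14-bit displacement, 2-bit extended opcode) is
the standard one for ld/std.

```c
#include <stdint.h>

/* Encode a 64-bit PowerPC DS-form instruction (ld/std family).
   OPCODE is the 6-bit primary opcode, XO the 2-bit extended opcode,
   and DISP a byte displacement that must be a multiple of 4.  */
static uint32_t
encode_ds_form (unsigned opcode, unsigned rt, unsigned ra,
                unsigned disp, unsigned xo)
{
  return ((uint32_t) opcode << 26) | ((uint32_t) rt << 21)
         | ((uint32_t) ra << 16) | (disp & 0xFFFCu) | (xo & 3u);
}

/* std 2,40(1): primary opcode 62, XO 0.  */
static uint32_t
encode_std_r2_save (void)
{
  return encode_ds_form (62, 2, 1, 40, 0);
}

/* ld 2,40(1): primary opcode 58, XO 0.  */
static uint32_t
encode_ld_r2_restore (void)
{
  return encode_ds_form (58, 2, 1, 40, 0);
}
```

encode_std_r2_save () yields 0xF8410028 and encode_ld_r2_restore ()
yields 0xE8410028, the values the unwinder compares against.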

The following patch fixes these two issues.  Ideally what I'd like to
do is have ld and gcc emit accurate r2 tracking unwind info and
dispense with hacks like frob_update_context.  If ld did emit accurate
unwind info for .glink, then the justification for frob_update_context
disappears.  The difficulty then is backwards compatibility.  You'd
need a way for the gcc unwinder to handle a mix of old code (that
needs frob_update_context) with new code (that doesn't).  One way to
accomplish this would be to set a dummy reg with initial CIE dwarf
instructions, then test this reg in frob_update_context.

Bootstrapped and regression tested powerpc64-linux.

* config/rs6000/linux-unwind.h (frob_update_context <__powerpc64__>):
Leave r2 REG_UNSAVED if stopped on the instruction that saves r2
in a plt call stub.  Do restore r2 if stopped on bctrl.

Index: libgcc/config/rs6000/linux-unwind.h
===
--- libgcc/config/rs6000/linux-unwind.h (revision 176780)
+++ libgcc/config/rs6000/linux-unwind.h (working copy)
@@ -346,10 +346,28 @@ frob_update_context (struct _Unwind_Cont
 figure out if it was saved.  The big problem here is that the
 code that does the save/restore is generated by the linker, so
 we have no good way to determine at compile time what to do.  */
-  unsigned int *insn
-   = (unsigned int *) _Unwind_GetGR (context, R_LR);
-  if (insn && *insn == 0xE8410028)
-   _Unwind_SetGRPtr (context, 2, context->cfa + 40);
+  if (pc[0] == 0xF8410028
+ || ((pc[0] & 0x) == 0x3D82
+ && pc[1] == 0xF8410028))
+   {
+ /* We are in a plt call stub or r2 adjusting long branch stub,
+before r2 has been saved.  Keep REG_UNSAVED.  */
+   }
+  else if (pc[0] == 0x4E800421
+  && pc[1] == 0xE8410028)
+   {
+ /* We are at the bctrl instruction in a call via function
+pointer.  gcc always emits the load of the new r2 just
+before the bctrl.  */
+ _Unwind_SetGRPtr (context, 2, context->cfa + 40);
+   }
+  else
+   {
+ unsigned int *insn
+   = (unsigned int *) _Unwind_GetGR (context, R_LR);
+ if (insn && *insn == 0xE8410028)
+   _Unwind_SetGRPtr (context, 2, context->cfa + 40);
+   }
 }
 #endif
 }


-- 
Alan Modra
Australia Development Lab, IBM


PATCH: PR target/49860: [x32] Error: cannot represent relocation type BFD_RELOC_64 in x32 mode

2011-07-26 Thread H.J. Lu
Hi,

The offsetted memory references always work for x32.  OK for trunk?
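
As a rough model of the condition being relaxed, the sketch below
mirrors only the innermost test of the first hunk.  The code model
checks are left out, and fits_signed_32 and symbol_offset_ok are
hypothetical names; fits_signed_32 is assumed to be equivalent to
"trunc_int_for_mode (offset, SImode) == offset".

```c
#include <stdbool.h>
#include <stdint.h>

/* True if VALUE survives truncation to a signed 32-bit integer.  */
static bool
fits_signed_32 (int64_t value)
{
  return value >= INT32_MIN && value <= INT32_MAX;
}

/* Without x32, the offset must stay below 16MB and fit in 32 bits.
   With x32, any offset is accepted.  */
static bool
symbol_offset_ok (bool target_x32, int64_t offset)
{
  return target_x32
         || (offset < 16 * 1024 * 1024 && fits_signed_32 (offset));
}
```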

Thanks.

H.J.

2011-07-26  H.J. Lu  

PR target/49860
* config/i386/predicates.md (x86_64_immediate_operand): Always
allow the offsetted memory references for TARGET_X32.
(x86_64_zext_immediate_operand): Likewise.

diff --git a/gcc/config/i386/predicates.md b/gcc/config/i386/predicates.md
index 0515519..7dc690a 100644
--- a/gcc/config/i386/predicates.md
+++ b/gcc/config/i386/predicates.md
@@ -197,8 +197,10 @@
  if ((ix86_cmodel == CM_SMALL
   || (ix86_cmodel == CM_MEDIUM
   && !SYMBOL_REF_FAR_ADDR_P (op1)))
- && offset < 16*1024*1024
- && trunc_int_for_mode (offset, SImode) == offset)
+ && (TARGET_X32
+ || (offset < 16*1024*1024
+ && (trunc_int_for_mode (offset, SImode)
+ == offset
return true;
  /* For CM_KERNEL we know that all object resist in the
 negative half of 32bits address space.  We may not
@@ -302,8 +304,11 @@
   || (ix86_cmodel == CM_MEDIUM
   && !SYMBOL_REF_FAR_ADDR_P (op1)))
  && CONST_INT_P (op2)
- && trunc_int_for_mode (INTVAL (op2), DImode) > -0x1
- && trunc_int_for_mode (INTVAL (op2), SImode) == INTVAL (op2))
+ && (TARGET_X32
+ || ((trunc_int_for_mode (INTVAL (op2), DImode)
+  > -0x1)
+ && (trunc_int_for_mode (INTVAL (op2), SImode)
+ == INTVAL (op2)
return true;
  /* ??? For the kernel, we may accept adjustment of
 -0x1000, since we know that it will just convert


PATCH: Don't allow nonmemory_operand on movabs for x32

2011-07-26 Thread H.J. Lu
Hi,

For x32, movabs is only supported with register and constant operands.
OK for trunk?

Thanks.


H.J.
2011-07-26  H.J. Lu  

* config/i386/predicates.md (x86_64_movabs_operand): Don't allow 
nonmemory_operand for TARGET_X32.

diff --git a/gcc/config/i386/predicates.md b/gcc/config/i386/predicates.md
index 0515519..7dc690a 100644
--- a/gcc/config/i386/predicates.md
+++ b/gcc/config/i386/predicates.md
@@ -393,7 +398,8 @@
 ;; Return true if OP is nonmemory operand acceptable by movabs patterns.
 (define_predicate "x86_64_movabs_operand"
   (if_then_else (not (and (match_test "TARGET_64BIT")
- (match_test "flag_pic")))
+ (ior (match_test "flag_pic")
+  (match_test "TARGET_X32"
 (match_operand 0 "nonmemory_operand")
 (ior (match_operand 0 "register_operand")
 (and (match_operand 0 "const_double_operand")


Re: [PATCH] [google] [annotalysis] Fix remove operation from pointer_set in case of hash collisions

2011-07-26 Thread Delesley Hutchins
Le-Chun added the additional routine to remove pointers from a set;
that code is unique to annotalysis.  I can't easily include a test
case, because the bug is difficult to trigger.  It occurs only when
there is a hash collision between two pointers in the set, and the
first pointer is removed before the second.  I do have a test case,
but it will only work for my particular build on my machine, since the
actual pointer addresses involved will change as soon as you touch
something.  I could write a unit test using bogus pointer values that
are engineered to trigger a collision, but it wouldn't be a normal
compiler test case; where would I put it?
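
The failure mode can be reproduced on a toy table.  This is only a
sketch, not pointer-set.c: the toy_* names, the 8-slot table and the
modulo hash are all invented for illustration, and the table is assumed
never to be full.  With that hash, the values 8 and 16 collide in
slot 0.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define N_SLOTS 8

/* Toy open-addressing set with linear probing; 0 marks an empty slot.  */
struct toy_set { uintptr_t slots[N_SLOTS]; };

static size_t
toy_hash (uintptr_t p)
{
  return p % N_SLOTS;
}

/* Return the slot holding P, or the empty slot where P would go.  */
static size_t
toy_find_slot (const struct toy_set *s, uintptr_t p)
{
  size_t n = toy_hash (p);
  while (s->slots[n] != 0 && s->slots[n] != p)
    n = (n + 1) % N_SLOTS;
  return n;
}

static void
toy_insert (struct toy_set *s, uintptr_t p)
{
  s->slots[toy_find_slot (s, p)] = p;
}

static bool
toy_contains (const struct toy_set *s, uintptr_t p)
{
  return s->slots[toy_find_slot (s, p)] == p;
}

/* Naive delete: clearing the slot breaks the probe chain, so later
   entries that collided with P become unreachable.  */
static void
toy_delete_naive (struct toy_set *s, uintptr_t p)
{
  size_t n = toy_find_slot (s, p);
  if (s->slots[n] == p)
    s->slots[n] = 0;
}

/* Delete, then re-insert every entry up to the next empty slot.  */
static void
toy_delete_rehash (struct toy_set *s, uintptr_t p)
{
  size_t n = toy_find_slot (s, p);
  if (s->slots[n] != p)
    return;
  s->slots[n] = 0;
  for (;;)
    {
      n = (n + 1) % N_SLOTS;
      uintptr_t q = s->slots[n];
      if (q == 0)
        break;
      s->slots[n] = 0;
      toy_insert (s, q);
    }
}
```

Inserting 8 and then 16 and deleting 8 with toy_delete_naive leaves 16
in the table but no longer findable; toy_delete_rehash moves it back to
its home slot.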

  -DeLesley

On Tue, Jul 26, 2011 at 5:59 PM, Diego Novillo  wrote:
> On Tue, Jul 26, 2011 at 16:13, Delesley Hutchins  wrote:
>> This patch fixes a bug in pointer_set.c, where removing a pointer from
>> a pointer set would corrupt the hash table if the pointer was involved
>> in any hash collisions.
>
> Could you include a test case?  It's not clear to me what you are
> fixing and when this happens.  Is this a bug in trunk as well?  The
> pointer-set implementation has been around for a while, so I'm
> surprised that you are running into this now.  Or is this something
> that only happens with the pointer set changes we have in for
> annotalysis?
>
>
> Thanks.  Diego.
>



-- 
DeLesley Hutchins | Software Engineer | deles...@google.com | 505-206-0315


[cxx-mem-model] __sync_mem builtin support patch 2/3 - code

2011-07-26 Thread Andrew MacLeod
This is the main patch which implements all the code for the new 
__sync_mem routines which take a memory model as a parameter.


I used the previously approved and checked in __sync_mem_exchange 
routine as the model and added all the rest. The only difference is I'm 
not adding the x86 patterns yet. I decided to implement just the generic 
routines first so that we will know they all pass the tests on their 
own. Then we'll get the target specific patterns for x86, ppc, arm and 
whatever else people are worried about separately.


Once these are in place, the c++ atomic.h wrappers can be changed to use 
them, and we'll finally have C++0x support.


A couple of caveats.

 * __sync_mem_compare_exchange has the skeleton in place, but not the 
guts.  There are some issues that rth and I will work out later; I just 
don't want to hold up the rest of the patch for that. Right now it will 
fail the compare_exchange tests.


 * I will revisit exactly where synchronization fences need to be 
issued for each of these routines later as well. This is a first cut and 
again I want to get the code into the codebase so other things can get 
started. Fine tuning can be made later.


Bootstraps on x86-64-unknown-linux-gnu and causes no new regressions.

Ok for the branch?



* expr.h (expand_sync_mem_exchange): Change parameter order.
(expand_sync_mem_*): New prototypes.
* optabs.h (DOI_sync_mem_*): Add new optable enums.
(sync_mem_*_optab): Add new #defines for table entries.
* genopinit.c (const optabs[]): Add direct optab handlers.
* optabs.c (expand_sync_mem_exchange): Change parameter order, and use
mem_thread_fence if it exists.
(expand_sync_mem_compare_exchange, expand_sync_mem_load,
expand_sync_mem_store, expand_sync_mem_fetch_op): New. Expand
__sync_mem functions which handle multiple integral types.
* builtins.c (maybe_convert_modes): New. Factor out common code for
ensuring an integer argument is in the proper mode.
(expand_builtin_sync_operation, expand_builtin_compare_and_swap,
expand_builtin_sync_lock_test_and_set): Use maybe_convert_modes.
(expand_builtin_sync_lock_release): Relocate higher in the file.
(get_memmodel): Don't assume the memmodel is the 3rd argument.
(expand_builtin_sync_mem_exchange): Change error check and use
maybe_convert_modes.
(expand_builtin_sync_mem_compare_exchange): New.
(expand_builtin_sync_mem_load, expand_builtin_sync_mem_store): New.
(expand_builtin_sync_mem_fetch_op): New.
(expand_builtin_sync_mem_flag_test_and_set): New.
(expand_builtin_sync_mem_flag_clear): New.
(expand_builtin_sync_mem_thread_fence): New.
(expand_builtin_sync_mem_signal_fence): New.
(expand_builtin): Handle BUILT_IN_SYNC_MEM_* types.
* c-family/c-common.c (resolve_overloaded_builtin): Handle
BUILT_IN_SYNC_MEM_* types.
* builtin-types.def (BT_FN_I{1,2,4,8,16}_VPTR_INT): New builtin type.
(BT_FN_VOID_VPTR_INT, BT_FN_BOOL_VPTR_INT): New builtin types.
(BT_FN_VOID_VPTR_I{1,2,4,8,16}_INT): New builtin type.
(BT_FN_BOOL_VPTR_PTR_I{1,2,4,8,16}_INT_INT): New builtin type.
* fortran/types.def (BT_FN_VOID_INT): New type.
(BT_FN_I{1,2,4,8,16}_VPTR_INT): New builtin type.
(BT_FN_VOID_VPTR_INT, BT_FN_BOOL_VPTR_INT): New builtin types.
(BT_FN_VOID_VPTR_I{1,2,4,8,16}_INT): New builtin type.
(BT_FN_BOOL_VPTR_PTR_I{1,2,4,8,16}_INT_INT): New builtin type.
* sync-builtins.def (BUILT_IN_SYNC_MEM_*): New sync builtins.

Index: expr.h
===
*** expr.h  (revision 175331)
--- expr.h  (working copy)
*** rtx expand_bool_compare_and_swap (rtx, r
*** 217,223 
  rtx expand_sync_operation (rtx, rtx, enum rtx_code);
  rtx expand_sync_fetch_operation (rtx, rtx, enum rtx_code, bool, rtx);
  rtx expand_sync_lock_test_and_set (rtx, rtx, rtx);
! rtx expand_sync_mem_exchange (enum memmodel, rtx, rtx, rtx);
  
  /* Functions from expmed.c:  */
  
--- 217,234 
  rtx expand_sync_operation (rtx, rtx, enum rtx_code);
  rtx expand_sync_fetch_operation (rtx, rtx, enum rtx_code, bool, rtx);
  rtx expand_sync_lock_test_and_set (rtx, rtx, rtx);
! 
! rtx expand_sync_mem_exchange (rtx, rtx, rtx, enum memmodel);
! rtx expand_sync_mem_compare_exchange (rtx, rtx, rtx, rtx, enum memmodel, 
! enum memmodel);
! rtx expand_sync_mem_load (rtx, rtx, enum memmodel);
! void expand_sync_mem_store (rtx, rtx, enum memmodel);
! rtx expand_sync_mem_fetch_op (rtx, rtx, rtx, enum rtx_code, enum memmodel);
! rtx expand_sync_mem_flag_test_and_set (rtx, rtx, enum memmodel);
! void expand_sync_mem_flag_clear (rtx, enum memmodel);
! void expand_sync_mem_thread_fence (enum memmodel);
! void expand_sync_mem_signal_fence (enum memmodel);
! 
  
  /* Function

[cxx-mem-model] __sync_mem builtin support patch 1/3 - documentation

2011-07-26 Thread Andrew MacLeod
This patch is simply the documentation for extend.texi which adds a 
section about the new memory model __sync_mem routines.  I've supplied 
the .info output since it's easier to read, followed by the patch.


OK for the branch?

Andrew

6.52 Built-in functions for memory model aware atomic operations.
=================================================================

 The following builtins approximately match the requirements for
C++1x memory model. Many are similar to the "__sync" prefixed builtins,
but all also have a memory model parameter.  These are all identified
by being prefixed with "__sync_mem", and most are overloaded such that
they work with multiple types.

 GCC will allow any integral scalar or pointer type that is 1, 2,
4, or 8 bytes in length. 16 bytes integral types are also allowed if
__int128_t is supported by the architecture.

 Target architectures are encouraged to provide their own patterns
for each of these builtins.  If no target pattern is provided, the original
non-memory model set of "__sync" atomic builtins will be utilized,
along with any required synchronization fences surrounding it in order
to achieve the proper behaviour.  Execution in this case is subject to
the same restrictions as those builtins.

 There are 6 different memory models which can be specified.  These
map to the same names in the C++1x standard.  Refer there or to the GCC
wiki on atomics for more detailed definitions.  These memory models
integrate both barriers to code motion as well as synchronization
requirements with other threads. These are listed in approximately
ascending order of strength.

`__SYNC_MEM_RELAXED'
  No barrier or synchronization.

`__SYNC_MEM_CONSUME'
  Data dependency only for both barrier and synchronization
  with another thread.

`__SYNC_MEM_ACQUIRE'
  Barrier to hoisting of code and synchronizes with stores from
  another thread.

`__SYNC_MEM_RELEASE'
  Barrier to sinking of code and synchronizes with loads from
  another thread.

`__SYNC_MEM_ACQ_REL'
  Full barrier in both directions and synchronizes with loads
  and stores in another thread.

`__SYNC_MEM_SEQ_CST'
  Full barrier in both directions and synchronizes all loads
  and stores in all threads.

 When implementing patterns for these builtins, the memory model
parameter can be ignored as long as the pattern implements the most
restrictive __SYNC_MEM_SEQ_CST model. Any of the other memory models
will execute correctly with this memory model but they may not execute
as efficiently as they could with a more appropriate implementation of
the relaxed requirements.

`TYPE __sync_mem_load (TYPE *ptr, int memmodel)'
 This builtin implements an atomic load operation.  It returns the
 contents of `*PTR'.

 The valid memory model variants are __SYNC_MEM_RELAXED,
 __SYNC_MEM_SEQ_CST, __SYNC_MEM_ACQUIRE, and __SYNC_MEM_CONSUME.

`void __sync_mem_store (TYPE *ptr, TYPE val, int memmodel)'
 This builtin implements an atomic store operation.  It writes `VAL'
 into `*PTR'.

 The valid memory model variants are __SYNC_MEM_RELAXED,
 __SYNC_MEM_SEQ_CST, and __SYNC_MEM_RELEASE.
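
 As an illustration only, the load/store pair can be sketched with
 the C11 <stdatomic.h> interface, used here as a stand-in for the
 `__sync_mem' builtins.  The mapping of __SYNC_MEM_RELEASE to
 memory_order_release and __SYNC_MEM_ACQUIRE to memory_order_acquire
 is an assumption based on the shared names; publish and observe are
 hypothetical helpers.

```c
#include <stdatomic.h>

/* Store VALUE so that earlier writes in this thread are visible to
   any thread that later observes it with an acquire load.  */
static void
publish (atomic_int *box, int value)
{
  atomic_store_explicit (box, value, memory_order_release);
}

/* Acquire load: later reads in this thread cannot be hoisted above it.  */
static int
observe (atomic_int *box)
{
  return atomic_load_explicit (box, memory_order_acquire);
}
```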

`TYPE __sync_mem_exchange (TYPE *ptr, TYPE val, int memmodel)'
 This builtin implements an atomic exchange operation.  It writes
 VAL into `*PTR', and returns the previous contents of `*PTR'.

 The valid memory model variants are __SYNC_MEM_RELAXED,
 __SYNC_MEM_SEQ_CST, __SYNC_MEM_ACQUIRE, __SYNC_MEM_RELEASE, and
 __SYNC_MEM_ACQ_REL.

`bool __sync_mem_compare_exchange (TYPE *ptr, TYPE *expected, TYPE desired, int 
success_memmodel, int failure_memmodel)'
 This builtin implements an atomic compare_exchange operation.
 This compares the contents of `*PTR' with the contents of
 `*EXPECTED' and if equal, writes DESIRED into `*PTR'.  If they are
 not equal, the current contents of `*PTR' are written into
 `*EXPECTED'.

 True is returned if DESIRED is written into `*PTR' and the
 execution is considered to conform to the memory model specified by
 SUCCESS_MEMMODEL.  There are no restrictions on what memory model
 can be used here.

 False is returned otherwise, and the execution is considered to
 conform to FAILURE_MEMMODEL. This memory model cannot be
 __SYNC_MEM_RELEASE nor __SYNC_MEM_ACQ_REL.  It also cannot be a
 stronger model than that specified by SUCCESS_MEMMODEL.
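
 The success/failure behaviour can be sketched with the C11
 atomic_compare_exchange_strong_explicit function, used here as an
 analogue rather than as the builtin itself.  try_claim is a
 hypothetical helper; it uses an acq_rel success model and a weaker
 acquire failure model, which respects the restrictions listed above.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Try to replace *EXPECTED with DESIRED in *SLOT.  On failure,
   *EXPECTED is updated with the value actually found in *SLOT.  */
static bool
try_claim (atomic_int *slot, int *expected, int desired)
{
  return atomic_compare_exchange_strong_explicit (slot, expected, desired,
                                                  memory_order_acq_rel,
                                                  memory_order_acquire);
}
```

 A caller typically loops, retrying with the refreshed *EXPECTED value
 until the exchange succeeds.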

`TYPE __sync_mem_fetch_add (TYPE *ptr, TYPE val, int memmodel)'
`TYPE __sync_mem_fetch_sub (TYPE *ptr, TYPE val, int memmodel)'
`TYPE __sync_mem_fetch_and (TYPE *ptr, TYPE val, int memmodel)'
`TYPE __sync_mem_fetch_xor (TYPE *ptr, TYPE val, int memmodel)'
`TYPE __sync_mem_fetch_or (TYPE *ptr, TYPE val, int memmodel)'
 These builtins perform the operation suggested by the name, and
 return the value that had previously been in `*PTR'.  That is,

  { tmp = *ptr; *ptr OP

Re: [PATCH] [google] [annotalysis] Fix remove operation from pointer_set in case of hash collisions

2011-07-26 Thread Diego Novillo
On Tue, Jul 26, 2011 at 16:13, Delesley Hutchins  wrote:
> This patch fixes a bug in pointer_set.c, where removing a pointer from
> a pointer set would corrupt the hash table if the pointer was involved
> in any hash collisions.

Could you include a test case?  It's not clear to me what you are
fixing and when this happens.  Is this a bug in trunk as well?  The
pointer-set implementation has been around for a while, so I'm
surprised that you are running into this now.  Or is this something
that only happens with the pointer set changes we have in for
annotalysis?


Thanks.  Diego.


[PATCH] [google] [annotalysis] Fix remove operation from pointer_set in case of hash collisions

2011-07-26 Thread Delesley Hutchins
This patch fixes a bug in pointer_set.c, where removing a pointer from
a pointer set would corrupt the hash table if the pointer was involved
in any hash collisions.

Bootstrapped and passed gcc regression testsuite on x86_64-unknown-linux-gnu.

Okay for google/gcc-4_6?

  -DeLesley

gcc/Changelog.annotalysis:
2011-7-26  DeLesley Hutchins  

* gcc/pointer-set.c (pointer_set_delete): Fix removal in case of
hash collisions.


Index: gcc/pointer-set.c
===
--- gcc/pointer-set.c   (revision 176809)
+++ gcc/pointer-set.c   (working copy)
@@ -192,19 +192,16 @@ int
 pointer_set_delete (struct pointer_set_t *pset, const void *p)
 {
   size_t n = hash1 (p, pset->n_slots, pset->log_slots);
+  size_t n2;
+  const void* ptr;

+  /* find location of p */
   while (true)
 {
   if (pset->slots[n] == p)
-{
-  pset->slots[n] = 0;
-  --pset->n_elements;
-  return 1;
-}
+break;
   else if (pset->slots[n] == 0)
-{
-  return 0;
-}
+return 0;
   else
 {
   ++n;
@@ -212,6 +209,29 @@ pointer_set_delete (struct pointer_set_t *pset, co
 n = 0;
 }
 }
+
+  /* Remove p from set. */
+  pset->slots[n] = 0;
+
+  /* Now we need to scan forward and re-hash every value that we encounter,
+ until we find an empty slot.
+   */
+  while (true)
+{
+  ++n;
+  if (n >= pset->n_slots)
+n = 0;
+  ptr = pset->slots[n];
+
+  if (ptr == 0) break;
+
+  pset->slots[n] = 0;  /* remove ptr from set. */
+  n2 = insert_aux(ptr, pset->slots, pset->n_slots, pset->log_slots);
+  pset->slots[n2] = ptr;   /* put ptr back in set. */
+}
+
+  --pset->n_elements;
+  return 1;
 }

 /* Pass each pointer in PSET to the function in FN, together with the fixed


-- 
DeLesley Hutchins | Software Engineer | deles...@google.com | 505-206-0315


Re: [pph] Save pending and specialized templates (issue4814054)

2011-07-26 Thread Gabriel Charette
See comments inline.

> +
> +/* PPH write/read */
> +
> +
> +/* Emit a tinst_level list TINST to STREAM.  */
> +
> +static void
> +pph_out_tinst_level (pph_stream *stream, struct tinst_level *tinst)
> +{
> +  int count;
> +  struct tinst_level *cur;
> +
> +  /* Count the number of items.  */
> +  for (cur = tinst; cur != NULL;  cur = cur->next )
> +    ++count;
> +
> +  /* Now emit them.  */
> +  pph_out_uint (stream, count);
> +  for (cur = tinst; cur != NULL;  cur = cur->next )
> +    {
> +      pph_out_tree (stream, cur->decl);
> +      pph_out_location (stream, cur->locus);
> +      pph_out_uint (stream, cur->errors);
> +      pph_out_uint (stream, cur->in_system_header_p);
> +    }
> +}
> +
> +/* Dump a tinst_level list TINST to STREAM.  */
> +
> +static void
> +pph_dump_tinst_level (FILE *stream, struct tinst_level *tinst)
> +{
> +  int count;
> +  struct tinst_level *cur;
> +
> +  /* Count the number of items.  */
> +  for (cur = tinst; cur != NULL;  cur = cur->next )
> +    ++count;
> +
> +  /* Now dump them.  */
> +  fprintf (stream, "%d tinst_levels\n", count );
> +  for (cur = tinst; cur != NULL;  cur = cur->next )
> +    {
> +      pph_dump_tree_name (stream, cur->decl, 0);
> +      /* pph_dump_location (stream, cur->locus); */
> +      fprintf (stream, "%d errors, ", cur->errors );
> +      fprintf (stream, "%d in system header\n", cur->in_system_header_p );
> +    }
> +}
> +
> +/* Load a tinst_level list.  */
> +
> +static struct tinst_level *
> +pph_in_tinst_level (pph_stream *stream)
> +{
> +  struct tinst_level *last = NULL;
> +  unsigned count = pph_in_uint (stream);
> +  /* FIXME pph: This leaves the list in reverse order.  Issue?  */
> +  for (; count > 0; --count)
> +    {
> +      struct tinst_level *cur = ggc_alloc_tinst_level ();
> +      cur->next = last;

-      cur->next = last;
+ if(last)
+   last->next = cur;

> +      cur->decl = pph_in_tree (stream);
> +      cur->locus = pph_in_location (stream);
> +      cur->errors = pph_in_uint (stream);
> +      cur->in_system_header_p = pph_in_uint (stream);
> +      last = cur;
> +    }

if (last)
  last->next = NULL;

> +  return last;
> +}
> +

If you do the changes above, the list won't be in reverse order.
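
A generic sketch of reading a list back in stream order follows.  It
is not pph code: struct node and read_list_in_order are hypothetical,
an int array stands in for the stream, and malloc stands in for GC
allocation.  Note that it tracks a separate head pointer, which is an
addition made here so that the function can return the first element
rather than the last.

```c
#include <stddef.h>
#include <stdlib.h>

struct node { int value; struct node *next; };

/* Build a list whose order matches VALUES[0..COUNT-1].  */
static struct node *
read_list_in_order (const int *values, size_t count)
{
  struct node *head = NULL, *tail = NULL;
  for (size_t i = 0; i < count; i++)
    {
      struct node *cur = malloc (sizeof *cur);
      if (cur == NULL)
        abort ();
      cur->value = values[i];
      cur->next = NULL;
      /* Append after the current tail, or start the list.  */
      if (tail)
        tail->next = cur;
      else
        head = cur;
      tail = cur;
    }
  return head;
}
```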

Also, if you do it as I did in pph_in_cxx_binding, using the cache
markers, you don't need to output count; you can simply use the cache
mechanism, which will know how to output/input NULL (I used that in
pph_in/out_cxx_binding).

> +
> +/* Load and merge a spec_entry TABLE from STREAM.  */
> +
> +static void
> +pph_in_spec_entry_htab (pph_stream *stream, htab_t *table)
> +{
> +  spec_entry **slot = NULL;

This variable is shadowed by

> +  unsigned count = pph_in_uint (stream);
> +  if (flag_pph_debug >= 2)
> +    fprintf (stderr, "loading %d spec_entries\n", count );
> +  for (; count > 0; --count)
> +    {
> +      hashval_t hash;
> +      spec_entry **slot;

...this one??

> +      struct spec_entry *se = ggc_alloc_spec_entry ();
> +      se->tmpl = pph_in_tree (stream);
> +      se->args = pph_in_tree (stream);
> +      se->spec = pph_in_tree (stream);
> +      hash = hash_specialization (se);
> +      slot = htab_find_slot_with_hash (*table, se, hash, INSERT);
> +      *slot = se;
> +    }
> +}
> +



> Index: gcc/cp/pph-streamer.c
> ===
> --- gcc/cp/pph-streamer.c       (revision 176778)
> +++ gcc/cp/pph-streamer.c       (working copy)
> @@ -169,6 +169,7 @@ enum pph_trace_type
>     PPH_TRACE_UINT,
>     PPH_TRACE_BYTES,
>     PPH_TRACE_STRING,
> +    PPH_TRACE_LOCATION,
>     PPH_TRACE_CHAIN,
>     PPH_TRACE_BITPACK
>  };
> @@ -244,6 +245,14 @@ pph_trace (pph_stream *stream, const voi
>        fprintf (pph_logfile, ", NULL_STRING");
>       break;
>
> +    case PPH_TRACE_LOCATION:
> +      if (data)
> +       fprintf (pph_logfile, ", value=%.*s",
> +                              (int) nbytes, (const char *) data);
> +      else
> +       fprintf (pph_logfile, ", NULL_LOCATION");
> +      break;
> +
>     case PPH_TRACE_CHAIN:
>       {
>        const_tree t = (const_tree) data;
> @@ -316,6 +325,26 @@ pph_trace_string_with_length (pph_stream
>  }
>
>
> +/* Show tracing information for location_t LOC on STREAM.  */
> +
> +void
> +pph_trace_location (pph_stream *stream, location_t loc)
> +{
> +  char dec[10]; /* ten digits per file line number */
> +  expanded_location xloc = expand_location (loc);
> +  size_t flen = strlen (xloc.file);
> +  size_t mlen = flen + 12; /* for : and 10 digits and \n */
> +  size_t llen;
> +  char *str = xmalloc (mlen);
> +
> +  strcpy (str, xloc.file);
> +  str[flen] = ':';
> +  sprintf (str + flen + 1, "%d", xloc.line);
> +  llen = strlen (str);
> +  pph_trace (stream, str, llen, PPH_TRACE_LOCATION);
> +}
> +
> +
>  /* Show tracing information for a tree chain starting with T on STREAM.  */
>
>  void
> Index: gcc/cp/pph-streamer.h
> ===
> --- gcc/cp/pph-streame

[PATCH] Propagate source locations from function_decls to their template_decls

2011-07-26 Thread Jeffrey Yasskin
This patch copies the source location of a FUNCTION_DECL to the
TEMPLATE_DECL that build_template_decl() builds out of it. Otherwise,
the TEMPLATE_DECL's location becomes input_location, which is the end
of the parameter list, while the FUNCTION_DECL's location is the
location of the name of the function. Depending on what order
templates are defined and used, gcc may emit either the
FUNCTION_DECL's or TEMPLATE_DECL's location into the debug location,
which causes gold's ODR checker to emit false positives.

Tested with a bootstrap+`make -k check-c++` on
x86_64-unknown-linux-gnu. I'm looking to check it in to trunk, and
will propagate it to the gcc-4_6-branch if you think that's the right
thing to do.

No more tests fail than in
http://gcc.gnu.org/ml/gcc-testresults/2011-07/msg02995.html.

gcc/cp/ChangeLog:
2011-07-26   Jeffrey Yasskin  

* pt.c (build_template_decl): Copy the function_decl's source
location to the new template_decl.

gcc/testsuite/ChangeLog:
2011-07-26   Jeffrey Yasskin  

* g++.old-deja/g++.pt/crash60.C: Updated.

libstdc++-v3/ChangeLog:
2011-07-26   Jeffrey Yasskin  

* testsuite/20_util/weak_ptr/comparison/cmp_neg.cc: Updated.
diff --git a/gcc/cp/pt.c b/gcc/cp/pt.c
index 178685c..b9e09af 100644
--- a/gcc/cp/pt.c
+++ b/gcc/cp/pt.c
@@ -4121,6 +4121,7 @@ build_template_decl (tree decl, tree parms, bool member_template_p)
   tree tmpl = build_lang_decl (TEMPLATE_DECL, DECL_NAME (decl), NULL_TREE);
   DECL_TEMPLATE_PARMS (tmpl) = parms;
   DECL_CONTEXT (tmpl) = DECL_CONTEXT (decl);
+  DECL_SOURCE_LOCATION (tmpl) = DECL_SOURCE_LOCATION (decl);
   DECL_MEMBER_TEMPLATE_P (tmpl) = member_template_p;
 
   return tmpl;
diff --git a/gcc/testsuite/g++.old-deja/g++.pt/crash60.C b/gcc/testsuite/g++.old-deja/g++.pt/crash60.C
index 747af9b..1be4678 100644
--- a/gcc/testsuite/g++.old-deja/g++.pt/crash60.C
+++ b/gcc/testsuite/g++.old-deja/g++.pt/crash60.C
@@ -5,9 +5,9 @@
 // We ICE'd rather than fail to instantiate.
 
 template< typename SID, class SDR >
-void k( SID sid, SDR* p,
+void k( SID sid, SDR* p,	// { dg-error "no type named 'T'" }
  void (SDR::*)
- ( typename SID::T ) );		// { dg-error "no type named 'T'" }
+ ( typename SID::T ) );
 
 struct E { };
 struct S { void f( int ); };
diff --git a/libstdc++-v3/testsuite/20_util/weak_ptr/comparison/cmp_neg.cc b/libstdc++-v3/testsuite/20_util/weak_ptr/comparison/cmp_neg.cc
index df18712..6eecc2d 100644
--- a/libstdc++-v3/testsuite/20_util/weak_ptr/comparison/cmp_neg.cc
+++ b/libstdc++-v3/testsuite/20_util/weak_ptr/comparison/cmp_neg.cc
@@ -44,16 +44,16 @@ main()
 
 // { dg-warning "note" "" { target *-*-* } 370 }
 // { dg-warning "note" "" { target *-*-* } 365 }
-// { dg-warning "note" "" { target *-*-* } 357 }
+// { dg-warning "note" "" { target *-*-* } 356 }
 // { dg-warning "note" "" { target *-*-* } 1103 }
 // { dg-warning "note" "" { target *-*-* } 1098 }
-// { dg-warning "note" "" { target *-*-* } 1090 }
+// { dg-warning "note" "" { target *-*-* } 1089 }
 // { dg-warning "note" "" { target *-*-* } 485 }
 // { dg-warning "note" "" { target *-*-* } 479 }
-// { dg-warning "note" "" { target *-*-* } 469 }
-// { dg-warning "note" "" { target *-*-* } 814 }
-// { dg-warning "note" "" { target *-*-* } 1056 }
-// { dg-warning "note" "" { target *-*-* } 1050 }
-// { dg-warning "note" "" { target *-*-* } 342 }
-// { dg-warning "note" "" { target *-*-* } 292 }
+// { dg-warning "note" "" { target *-*-* } 468 }
+// { dg-warning "note" "" { target *-*-* } 813 }
+// { dg-warning "note" "" { target *-*-* } 1055 }
+// { dg-warning "note" "" { target *-*-* } 1049 }
+// { dg-warning "note" "" { target *-*-* } 341 }
+// { dg-warning "note" "" { target *-*-* } 291 }
 // { dg-warning "note" "" { target *-*-* } 224 }


Re: [PATCH] Fix one .debug_macro bug and fix -g3 on non-HAVE_AS_DWARF2_DEBUG_LINE (both .debug_macro and .debug_macinfo)

2011-07-26 Thread Richard Henderson
On 07/26/2011 01:38 PM, Jakub Jelinek wrote:
>   * dwarf2out.c (output_macinfo_op): Ensure fd->filename points
>   to GC allocated copy of the string.
>   (dwarf2out_finish): Emit .debug_macinfo or .debug_macro sections
>   before .debug_line, not after it.

Ok.


r~


Re: [PATCH] Fix one .debug_macro bug and fix -g3 on non-HAVE_AS_DWARF2_DEBUG_LINE (both .debug_macro and .debug_macinfo)

2011-07-26 Thread Jakub Jelinek
On Tue, Jul 26, 2011 at 10:38:09PM +0200, Jakub Jelinek wrote:
> With that I've discovered that lookup_filename assumes that the string
> it is called in is kept around, as it stores just the pointer and not a copy
> of that string.  It seems all other places that call lookup_filename already
> call it with somehow persistent string and before my .debug_macro patch so
> did output_macinfo, because it never freed the malloced strings.
> The reason I've added the freeing was that I sometimes reuse the pointers
> for something else (the transparent_include group names) and valgrind etc.
> would be unhappy about unreachable malloced blocks not being freed, albeit
> so close to the end of the compilation.
> Instead of copying the string always the following patch just copies it
> if it stored that pointer (first hunk).

Forgot to add: what this newly introduced bug caused was that, when using
.file/.loc directives, if the same file was included more than once the
second and following copies often wouldn't share the same .file number
with the older one, and thus .debug_line grew unnecessarily.  It could
also, rarely, use a wrong file name table index, if the
freed memory contained something that matched some filename.  And of
course it could crash, as can any read from freed memory.

Jakub


[google] Backport r176188, r176489 from google/main to google/gcc-4_6 branch.

2011-07-26 Thread Delesley Hutchins
Committed.

-- 
DeLesley Hutchins | Software Engineer | deles...@google.com | 505-206-0315


Re: PATCH: PR target/47372: [x32] internal compiler error: in simplify_subreg, at simplify-rtx.c:5222

2011-07-26 Thread Uros Bizjak
On Tue, Jul 26, 2011 at 10:33 PM, H.J. Lu  wrote:
> On Tue, Jul 26, 2011 at 1:29 PM, Jakub Jelinek  wrote:
>> On Tue, Jul 26, 2011 at 10:21:11PM +0200, Uros Bizjak wrote:
>>> This also works, we look at orig_x that looks like:
>>>
>>> (mem/u/c:SI (const:DI (unspec:DI [
>>>                 (symbol_ref:SI ("__sflush") [flags 0x41]
>>> )
>>>             ] UNSPEC_GOTPCREL)) [2 S4 A8])
>>>
>>> So, we look at SImode load, and compare it with SImode (actually
>>> ptr_mode) symbol. Will your suggestion work with this RTX?
>>
>> Then
>>      if (GET_MODE (orig_x) != GET_MODE (x))
>>        {
>>          x = simplify_gen_subreg (GET_MODE (orig_x), x, GET_MODE (x), 0);
>>          if (x == NULL_RTX)
>>            return orig_x;
>>        }
>> will work, orig_x is the above SImode MEM, x is (symbol_ref:SI ("__sflush")
>> [flags 0x41] )
>> thus the modes are the same and no simplify_gen_subreg needs to be done, the
>> mode is already right.
>>
>
> This works for my testcase. I will do a full test.

Also OK for mainline, with suitable ChangeLog and bootstrap/regression test.

BTW: I'm thinking of removing this check from ix86_expand_move:

@@ -15034,7 +15034,6 @@ ix86_expand_move (enum machine_mode mode
 }

   if ((flag_pic || MACHOPIC_INDIRECT)
-  && (mode == SImode || mode == DImode)
   && symbolic_operand (op1, mode))
 {
   if (TARGET_MACHO && !TARGET_64BIT)

There is no way symbolic_operand would be in different mode than SImode/DImode.

Uros.


[PATCH] Fix one .debug_macro bug and fix -g3 on non-HAVE_AS_DWARF2_DEBUG_LINE (both .debug_macro and .debug_macinfo)

2011-07-26 Thread Jakub Jelinek
Hi!

I've noticed that -g3 (both old .debug_macinfo and new .debug_macro) doesn't
work correctly when HAVE_AS_DWARF2_DEBUG_LINE isn't defined, because
the .debug_line section is emitted before .debug_mac{ro,info} and thus
if any DW_MACINFO_start_file/DW_MACRO_GNU_start_file ops need a file
that wasn't seen so far, it will reference a .debug_line filename table
entry that isn't present.
Fixed by the second and third hunk, by just emitting .debug_line after
.debug_macro/.debug_macinfo.

With that I've discovered that lookup_filename assumes that the string
it is called with is kept around, as it stores just the pointer and not a copy
of that string.  It seems all other places that call lookup_filename already
call it with a somehow persistent string, and before my .debug_macro patch so
did output_macinfo, because it never freed the malloced strings.
The reason I've added the freeing was that I sometimes reuse the pointers
for something else (the transparent_include group names) and valgrind etc.
would be unhappy about unreachable malloced blocks not being freed, albeit
so close to the end of the compilation.
Instead of always copying the string, the following patch copies it only
if lookup_filename stored that pointer (first hunk).
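
The lifetime issue described above can be reproduced in miniature.  The
following is only a sketch under stated assumptions, not GCC code:
file_entry, file_table, lookup_name and lookup_and_pin are hypothetical
names, and a malloc-based copy stands in for ggc_strdup.  It shows a table
that captures the caller's pointer, and a caller that copies the string
only when its own pointer is the one that got stored.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for a filename table entry; this is not
   GCC's struct dwarf_file_data.  */
struct file_entry
{
  const char *filename;
};

#define MAX_FILES 16
static struct file_entry file_table[MAX_FILES];
static int file_count;

/* Return the entry for NAME, creating one if needed.  A new entry
   stores only the caller's pointer, not a copy of the string.  */
static struct file_entry *
lookup_name (const char *name)
{
  int i;

  for (i = 0; i < file_count; i++)
    if (strcmp (file_table[i].filename, name) == 0)
      return &file_table[i];
  if (file_count == MAX_FILES)
    return NULL;
  file_table[file_count].filename = name;
  return &file_table[file_count++];
}

/* Look NAME up and, only if the table captured this very pointer,
   replace it with a private copy so the caller may free NAME.  */
static struct file_entry *
lookup_and_pin (char *name)
{
  struct file_entry *fe = lookup_name (name);

  if (fe != NULL && fe->filename == name)
    {
      char *copy = malloc (strlen (name) + 1);

      if (copy == NULL)
        return NULL;
      strcpy (copy, name);
      fe->filename = copy;
    }
  return fe;
}
```

With this shape, a second lookup of an equal string finds the existing
entry and no further copy is made, which is the saving over copying
unconditionally.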

Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?

2011-07-26  Jakub Jelinek  

* dwarf2out.c (output_macinfo_op): Ensure fd->filename points
to GC allocated copy of the string.
(dwarf2out_finish): Emit .debug_macinfo or .debug_macro sections
before .debug_line, not after it.

--- gcc/dwarf2out.c.jj  2011-07-25 11:28:32.0 +0200
+++ gcc/dwarf2out.c 2011-07-26 16:19:56.0 +0200
@@ -20552,11 +20552,15 @@ output_macinfo_op (macinfo_entry *ref)
   size_t len;
   struct indirect_string_node *node;
   char label[MAX_ARTIFICIAL_LABEL_BYTES];
+  struct dwarf_file_data *fd;
 
   switch (ref->code)
 {
 case DW_MACINFO_start_file:
-  file_num = maybe_emit_file (lookup_filename (ref->info));
+  fd = lookup_filename (ref->info);
+  if (fd->filename == ref->info)
+   fd->filename = ggc_strdup (fd->filename);
+  file_num = maybe_emit_file (fd);
   dw2_asm_output_data (1, DW_MACINFO_start_file, "Start new file");
   dw2_asm_output_data_uleb128 (ref->lineno,
   "Included from line number %lu", 
@@ -22637,6 +22641,16 @@ dwarf2out_finish (const char *filename)
   output_ranges ();
 }
 
+  /* Have to end the macro section.  */
+  if (debug_info_level >= DINFO_LEVEL_VERBOSE)
+{
+  switch_to_section (debug_macinfo_section);
+  ASM_OUTPUT_LABEL (asm_out_file, macinfo_section_label);
+  if (!VEC_empty (macinfo_entry, macinfo_table))
+   output_macinfo ();
+  dw2_asm_output_data (1, 0, "End compilation unit");
+}
+
   /* Output the source line correspondence table.  We must do this
  even if there is no line information.  Otherwise, on an empty
  translation unit, we will generate a present, but empty,
@@ -22648,16 +22662,6 @@ dwarf2out_finish (const char *filename)
   if (! DWARF2_ASM_LINE_DEBUG_INFO)
 output_line_info ();
 
-  /* Have to end the macro section.  */
-  if (debug_info_level >= DINFO_LEVEL_VERBOSE)
-{
-  switch_to_section (debug_macinfo_section);
-  ASM_OUTPUT_LABEL (asm_out_file, macinfo_section_label);
-  if (!VEC_empty (macinfo_entry, macinfo_table))
-output_macinfo ();
-  dw2_asm_output_data (1, 0, "End compilation unit");
-}
-
   /* If we emitted any DW_FORM_strp form attribute, output the string
  table too.  */
   if (debug_str_hash)

Jakub


Re: PATCH: PR target/47372: [x32] internal compiler error: in simplify_subreg, at simplify-rtx.c:5222

2011-07-26 Thread H.J. Lu
On Tue, Jul 26, 2011 at 1:29 PM, Jakub Jelinek  wrote:
> On Tue, Jul 26, 2011 at 10:21:11PM +0200, Uros Bizjak wrote:
>> This also works, we look at orig_x that looks like:
>>
>> (mem/u/c:SI (const:DI (unspec:DI [
>>                 (symbol_ref:SI ("__sflush") [flags 0x41]
>> )
>>             ] UNSPEC_GOTPCREL)) [2 S4 A8])
>>
>> So, we look at SImode load, and compare it with SImode (actually
>> ptr_mode) symbol. Will your suggestion work with this RTX?
>
> Then
>      if (GET_MODE (orig_x) != GET_MODE (x))
>        {
>          x = simplify_gen_subreg (GET_MODE (orig_x), x, GET_MODE (x), 0);
>          if (x == NULL_RTX)
>            return orig_x;
>        }
> will work, orig_x is the above SImode MEM, x is (symbol_ref:SI ("__sflush")
> [flags 0x41] )
> thus the modes are the same and no simplify_gen_subreg needs to be done, the
> mode is already right.
>

This works for my testcase. I will do a full test.

Thanks.


-- 
H.J.


Re: PATCH: PR target/47372: [x32] internal compiler error: in simplify_subreg, at simplify-rtx.c:5222

2011-07-26 Thread Jakub Jelinek
On Tue, Jul 26, 2011 at 10:21:11PM +0200, Uros Bizjak wrote:
> This also works, we look at orig_x that looks like:
> 
> (mem/u/c:SI (const:DI (unspec:DI [
> (symbol_ref:SI ("__sflush") [flags 0x41]
> )
> ] UNSPEC_GOTPCREL)) [2 S4 A8])
> 
> So, we look at SImode load, and compare it with SImode (actually
> ptr_mode) symbol. Will your suggestion work with this RTX?

Then 
  if (GET_MODE (orig_x) != GET_MODE (x))
{
  x = simplify_gen_subreg (GET_MODE (orig_x), x, GET_MODE (x), 0);
  if (x == NULL_RTX)
return orig_x;
}
will work, orig_x is the above SImode MEM, x is (symbol_ref:SI ("__sflush")
[flags 0x41] )
thus the modes are the same and no simplify_gen_subreg needs to be done, the
mode is already right.

Jakub


Re: PATCH: PR target/47372: [x32] internal compiler error: in simplify_subreg, at simplify-rtx.c:5222

2011-07-26 Thread Uros Bizjak
On Tue, Jul 26, 2011 at 10:12 PM, Jakub Jelinek  wrote:
> On Tue, Jul 26, 2011 at 10:05:06PM +0200, Uros Bizjak wrote:
>> > 2011-07-26  H.J. Lu  
>> >
>> >        PR target/47372
>> >        * config/i386/i386.c (ix86_delegitimize_address): Call
>> >        simplify_gen_subreg for PIC with ptr_mode only if modes of
>> >        x and orig_x are different.
>> >
>> > diff --git a/gcc/config/i386/i386.c b/gcc/config/i386/i386.c
>> > index 429cd62..9c52aa3 100644
>> > --- a/gcc/config/i386/i386.c
>> > +++ b/gcc/config/i386/i386.c
>> > @@ -12967,9 +12982,10 @@ ix86_delegitimize_address (rtx x)
>> >          || !MEM_P (orig_x))
>> >        return ix86_delegitimize_tls_address (orig_x);
>> >       x = XVECEXP (XEXP (x, 0), 0, 0);
>
> When x is no longer known to be Pmode
>
>> > -      if (GET_MODE (orig_x) != Pmode)
>> > +      if (GET_MODE (orig_x) != GET_MODE (x)
>> > +         && GET_MODE (orig_x) != ptr_mode)
>
> why not simply just
>        if (GET_MODE (orig_x) != GET_MODE (x))
>
>> >        {
>> > -         x = simplify_gen_subreg (GET_MODE (orig_x), x, Pmode, 0);
>> > +         x = simplify_gen_subreg (GET_MODE (orig_x), x, ptr_mode, 0);
>
> and using GET_MODE (x) instead of Pmode/ptr_mode here?  I mean,
> x is certainly not VOIDmode here, should be either SImode or DImode
> and thus simplify_gen_subreg ought to work for it.

This also works, we look at orig_x that looks like:

(mem/u/c:SI (const:DI (unspec:DI [
(symbol_ref:SI ("__sflush") [flags 0x41]
)
] UNSPEC_GOTPCREL)) [2 S4 A8])

So, we look at SImode load, and compare it with SImode (actually
ptr_mode) symbol. Will your suggestion work with this RTX?

Thanks,
Uros.


[PATCH] Improve call site argument debug info for floating point stack arguments (PR debug/49846)

2011-07-26 Thread Jakub Jelinek
Hi!

Double arguments passed on the stack on x86_64 (and float too), where
a function is called with a constant, are stored using the corresponding
integer mode rather than DFmode, so cselib_lookup doesn't find the preserved
value for them.  Fixed thusly, bootstrapped/regtested on x86_64-linux and
i686-linux, ok for trunk?
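
To illustrate why a lookup in the floating-point mode can miss such a
store, here is a small sketch in plain C (not GCC internals).  It assumes
an 8-byte IEEE 754 double; double_bits, store_double_as_int and
load_double are hypothetical helper names.  The slot is written as a
64-bit integer and read back as a double, so the same bytes are seen
under two different "modes".

```c
#include <stdint.h>
#include <string.h>

/* Return the 64-bit integer whose object representation equals that
   of VALUE.  memcpy avoids the aliasing problems of a pointer cast.
   Assumes sizeof (double) == sizeof (uint64_t).  */
static uint64_t
double_bits (double value)
{
  uint64_t bits;

  memcpy (&bits, &value, sizeof bits);
  return bits;
}

/* Write VALUE into an 8-byte slot as an integer store of the same
   size, mimicking the integer-mode store described above.  */
static void
store_double_as_int (unsigned char *slot, double value)
{
  uint64_t bits = double_bits (value);

  memcpy (slot, &bits, sizeof bits);
}

/* Read the slot back as a double.  */
static double
load_double (const unsigned char *slot)
{
  double value;

  memcpy (&value, slot, sizeof value);
  return value;
}
```

A tracker that only remembers "an integer was stored here" has to be
asked about the integer view of the slot, which is the extra lookup the
patch adds.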

2011-07-26  Jakub Jelinek  

PR debug/49846
* var-tracking.c (prepare_call_arguments): For non-MODE_INT stack
arguments also check if they aren't initialized with a MODE_INT
mode of the same size.

--- gcc/var-tracking.c.jj   2011-07-22 22:15:02.0 +0200
+++ gcc/var-tracking.c  2011-07-26 15:51:35.0 +0200
@@ -5777,6 +5777,22 @@ prepare_call_arguments (basic_block bb, 
val = cselib_lookup (mem, GET_MODE (mem), 0, VOIDmode);
if (val && cselib_preserved_value_p (val))
  item = gen_rtx_CONCAT (GET_MODE (x), copy_rtx (x), val->val_rtx);
+   else if (GET_MODE_CLASS (GET_MODE (mem)) != MODE_INT)
+ {
+   /* For non-integer stack argument see also if they weren't
+  initialized by integers.  */
+   enum machine_mode imode = int_mode_for_mode (GET_MODE (mem));
+   if (imode != GET_MODE (mem) && imode != BLKmode)
+ {
+   val = cselib_lookup (adjust_address_nv (mem, imode, 0),
+imode, 0, VOIDmode);
+   if (val && cselib_preserved_value_p (val))
+ item = gen_rtx_CONCAT (GET_MODE (x), copy_rtx (x),
+lowpart_subreg (GET_MODE (x),
+val->val_rtx,
+imode));
+ }
+ }
  }
if (item)
  call_arguments = gen_rtx_EXPR_LIST (VOIDmode, item, call_arguments);

Jakub


Re: PATCH: PR target/47372: [x32] internal compiler error: in simplify_subreg, at simplify-rtx.c:5222

2011-07-26 Thread Jakub Jelinek
On Tue, Jul 26, 2011 at 10:05:06PM +0200, Uros Bizjak wrote:
> > 2011-07-26  H.J. Lu  
> >
> >        PR target/47372
> >        * config/i386/i386.c (ix86_delegitimize_address): Call
> >        simplify_gen_subreg for PIC with ptr_mode only if modes of
> >        x and orig_x are different.
> >
> > diff --git a/gcc/config/i386/i386.c b/gcc/config/i386/i386.c
> > index 429cd62..9c52aa3 100644
> > --- a/gcc/config/i386/i386.c
> > +++ b/gcc/config/i386/i386.c
> > @@ -12967,9 +12982,10 @@ ix86_delegitimize_address (rtx x)
> >          || !MEM_P (orig_x))
> >        return ix86_delegitimize_tls_address (orig_x);
> >       x = XVECEXP (XEXP (x, 0), 0, 0);

When x is no longer known to be Pmode

> > -      if (GET_MODE (orig_x) != Pmode)
> > +      if (GET_MODE (orig_x) != GET_MODE (x)
> > +         && GET_MODE (orig_x) != ptr_mode)

why not simply just
if (GET_MODE (orig_x) != GET_MODE (x))

> >        {
> > -         x = simplify_gen_subreg (GET_MODE (orig_x), x, Pmode, 0);
> > +         x = simplify_gen_subreg (GET_MODE (orig_x), x, ptr_mode, 0);

and using GET_MODE (x) instead of Pmode/ptr_mode here?  I mean,
x is certainly not VOIDmode here, should be either SImode or DImode
and thus simplify_gen_subreg ought to work for it.

> >          if (x == NULL_RTX)
> >            return orig_x;
> >        }
> >

Jakub


[Patch, Fortran] PR 49755 - Multiple allocations.

2011-07-26 Thread Daniel Carrera
The attached patch fixes PR 49755, allowing GFortran to behave correctly 
when faced with multiple allocations:



allocate(A(20,20))
A = 42

! Allocate of already allocated variable
allocate (A(5,5), stat=stat)


The patch fixes an error in the test suite (multiple_allocation_1.f90) 
and introduces a new test for the suite (attached). The ChangeLog is 
also attached. The ChangeLog has two parts, which are set to go to 
gcc/fortran and gcc/testsuite respectively.
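
The intended behaviour can be sketched in C (chosen here only for
illustration; this is not the code gfortran generates).  The descriptor
layout and the name allocate_with_stat are hypothetical.  The point is
that the bounds are written only when the status is zero, so a failed
allocation of an already allocated array leaves the existing descriptor
untouched.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical 2-D array descriptor; not gfortran's real layout.  */
struct descriptor
{
  int *data;
  size_t rows;
  size_t cols;
};

/* Allocate ROWS x COLS elements for DESC.  Return 0 on success and a
   nonzero status otherwise.  The descriptor fields are updated only
   on success.  */
static int
allocate_with_stat (struct descriptor *desc, size_t rows, size_t cols)
{
  size_t count;
  int *data;

  if (desc->data != NULL)
    return 1;			/* Already allocated.  */
  if (cols != 0 && rows > SIZE_MAX / sizeof (int) / cols)
    return 2;			/* Size computation would overflow.  */
  count = rows * cols;
  /* Request at least one byte so a zero-sized array still gets a
     non-null pointer.  */
  data = malloc (count != 0 ? count * sizeof (int) : 1);
  if (data == NULL)
    return 3;			/* Out of memory.  */

  desc->data = data;
  desc->rows = rows;
  desc->cols = cols;
  return 0;
}
```

Calling it twice on the same descriptor, first with 20 x 20 and then with
5 x 5, returns a nonzero status the second time and leaves the 20 x 20
bounds in place.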


Ok for trunk?


Cheers,
Daniel.
--
I'm not overweight, I'm undertall.
Index: gcc/fortran/trans-array.c
===
--- gcc/fortran/trans-array.c	(revision 176622)
+++ gcc/fortran/trans-array.c	(working copy)
@@ -4146,3 +4146,3 @@ gfc_conv_descriptor_cosize (tree desc, i
 	a.stride[n] = stride;
-	size = siz >= 0 ? ubound + size : 0; //size = ubound + 1 - lbound
+	size = size >= 0 ? ubound + size : 0; //size = ubound + 1 - lbound
 	overflow += size == 0 ? 0: (MAX/size < stride ? 1: 0);
@@ -4164,4 +4164,4 @@ static tree
 gfc_array_init_size (tree descriptor, int rank, int corank, tree * poffset,
-		 gfc_expr ** lower, gfc_expr ** upper,
-		 stmtblock_t * pblock, tree * overflow)
+		 gfc_expr ** lower, gfc_expr ** upper, stmtblock_t * pblock,
+		 stmtblock_t * descriptor_block, tree * overflow)
 {
@@ -4191,3 +4191,3 @@ gfc_array_init_size (tree descriptor, in
   tmp = gfc_conv_descriptor_dtype (descriptor);
-  gfc_add_modify (pblock, tmp, gfc_get_dtype (TREE_TYPE (descriptor)));
+  gfc_add_modify (descriptor_block, tmp, gfc_get_dtype (TREE_TYPE (descriptor)));
 
@@ -4224,4 +4224,4 @@ gfc_array_init_size (tree descriptor, in
 	}
-  gfc_conv_descriptor_lbound_set (pblock, descriptor, gfc_rank_cst[n],
-  se.expr);
+  gfc_conv_descriptor_lbound_set (descriptor_block, descriptor, 
+  gfc_rank_cst[n], se.expr);
   conv_lbound = se.expr;
@@ -4240,3 +4240,3 @@ gfc_array_init_size (tree descriptor, in
 
-  gfc_conv_descriptor_ubound_set (pblock, descriptor,
+  gfc_conv_descriptor_ubound_set (descriptor_block, descriptor,
   gfc_rank_cst[n], se.expr);
@@ -4245,3 +4245,3 @@ gfc_array_init_size (tree descriptor, in
   /* Store the stride.  */
-  gfc_conv_descriptor_stride_set (pblock, descriptor,
+  gfc_conv_descriptor_stride_set (descriptor_block, descriptor,
   gfc_rank_cst[n], stride);
@@ -4305,4 +4305,4 @@ gfc_array_init_size (tree descriptor, in
 	}
-  gfc_conv_descriptor_lbound_set (pblock, descriptor, gfc_rank_cst[n],
-  se.expr);
+  gfc_conv_descriptor_lbound_set (descriptor_block, descriptor, 
+  gfc_rank_cst[n], se.expr);
 
@@ -4314,3 +4314,3 @@ gfc_array_init_size (tree descriptor, in
 	  gfc_add_block_to_block (pblock, &se.pre);
-	  gfc_conv_descriptor_ubound_set (pblock, descriptor,
+	  gfc_conv_descriptor_ubound_set (descriptor_block, descriptor,
 	  gfc_rank_cst[n], se.expr);
@@ -4397,2 +4397,4 @@ gfc_array_allocate (gfc_se * se, gfc_exp
   tree cond;
+  tree set_descriptor;
+  stmtblock_t set_descriptor_block;
   stmtblock_t elseblock;
@@ -4463,5 +4465,8 @@ gfc_array_allocate (gfc_se * se, gfc_exp
   overflow = integer_zero_node;
+
+  gfc_init_block (&set_descriptor_block);
   size = gfc_array_init_size (se->expr, ref->u.ar.as->rank,
 			  ref->u.ar.as->corank, &offset, lower, upper,
-			  &se->pre, &overflow);
+			  &se->pre, &set_descriptor_block, &overflow);
+
   if (dimension)
@@ -4493,3 +4498,3 @@ gfc_array_allocate (gfc_se * se, gfc_exp
   gfc_start_block (&elseblock);
-  
+
   /* Allocate memory to store the data.  */
@@ -4500,11 +4505,6 @@ gfc_array_allocate (gfc_se * se, gfc_exp
   if (allocatable)
-tmp = gfc_allocate_allocatable (&elseblock, pointer, size,
-status, errmsg, errlen, expr);
+gfc_allocate_allocatable (&elseblock, pointer, size,
+			  status, errmsg, errlen, expr);
   else
-tmp = gfc_allocate_using_malloc (&elseblock, size, status);
-
-  tmp = fold_build2_loc (input_location, MODIFY_EXPR, void_type_node,
-			 pointer, tmp);
-
-  gfc_add_expr_to_block (&elseblock, tmp);
+gfc_allocate_using_malloc (&elseblock, pointer, size, status);
 
@@ -4522,4 +4522,19 @@ gfc_array_allocate (gfc_se * se, gfc_exp
 
+  /* Update the array descriptors. */
   if (dimension)
-gfc_conv_descriptor_offset_set (&se->pre, se->expr, offset);
+gfc_conv_descriptor_offset_set (&set_descriptor_block, se->expr, offset);
+  
+  set_descriptor = gfc_finish_block (&set_descriptor_block);
+  if (status != NULL_TREE)
+{
+  cond = fold_build2_loc (input_location, EQ_EXPR,
+			  boolean_type_node, status,
+			  build_int_cst (TREE_TYPE (status), 0));
+  gfc_add_expr_to_block (&se->pre,
+		 fold_build3_loc (input_location, COND_EXPR, void_type_node,
+  gfc_likely (cond), set_descriptor,
+  build_empty_stmt (input_location))); 
+}
+  else
+  gfc_add_expr_to_block (&se->pre, set_descriptor);
 
Index: gcc/fortran/t

Re: PATCH: PR target/47372: [x32] internal compiler error: in simplify_subreg, at simplify-rtx.c:5222

2011-07-26 Thread Uros Bizjak
On Tue, Jul 26, 2011 at 9:41 PM, H.J. Lu  wrote:

> We should call simplify_gen_subreg for PIC with ptr_mode only if modes
> of x and orig_x are different.  OK for trunk?

Let's ask Jakub on this one...

Uros.

> 2011-07-26  H.J. Lu  
>
>        PR target/47372
>        * config/i386/i386.c (ix86_delegitimize_address): Call
>        simplify_gen_subreg for PIC with ptr_mode only if modes of
>        x and orig_x are different.
>
> diff --git a/gcc/config/i386/i386.c b/gcc/config/i386/i386.c
> index 429cd62..9c52aa3 100644
> --- a/gcc/config/i386/i386.c
> +++ b/gcc/config/i386/i386.c
> @@ -12967,9 +12982,10 @@ ix86_delegitimize_address (rtx x)
>          || !MEM_P (orig_x))
>        return ix86_delegitimize_tls_address (orig_x);
>       x = XVECEXP (XEXP (x, 0), 0, 0);
> -      if (GET_MODE (orig_x) != Pmode)
> +      if (GET_MODE (orig_x) != GET_MODE (x)
> +         && GET_MODE (orig_x) != ptr_mode)
>        {
> -         x = simplify_gen_subreg (GET_MODE (orig_x), x, Pmode, 0);
> +         x = simplify_gen_subreg (GET_MODE (orig_x), x, ptr_mode, 0);
>          if (x == NULL_RTX)
>            return orig_x;
>        }
>


[patch] arm,rx: don't ICE on naked functions with local vars

2011-07-26 Thread DJ Delorie

This patch tests for at least one user-caused reason for this
assertion failing - requiring a local frame in a naked function.  For
this case at least, it would be better to trigger an error than to
ICE.  OK?

static int bar;
void __attribute__((naked)) function(void) {
   int foo, result;
   result = subFunction(&foo, &bar);   // ICE here
}

* expr.c (expand_expr_addr_expr_1): Detect a user request for
a local frame in a naked function, and produce a suitable
error for that specific case.

Index: expr.c
===
--- expr.c  (revision 176766)
+++ expr.c  (working copy)
@@ -6943,13 +6943,22 @@ expand_expr_addr_expr_1 (tree exp, rtx t
modifier == EXPAND_INITIALIZER
? EXPAND_INITIALIZER : EXPAND_CONST_ADDRESS);
 
  /* If the DECL isn't in memory, then the DECL wasn't properly
 marked TREE_ADDRESSABLE, which will be either a front-end
 or a tree optimizer bug.  */
- gcc_assert (MEM_P (result));
+
+ if (TREE_ADDRESSABLE (exp)
+ && ! MEM_P (result)
+ && ! targetm.calls.allocate_stack_slots_for_args())
+   {
+ error ("local frame unavailable (naked function?)");
+ return result;
+   }
+ else
+   gcc_assert (MEM_P (result));
  result = XEXP (result, 0);
 
  /* ??? Is this needed anymore?  */
  if (DECL_P (exp) && !TREE_USED (exp) == 0)
{
  assemble_external (exp);


[google] Fix lipo regression test failures after merge from trunk (issue4806053)

2011-07-26 Thread David Li
The patch is committed to google/main to fix lipo test regressions after trunk 
merge.

2011-07-26  David Li  

* value-prof.c (gimple_value_profile_transformations): Remove redundant 
code.
* cgraphunit.c (cgraph_mark_functions_to_output): Fix assertion in lipo 
mode.
* ipa-inline.c (early_inliner): Check fake edge.
* l-ipo.c (pop_module_scope): Process alias node.
(cgraph_unify_type_alias_sets): Skip empty function.
* testsuite/gcc.dg/tree-prof/lipo/val-prof-2_0.c: Update test case.

Index: value-prof.c
===
--- value-prof.c(revision 176763)
+++ value-prof.c(working copy)
@@ -613,18 +613,7 @@ gimple_value_profile_transformations (vo
 }
 
   if (changed)
-{
-  counts_to_freqs ();
-  /* Value profile transformations may change inline parameters
- a lot (e.g., indirect call promotion introduces new direct calls).
- The update is also needed to avoid compiler ICE -- when MULTI
- target icall promotion happens, the caller's size may become
- negative when the promoted direct calls get promoted.  */
-  /* Guard this for LIPO for now.  */
-  if (L_IPO_COMP_MODE)
-compute_inline_parameters (cgraph_get_node (current_function_decl),
-  false);
-}
+counts_to_freqs ();
 
   return changed;
 }
Index: cgraphunit.c
===
--- cgraphunit.c(revision 176763)
+++ cgraphunit.c(working copy)
@@ -1531,7 +1531,8 @@ cgraph_mark_functions_to_output (void)
  gcc_assert (node->global.inlined_to
  || !gimple_has_body_p (decl)
  || node->in_other_partition
- || DECL_EXTERNAL (decl));
+ || DECL_EXTERNAL (decl)
+  || cgraph_is_auxiliary (node->decl));
 
}
 
Index: testsuite/gcc.dg/tree-prof/lipo/val-prof-2_0.c
===
--- testsuite/gcc.dg/tree-prof/lipo/val-prof-2_0.c  (revision 176763)
+++ testsuite/gcc.dg/tree-prof/lipo/val-prof-2_0.c  (working copy)
@@ -26,7 +26,7 @@ main ()
 /* { dg-final-use { scan-ipa-dump "Mod power of 2 transformation on insn" 
"profile" } } */
 /* This is part of code checking that n is power of 2, so we are sure that the 
transformation
didn't get optimized out.  */
-/* { dg-final-use { scan-tree-dump "n_\[0-9\]* \\+ 0x" "optimized"} } */
+/* { dg-final-use { scan-tree-dump "n_\[0-9\]* \\+ (4294967295|0x0*)" 
"optimized"} } */
 /* { dg-final-use { scan-tree-dump-not "Invalid sum" "optimized"} } */
 /* { dg-final-use { cleanup-tree-dump "optimized" } } */
 /* { dg-final-use { cleanup-ipa-dump "profile" } } */
Index: ipa-inline.c
===
--- ipa-inline.c(revision 176763)
+++ ipa-inline.c(working copy)
@@ -2008,6 +2008,9 @@ early_inliner (void)
  for (edge = node->callees; edge; edge = edge->next_callee)
{
  struct inline_edge_summary *es = inline_edge_summary (edge);
+
+ if (!edge->call_stmt)
+   continue;
  es->call_stmt_size
= estimate_num_insns (edge->call_stmt, &eni_size_weights);
  es->call_stmt_time
Index: l-ipo.c
===
--- l-ipo.c (revision 176763)
+++ l-ipo.c (working copy)
@@ -390,6 +390,7 @@ pop_module_scope (void)
 primary_module_last_loc = input_location;
 
   at_eof = 1;
+  cgraph_process_same_body_aliases ();
   lang_hooks.l_ipo.process_pending_decls (input_location);
   lang_hooks.l_ipo.clear_deferred_fns ();
   at_eof = 0;
@@ -1067,7 +1068,8 @@ cgraph_unify_type_alias_sets (void)
 {
   push_cfun (DECL_STRUCT_FUNCTION (node->decl));
   current_function_decl = node->decl;
-  cgraph_collect_type_referenced ();
+  if (gimple_has_body_p (current_function_decl))
+cgraph_collect_type_referenced ();
   current_function_decl = NULL;
   pop_cfun ();
 }

--
This patch is available for review at http://codereview.appspot.com/4806053


PATCH: PR target/47372: [x32] internal compiler error: in simplify_subreg, at simplify-rtx.c:5222

2011-07-26 Thread H.J. Lu
Hi,

We should call simplify_gen_subreg for PIC with ptr_mode only if modes
of x and orig_x are different.  OK for trunk?

Thanks.


H.J.
---
2011-07-26  H.J. Lu  

PR target/47372
* config/i386/i386.c (ix86_delegitimize_address): Call
simplify_gen_subreg for PIC with ptr_mode only if modes of
x and orig_x are different.

diff --git a/gcc/config/i386/i386.c b/gcc/config/i386/i386.c
index 429cd62..9c52aa3 100644
--- a/gcc/config/i386/i386.c
+++ b/gcc/config/i386/i386.c
@@ -12967,9 +12982,10 @@ ix86_delegitimize_address (rtx x)
  || !MEM_P (orig_x))
return ix86_delegitimize_tls_address (orig_x);
   x = XVECEXP (XEXP (x, 0), 0, 0);
-  if (GET_MODE (orig_x) != Pmode)
+  if (GET_MODE (orig_x) != GET_MODE (x) 
+ && GET_MODE (orig_x) != ptr_mode)
{
- x = simplify_gen_subreg (GET_MODE (orig_x), x, Pmode, 0);
+ x = simplify_gen_subreg (GET_MODE (orig_x), x, ptr_mode, 0);
  if (x == NULL_RTX)
return orig_x;
}


[patch] RX: don't renumber interrupt registers

2011-07-26 Thread DJ Delorie

If a function is both a leaf function and an interrupt function, leaf
register renumbering causes the wrong set of registers to be saved.
This patch disables renumbering for interrupt functions.

* config/rx/rx.c (rx_leaf_registers): New.
(rx_set_leaf_registers): New.
(rx_expand_prologue): Call it.
* config/rx/rx.h (LEAF_REGISTERS): Define.
(LEAF_REG_REMAP): Define.

Index: gcc/config/rx/rx.c
===
--- gcc/config/rx/rx.c  (revision 176766)
+++ gcc/config/rx/rx.c  (working copy)
@@ -985,12 +985,24 @@ is_naked_func (const_tree decl)
 {
   return has_func_attr (decl, "naked");
 }
 
 static bool use_fixed_regs = false;
 
+char rx_leaf_registers [FIRST_PSEUDO_REGISTER];
+
+static void
+rx_set_leaf_registers (int enable)
+{
+  int i;
+
+  if (rx_leaf_registers[0] != enable)
+for (i = 0; i < FIRST_PSEUDO_REGISTER; i++)
+  rx_leaf_registers[i] = enable;
+}
+
 static void
 rx_conditional_register_usage (void)
 {
   static bool using_fixed_regs = false;
 
   if (rx_small_data_limit > 0)
@@ -1380,12 +1392,17 @@ rx_expand_prologue (void)
   rtx insn;
 
   /* Naked functions use their own, programmer provided prologues.  */
   if (is_naked_func (NULL_TREE))
 return;
 
+  /* We must not allow register renaming in interrupt functions,
+ because that invalidates the correctness of the set of call-used
+ registers we're going to save/restore.  */
+  rx_set_leaf_registers (is_interrupt_func (NULL_TREE) ? 0 : 1);
+
   rx_get_stack_layout (& low, & high, & mask, & frame_size, & stack_size);
 
   /* If we use any of the callee-saved registers, save them now.  */
   if (mask)
 {
   /* Push registers in reverse order.  */
Index: gcc/config/rx/rx.h
===
--- gcc/config/rx/rx.h  (revision 176766)
+++ gcc/config/rx/rx.h  (working copy)
@@ -260,12 +260,17 @@ enum reg_class
 /* Order of allocation of registers.  */
 
 #define REG_ALLOC_ORDER\
 {  7,  10,  11,  12,  13,  14,  4,  3,  2,  1, 9, 8, 6, 5, 15  \
 }
 
+/* We must somehow disable register remapping for interrupt functions.  */
+extern char rx_leaf_registers[];
+#define LEAF_REGISTERS rx_leaf_registers
+#define LEAF_REG_REMAP(REG) (REG)
+
 #define REGNO_IN_RANGE(REGNO, MIN, MAX)\
   (IN_RANGE ((REGNO), (MIN), (MAX))\
|| (reg_renumber != NULL\
&& reg_renumber[(REGNO)] >= (MIN)   \
&& reg_renumber[(REGNO)] <= (MAX)))
 


Re: [C++0x] contiguous bitfields race implementation

2011-07-26 Thread Aldy Hernandez

On 07/25/11 18:55, Jason Merrill wrote:

On 07/25/2011 10:07 AM, Aldy Hernandez wrote:

I had changed this already to take into account aliasing, so if we get
an INDIRECT_REF, ptr_deref_may_alias_global_p() returns true, and we
proceed with the restriction:


Sounds good. "global" includes malloc'd memory, right? There don't seem
to be any tests for that.


Is the attached test appropriate?
/* { dg-do compile { target i?86-*-* x86_64-*-* } } */
/* { dg-options "-O2 --param allow-store-data-races=0" } */

#include 

struct bits
{
  char a;
  int b:7;
  int c:9;
  unsigned char d;
} x;

struct bits *p;

static void allocit()
{
  p = (struct bits *) malloc (sizeof (struct bits));
}

void foo()
{
  allocit();
  p -> c = 55;
}

/* { dg-final { scan-assembler-not "movl\t\\(" } } */


Re: [Patch,AVR]: Fix PR29560 (map 16-bit shift to 8-bit)

2011-07-26 Thread Joseph S. Myers
On Tue, 26 Jul 2011, Georg-Johann Lay wrote:

> I once ran into trouble because there seems to be no clear
> separation between undefinedness in C and undefinedness in RTL
> 
> Starting thread from here,
>   http://gcc.gnu.org/ml/gcc-help/2010-06/msg00191.html
> 
> the thread comes to this
>   http://gcc.gnu.org/ml/gcc-help/2010-06/msg00198.html
> 
> which includes a snip that shows that some RTL passes
> optimize on assumptions of undefinedness of C.
> I.e. undefinedness is propagated from C through SSA until RTL.

That seems like a bug.  Flags such as flag_wrapv relate to semantics of C and 
GIMPLE, not to RTL; (abs) RTL should always be wrapping.  (For shifts, 
SHIFT_COUNT_TRUNCATED describes the semantics of RTL.  For clz and ctz, 
CLZ_DEFINED_VALUE_AT_ZERO and CTZ_DEFINED_VALUE_AT_ZERO describe the 
semantics.  All these target macros only describe RTL, not C source or 
GIMPLE.  PR 30484 discusses how one might fix INT_MIN / -1 and INT_MIN % 
-1 for -fwrapv.)
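
As a source-level illustration of what "wrapping" means here, the sketch
below computes abs and addition with wrapping semantics in portable C by
going through unsigned arithmetic.  It says nothing about how GCC's RTL
passes are implemented; wrapping_abs and wrapping_add are hypothetical
names.

```c
#include <limits.h>

/* Absolute value with wrapping semantics.  The negation is done in
   unsigned arithmetic, which is defined to wrap modulo 2^N, so
   INT_MIN does not invoke undefined behaviour.  The result is
   returned as unsigned because the magnitude of INT_MIN does not fit
   in int.  */
static unsigned int
wrapping_abs (int x)
{
  unsigned int ux = (unsigned int) x;

  return x < 0 ? 0u - ux : ux;
}

/* Addition with wrapping semantics.  Converting an out-of-range
   unsigned value back to int is implementation-defined in C; GCC
   documents it as reduction modulo 2^N.  */
static int
wrapping_add (int a, int b)
{
  return (int) ((unsigned int) a + (unsigned int) b);
}
```

By contrast, plain signed `-x` or `a + b` on overflowing operands is
undefined in C unless -fwrapv is in effect.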

-- 
Joseph S. Myers
jos...@codesourcery.com


Re: [testsuite] Provide and use mmap effective-target keyword

2011-07-26 Thread Ulrich Weigand
Rainer Orth wrote:

> sure, that's one of the reasons to centralize the mmap test.  Simply
> replacing the body of the proc with
> 
> return [check_function_available "mmap"]
> 
> should work.  Could you give it a try?

Yes, this does work for me.

Thanks,
Ulrich


ChangeLog:

* lib/target-supports.exp (check_effective_target_mmap): Use
check_function_available.

Index: gcc/testsuite/lib/target-supports.exp
===
*** gcc/testsuite/lib/target-supports.exp   (revision 176798)
--- gcc/testsuite/lib/target-supports.exp   (working copy)
*** proc check_effective_target_fopenmp {} {
*** 700,708 
  # Return 1 if the target supports mmap, 0 otherwise.
  
  proc check_effective_target_mmap {} {
! return [check_no_compiler_messages mmap assembly {
!   #include 
! }]
  }
  
  # Return 1 if compilation with -pthread is error-free for trivial
--- 700,706 
  # Return 1 if the target supports mmap, 0 otherwise.
  
  proc check_effective_target_mmap {} {
! return [check_function_available "mmap"]
  }
  
  # Return 1 if compilation with -pthread is error-free for trivial


-- 
  Dr. Ulrich Weigand
  GNU Toolchain for Linux on System z and Cell BE
  ulrich.weig...@de.ibm.com


[lra] patch to decrease code size degradation for ARM

2011-07-26 Thread Vladimir Makarov
The following patch decreases existing code size degradation on ARM by 
permitting a special case of early clobber without reloads, for example, 
for pattern


;; Use `&' and then `0' to prevent the operands 0 and 1 being the same
(define_insn "*arm_mulsi3"
  [(set (match_operand:SI  0 "s_register_operand" "=&r,&r")
(mult:SI (match_operand:SI 2 "s_register_operand" "r,r")
 (match_operand:SI 1 "s_register_operand" "%0,r")))]
  "TARGET_32BIT && !arm_arch6"
  "mul%?\\t%0, %2, %1"
  [(set_attr "insn" "mul")
   (set_attr "predicable" "yes")]
)

It required completely rewriting the code dealing with early clobbers in LRA.

The patch was successfully bootstrapped on x86-64, ia64, and ppc.

2011-07-26  Vladimir Makarov 

* lra-assigns.c (setup_live_pseudos_and_spill_after_equiv_moves):
Remove the dead code.

* lra-constraints.c (goal_alternative*): Rename to goal_alt...
(goal_early_clobbered_nops_num, goal_early_clobbered_nops):
Remove.
(goal_alt_dont_inherit_ops_num, goal_alt_dont_inherit_ops): New
static variables.
(uses_hard_regs_p): New function.
(process_alt_operands): Rename curr_alternative to curr_alt...
Rename curr_early_clobbered_regs[_num] to
early_clobbered_regs[_num].  Add code for matching with early
clobber.  Add code to process and evaluate conflicts with early
clobbers.  Set up goal_alt_dont_inherit_ops_num,
goal_alt_dont_inherit_ops.
(early_clobber_reload_regs_num, early_clobber_reload_regs):
Remove.
(make_early_clobber_input_reload_reg, search_and_replace_reg): 
Ditto.

(create_early_clobber_reloads): Ditto.
(curr_insn_transform): Rename goal_alternative_matched to
goal_alt_matched.  Setup dont_inherit flag for early clobber
reload pseudo.  Don't call create_early_clobber_reloads.


Index: lra-assigns.c
===
--- lra-assigns.c   (revision 175931)
+++ lra-assigns.c   (working copy)
@@ -869,8 +869,6 @@
   curr_regno >= 0;
   curr_regno = lra_reg_info[curr_regno].next)
{
- IOR_HARD_REG_SET (conflict_set,
-   lra_reg_info[curr_regno].conflict_hard_regs);
  if (GET_MODE_SIZE (mode)
  < GET_MODE_SIZE (lra_reg_info[curr_regno].biggest_mode))
mode = lra_reg_info[curr_regno].biggest_mode;
Index: lra-constraints.c
===
--- lra-constraints.c   (revision 176001)
+++ lra-constraints.c   (working copy)
@@ -802,7 +802,7 @@
   outmode = lra_get_mode (curr_static_id->operand[out].mode, out_rtx);
   if (inmode != outmode)
 {
-  /* Don't reuse the pseudos for inheritance -- they will bound.  */
+  /* Don't reuse the pseudos for inheritance -- they will be bound.  */
   get_reload_reg (OP_IN, inmode, in_rtx, goal_class, "", &new_in_reg);
   new_out_reg = lra_create_new_reg (outmode, out_rtx, goal_class, "");
 }
@@ -1060,26 +1060,24 @@
 
 /* The chosen reg class which should be used for the corresponding
operands.  */
-static enum reg_class goal_alternative[MAX_RECOG_OPERANDS];
+static enum reg_class goal_alt[MAX_RECOG_OPERANDS];
 /* True if the operand should be the same as another operand and the
another operand does not need a reload.  */
-static bool goal_alternative_match_win[MAX_RECOG_OPERANDS];
+static bool goal_alt_match_win[MAX_RECOG_OPERANDS];
 /* True if the operand does not need a reload.  */
-static bool goal_alternative_win[MAX_RECOG_OPERANDS];
+static bool goal_alt_win[MAX_RECOG_OPERANDS];
 /* True if the operand can be offsetable memory.  */
-static bool goal_alternative_offmemok[MAX_RECOG_OPERANDS];
+static bool goal_alt_offmemok[MAX_RECOG_OPERANDS];
 /* The number of operand to which given operand can be matched to.  */
-static int goal_alternative_matches[MAX_RECOG_OPERANDS];
+static int goal_alt_matches[MAX_RECOG_OPERANDS];
+/* The number of elements in the following array.  */
+static int goal_alt_dont_inherit_ops_num;
+/* Numbers of operands whose reload pseudos should not be inherited.  */
+static int goal_alt_dont_inherit_ops[MAX_RECOG_OPERANDS];
 /* True if the insn commutative operands should be swapped.  */
-static bool goal_alternative_swapped;
-/* The number of elements in the following two arrays.  */
-static int goal_early_clobbered_nops_num;
-/* Numbers of operands which are early clobbered registers.  */
-static int goal_early_clobbered_nops[MAX_RECOG_OPERANDS];
-/* Biggest of mode of the early clobbered registers.  */
-static enum machine_mode goal_early_clobbered_modes[MAX_RECOG_OPERANDS];
+static bool goal_alt_swapped;
 /* The chosen insn alternative.  */
-static int goal_alternative_number;
+static int goal_alt_number;
 
 /* The following four variables are used to choose the best insn
alternative.  They reflect finally characteristics o

Re: [Patch,AVR]: Fix PR29560 (map 16-bit shift to 8-bit)

2011-07-26 Thread Georg-Johann Lay
Richard Henderson wrote:
> On 07/26/2011 10:26 AM, Georg-Johann Lay wrote:
>> If -mint8 (word_mode = QImode) ever returns resp. is turned
>> functional again, then the QI version is undefined for
>> offsets >= 8  whereas the HI version is only undefined for
>> offsets >= 16.
> 
> It's undefined at the C level, not necessarily at the rtl level.
> 
> 
> r~

I once ran into trouble because there seems to be no clear
separation between undefinedness in C and undefinedness in RTL

Starting thread from here,
  http://gcc.gnu.org/ml/gcc-help/2010-06/msg00191.html

the thread comes to this
  http://gcc.gnu.org/ml/gcc-help/2010-06/msg00198.html

which includes a snip that shows that some RTL passes
optimize on assumptions of undefinedness of C.
I.e. undefinedness is propagated from C through SSA until RTL.

In the particular case, an undefined (because of strict
overflow) abs in C was propagated to an undefined ABS in
RTL.

I regarded RTL rather as a high-level representation
of assembly than a low-level representation of C.

As far as I understand GCC, a front end emits its code
like generic or gimple or whatever on the basis of input
language semantics, and behind that point there is a clear
semantics what, e.g.
  (set (reg) (plus (reg) (const_int 1)))
means.
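
If that RTL is read with wrapping semantics (an assumption for this
sketch; rtl_plus_one is a made-up name, not a GCC function), its effect
on a 32-bit register could be modeled in C roughly as follows:

```c
#include <assert.h>   /* only needed if the checks in the note below are used */
#include <stdint.h>

/* Model of (plus (reg) (const_int 1)) on a 32-bit register with
   wrapping semantics.  Illustration only.  */
int32_t
rtl_plus_one (int32_t reg)
{
  /* Add in uint32_t, where overflow wraps by definition.  */
  uint32_t sum = (uint32_t) reg + UINT32_C (1);

  /* Implementation-defined conversion; GCC reduces modulo 2^32.  */
  return (int32_t) sum;
}
```

Under this model rtl_plus_one (INT32_MAX) is INT32_MIN, whereas the C
expression INT32_MAX + 1 on a signed int has no defined result.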

Johann






[pph] Save pending and specialized templates (issue4814054)

2011-07-26 Thread Lawrence Crowl
This patch saves and restores three template side tables for PPH.  They are
the pending template instantiations list, the template decl specializations
table, and the template type specializations table.

This patch fixes PPH test x1tmplclass.

In the process, I made lto_output_location and lto_input_location extern in
lto-streamer* and factored cp_debug_parser_tokens out of cp_debug_parser in
cp/parser.c.

Tested on x64.


Index: gcc/testsuite/ChangeLog.pph

2011-07-25  Lawrence Crowl  

* g++.dg/pph/x1tmplclass.cc: Remove expected failure.

Index: gcc/cp/ChangeLog.pph

2011-07-25   Lawrence Crowl  

* pph.c (pph_dump_tree_name): Make extern.
* pph.h (pph_dump_tree_name): Make extern.
* pph-streamer-in.c (pph_read_file_contents): Load pending templates
list and tables of decl and type template specializations.
* pph-streamer.c (enum pph_trace_type): Add PPH_TRACE_LOCATION
enumeration.
(pph_trace): Add trace for locations.
(pph_trace_location): New.
* pph-streamer.h (pph_trace_location): New.
(pph_out_pending_templates_list): New.
(pph_in_pending_templates_list): New.
(pph_out_spec_entry_tables): New.
(pph_in_spec_entry_tables): New.
(pph_out_location): New.
(pph_in_location): New.
* pph-streamer-out.c (pph_write_file_contents): Save template pending
instantiation list, decl specialization table, and type specialization
table.
* pt.c (pph_out_tinst_level): New.
(pph_dump_tinst_level): New.
(pph_in_tinst_level): New.
(pph_dump_pending_templates_list): New.
(pph_out_pending_templates_list): New.
(pph_in_pending_templates_list): New.
(pph_out_spec_entry_slot): New.
(pph_out_spec_entry_htab): New.
(pph_dump_spec_entry_slot): New.
(pph_dump_spec_entry_htab): New.
(pph_in_spec_entry_htab): New.
(pph_out_spec_entry_tables): New.
(pph_in_spec_entry_tables): New.
* parser.c (cp_debug_parser): Factor out printing of token list.
(cp_debug_parser_tokens): New.

Index: gcc/ChangeLog.pph

2011-07-25  Lawrence Crowl  

* lto-streamer-out.c (lto_output_location): Make external.
* lto-streamer-in.c (lto_input_location): Make external.
* lto-streamer.h (lto_output_location): Make external.
(lto_input_location): Make external.


Index: gcc/testsuite/g++.dg/pph/x1tmplclass.cc
===
--- gcc/testsuite/g++.dg/pph/x1tmplclass.cc (revision 176778)
+++ gcc/testsuite/g++.dg/pph/x1tmplclass.cc (working copy)
@@ -1,5 +1,3 @@
-// { dg-xfail-if "BOGUS" { "*-*-*" } { "-fpph-map=pph.map" } }
-// { dg-bogus "x0tmplclass.h:14:5: error: specializing member 
.wrapper::cache. requires .template<>. syntax" "" { xfail *-*-* } 0 }
 
 #include "x0tmplclass.h"
 
Index: gcc/cp/pph.c
===
--- gcc/cp/pph.c(revision 176778)
+++ gcc/cp/pph.c(working copy)
@@ -46,7 +46,7 @@ FILE *pph_logfile = NULL;
 /* Dump a complicated name for tree T to FILE using FLAGS.
See TDF_* in tree-pass.h for flags.  */
 
-static void
+void
 pph_dump_tree_name (FILE *file, tree t, int flags)
 {
   enum tree_code code = TREE_CODE (t);
Index: gcc/cp/pph.h
===
--- gcc/cp/pph.h(revision 176778)
+++ gcc/cp/pph.h(working copy)
@@ -52,6 +52,7 @@ extern FILE *pph_logfile;
 /* In pph.c  */
 extern void pph_init (void);
 extern void pph_finish (void);
+extern void pph_dump_tree_name (FILE *file, tree t, int flags);
 extern void pph_dump_namespace (FILE *, tree ns);
 
 #endif  /* GCC_CP_PPH_H  */
Index: gcc/cp/pph-streamer-in.c
===
--- gcc/cp/pph-streamer-in.c(revision 176778)
+++ gcc/cp/pph-streamer-in.c(working copy)
@@ -1448,6 +1448,9 @@ pph_read_file_contents (pph_stream *stre
   FOR_EACH_VEC_ELT (tree, file_unemitted_tinfo_decls, i, t)
 VEC_safe_push (tree, gc, unemitted_tinfo_decls, t);
 
+  pph_in_pending_templates_list (stream);
+  pph_in_spec_entry_tables (stream);
+
   file_static_aggregates = pph_in_tree (stream);
   static_aggregates = chainon (file_static_aggregates, static_aggregates);
 
Index: gcc/cp/pt.c
===
--- gcc/cp/pt.c (revision 176778)
+++ gcc/cp/pt.c (working copy)
@@ -45,6 +45,8 @@ along with GCC; see the file COPYING3.  
 #include "timevar.h"
 #include "tree-iterator.h"
 #include "vecprim.h"
+#include "pph.h"
+#include "pph-streamer.h"
 
 /* The type of functions taking a tree, and some additional data, and
returning an int.  */
@@ -19681,4 +19683,253 @@ print_template_statistics (void)
   htab_collisions (type_specializations));
 }
 
+
+/* PPH write/read */
+
+
+/* Emit a tinst_level li

Re: [PATCH] Fix PR48648: Handle CLAST assignments.

2011-07-26 Thread Sebastian Pop
On Sat, Jul 23, 2011 at 04:59, Richard Guenther
 wrote:
> On Sat, Jul 23, 2011 at 1:01 AM, Sebastian Pop  wrote:
>> The CLAST produced by CLooG-ISL contains an assignment and GCC chokes
>> on it.  The exact CLAST contains an assignment followed by an if:
>>
>> scat_1 = max(0,ceild(T_4-7,8));
>> if (scat_1 <= min(1,floord(T_4-1,8))) {
>>  S7(scat_1);
>> }
>>
>> This is equivalent to a loop that iterates only once, and so CLooG
>> generates an assignment followed by an if instead of a loop.  This is
>> an important optimization that was improved in ISL, that allows
>> if-conversion: imagine GCC having to figure out that a loop like the
>> following actually iterates only once, and can be converted to an if:
>>
>> for (scat_1 = max(0,ceild(T_4-7,8)); scat_1 <= min(1,floord(T_4-1,8)); 
>> scat_1++)
>>  S7(scat_1);
>>
>> This patch implements the translation of CLAST assignments.
>> Bootstrapped and tested on amd64-linux.
>
> Ok if Tobias is fine with it.

Tobias, could you please have a look at this patch?

Thanks,
Sebastian
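
The equivalence described in the quoted message below can be checked
with a small stand-alone sketch.  The helper names (run_as_loop,
run_as_guard, imax, imin) are invented for this illustration, and
ceild/floord are assumed to be ceiling and floor division by a positive
divisor; none of this is Graphite or CLooG code:

```c
#include <assert.h>

/* Ceiling and floor division for a positive divisor D.  */
int
ceild (int n, int d)
{
  assert (d > 0);
  return n >= 0 ? (n + d - 1) / d : -((-n) / d);
}

int
floord (int n, int d)
{
  assert (d > 0);
  return n >= 0 ? n / d : -((-n + d - 1) / d);
}

static int imax (int a, int b) { return a > b ? a : b; }
static int imin (int a, int b) { return a < b ? a : b; }

/* Loop form.  Returns how often the body ran; *LAST receives the
   last scat_1 value, standing in for the call S7 (scat_1).  */
int
run_as_loop (int t_4, int *last)
{
  int count = 0;
  for (int scat_1 = imax (0, ceild (t_4 - 7, 8));
       scat_1 <= imin (1, floord (t_4 - 1, 8)); scat_1++)
    {
      *last = scat_1;
      count++;
    }
  return count;
}

/* Assignment followed by an if, as in the CLAST from CLooG-ISL.  */
int
run_as_guard (int t_4, int *last)
{
  int count = 0;
  int scat_1 = imax (0, ceild (t_4 - 7, 8));
  if (scat_1 <= imin (1, floord (t_4 - 1, 8)))
    {
      *last = scat_1;
      count++;
    }
  return count;
}
```

For any T_4 the two forms run the body the same number of times (zero
or one) with the same scat_1, because T_4 <= 7 and T_4 >= 9 cannot both
hold, so the loop bounds never admit both 0 and 1.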

>
> Thanks,
> Richard.
>
>> Sebastian
>>
>> 2011-07-22  Sebastian Pop  
>>
>>        PR middle-end/48648
>>        * graphite-clast-to-gimple.c (clast_get_body_of_loop): Handle
>>        CLAST assignments.
>>        (translate_clast): Same.
>>        (translate_clast_assignment): New.
>>
>>        * gcc.dg/graphite/id-pr48648.c: New.
>> ---
>>  gcc/ChangeLog                              |    8 
>>  gcc/graphite-clast-to-gimple.c             |   49 
>> 
>>  gcc/testsuite/ChangeLog                    |    5 +++
>>  gcc/testsuite/gcc.dg/graphite/id-pr48648.c |   21 
>>  4 files changed, 83 insertions(+), 0 deletions(-)
>>  create mode 100644 gcc/testsuite/gcc.dg/graphite/id-pr48648.c
>>
>> diff --git a/gcc/ChangeLog b/gcc/ChangeLog
>> index 9cfa21b..303c9c9 100644
>> --- a/gcc/ChangeLog
>> +++ b/gcc/ChangeLog
>> @@ -1,3 +1,11 @@
>> +2011-07-22  Sebastian Pop  
>> +
>> +       PR middle-end/48648
>> +       * graphite-clast-to-gimple.c (clast_get_body_of_loop): Handle
>> +       CLAST assignments.
>> +       (translate_clast): Same.
>> +       (translate_clast_assignment): New.
>> +
>>  2011-07-21  Sebastian Pop  
>>
>>        PR middle-end/47654
>> diff --git a/gcc/graphite-clast-to-gimple.c b/gcc/graphite-clast-to-gimple.c
>> index ddf6d3d..a4668d3 100644
>> --- a/gcc/graphite-clast-to-gimple.c
>> +++ b/gcc/graphite-clast-to-gimple.c
>> @@ -812,6 +812,9 @@ clast_get_body_of_loop (struct clast_stmt *stmt)
>>   if (CLAST_STMT_IS_A (stmt, stmt_block))
>>     return clast_get_body_of_loop (((struct clast_block *) stmt)->body);
>>
>> +  if (CLAST_STMT_IS_A (stmt, stmt_ass))
>> +    return clast_get_body_of_loop (stmt->next);
>> +
>>   gcc_unreachable ();
>>  }
>>
>> @@ -1121,6 +1124,48 @@ translate_clast_for (loop_p context_loop, struct 
>> clast_for *stmt, edge next_e,
>>   return last_e;
>>  }
>>
>> +/* Translates a clast assignment STMT to gimple.
>> +
>> +   - NEXT_E is the edge where new generated code should be attached.
>> +   - BB_PBB_MAPPING is is a basic_block and it's related poly_bb_p mapping. 
>>  */
>> +
>> +static edge
>> +translate_clast_assignment (struct clast_assignment *stmt, edge next_e,
>> +                           int level, ivs_params_p ip)
>> +{
>> +  gimple_seq stmts;
>> +  mpz_t v1, v2;
>> +  tree type, new_name, var;
>> +  edge res = single_succ_edge (split_edge (next_e));
>> +  struct clast_expr *expr = (struct clast_expr *) stmt->RHS;
>> +  struct clast_user_stmt *body
>> +    = clast_get_body_of_loop ((struct clast_stmt *) stmt);
>> +  poly_bb_p pbb = (poly_bb_p) cloog_statement_usr (body->statement);
>> +
>> +  mpz_init (v1);
>> +  mpz_init (v2);
>> +  type = type_for_clast_expr (expr, ip, v1, v2);
>> +  var = create_tmp_var (type, "graphite_var");
>> +  new_name = force_gimple_operand (clast_to_gcc_expression (type, expr, ip),
>> +                                  &stmts, true, var);
>> +  add_referenced_var (var);
>> +  if (stmts)
>> +    {
>> +      gsi_insert_seq_on_edge (next_e, stmts);
>> +      gsi_commit_edge_inserts ();
>> +    }
>> +
>> +  compute_bounds_for_level (pbb, level, v1, v2);
>> +  save_clast_name_index (ip->newivs_index, stmt->LHS,
>> +                        VEC_length (tree, *(ip->newivs)), level, v1, v2);
>> +  VEC_safe_push (tree, heap, *(ip->newivs), new_name);
>> +
>> +  mpz_clear (v1);
>> +  mpz_clear (v2);
>> +
>> +  return res;
>> +}
>> +
>>  /* Translates a clast guard statement STMT to gimple.
>>
>>    - NEXT_E is the edge where new generated code should be attached.
>> @@ -1171,6 +1216,10 @@ translate_clast (loop_p context_loop, struct 
>> clast_stmt *stmt, edge next_e,
>>   else if (CLAST_STMT_IS_A (stmt, stmt_block))
>>     next_e = translate_clast (context_loop, ((struct clast_block *) 
>> stmt)->body,
>>                              next_e, bb_pbb_mapping, level, ip);
>> +
>> +  else if (CLAST_STMT_IS_A (stmt, stmt_ass))
>> +    next_e = translate_clast_assignment ((struct clast_ass

Re: [PATCH 0/3] Move Graphite to CLooG 0.16.3 with isl backend.

2011-07-26 Thread Sebastian Pop
On Thu, Jul 21, 2011 at 18:45, Sebastian Pop  wrote:
> Hi Tobias,
>
> On Thu, Jul 21, 2011 at 18:00, Tobias Grosser  wrote:
>> Hi,
>>
>> I propose to switch to the official cloog.org cloog version with isl backend 
>> and
>> at the same time to remove support for both CLooG-PPL legacy as well as
>> CLooG-Parma.
>>
>
> Many thanks for implementing this cleanup.
>
>> We want to switch to cloog-isl as it is the only officially maintained 
>> version
>> of cloog. Furthermore, it provides features that will help to fix some bugs 
>> in
>> the graphite code generation[1].
>> The reason to abandon CLooG-PPL (legacy version) is that cloog-isl provides 
>> the
>> new CloogInput library interface. This interface is not available in the old 
>> CLooG.
>> I plan to move graphite to this interface. As I do not see enough benefits 
>> from
>> being able to use CLooG PPL, I decided to not introduce any compatibility
>> scheme, but just remove any code that is only needed for CLooG-PPL.
>> I also removed CLooG-Parma (cloog.org with PPL backend), as it is currently 
>> not
>> actively maintained and not well tested. I believe our time is better spent 
>> on
>> improving graphite or cloog isl than on putting time into this cloog version.
>>
>> So here we are: Moving graphite back to the official cloog.org version!
>>
>> Passes 'make check RUNTESTFLAGS=graphite.exp' as well as a bootstrap on Linux
>> amd64.
>>
>> Cheers
>> Tobi
>>
>> P.S.: Why do we move to the super latest one. Because we expect that most 
>> users
>> would need an update, and, as we will soon use some of the newer features, 
>> there
>> is no need to force another update later.
>>
>>
>> Tobias Grosser (3):
>>  Make CLooG isl the only supported CLooG version.
>>  Require cloog 0.16.3
>>  Remove code that supported legacy CLooG.
>
> For all your changes, you would need the ok from a configure maintainer.

Ping maintainers of the "build machinery (*.in)".

Thanks,
Sebastian

> The changes to the graphite framework are ok.
>
> Thanks,
> Sebastian
>


Re: [PATCH 0/3] Move Graphite to CLooG 0.16.3 with isl backend.

2011-07-26 Thread Sebastian Pop
On Fri, Jul 22, 2011 at 07:32, Joseph S. Myers  wrote:
> On Fri, 22 Jul 2011, Tobias Grosser wrote:
>
>> I propose to switch to the official cloog.org cloog version with isl backend 
>> and
>> at the same time to remove support for both CLooG-PPL legacy as well as
>> CLooG-Parma.
>
> Where are the install.texi changes in this patch series?

Please see the attached patch.

Thanks,
Sebastian
From 079ec12ce018ad6e7577d2c069b1c612b3b2b98e Mon Sep 17 00:00:00 2001
From: Sebastian Pop 
Date: Tue, 26 Jul 2011 13:28:36 -0500
Subject: [PATCH] Document CLooG-ISL requirement for Graphite

2011-07-26  Sebastian Pop  

	* doc/install.texi: Document CLooG-ISL requirement for Graphite.
---
 gcc/ChangeLog|4 
 gcc/doc/install.texi |   24 
 2 files changed, 12 insertions(+), 16 deletions(-)

diff --git a/gcc/ChangeLog b/gcc/ChangeLog
index 17358a8..78fcf59 100644
--- a/gcc/ChangeLog
+++ b/gcc/ChangeLog
@@ -1,3 +1,7 @@
+2011-07-26  Sebastian Pop  
+
+	* doc/install.texi: Document CLooG-ISL requirement for Graphite.
+
 2011-07-23  Sebastian Pop  
 
 	PR tree-optimization/49471
diff --git a/gcc/doc/install.texi b/gcc/doc/install.texi
index 9b1b037..368221f 100644
--- a/gcc/doc/install.texi
+++ b/gcc/doc/install.texi
@@ -365,26 +365,18 @@ distribution is found in a subdirectory of your GCC sources named
 Necessary to build GCC with the Graphite loop optimizations.
 It can be downloaded from @uref{http://www.cs.unipr.it/ppl/Download/}.
 
-The @option{--with-ppl} configure option should be used if PPL is not
+The configure option @option{--with-ppl} should be used if PPL is not
 installed in your default library search path.
 
-@item CLooG-PPL version 0.15 or CLooG 0.16
+@item CLooG-ISL 0.16
 
-Necessary to build GCC with the Graphite loop optimizations.  There
-are two versions available.  CLooG-PPL 0.15 as well as CLooG 0.16.
-The former is the default right now.  It can be downloaded from
-@uref{ftp://gcc.gnu.org/pub/gcc/infrastructure/} as
-@file{cloog-ppl-0.15.tar.gz}.
+Necessary to build GCC with the Graphite loop optimizations.  It is
+available from @uref{ftp://gcc.gnu.org/pub/gcc/infrastructure/} as
+@file{cloog-0.16.3.tar.gz}.  Even if CLooG 0.16 does not use PPL, PPL
+is still required for Graphite.
 
-CLooG 0.16 support is still in testing stage, but will be the
-default in future GCC releases.  It is also available at
-@uref{ftp://gcc.gnu.org/pub/gcc/infrastructure/} as
-@file{cloog-0.16.1.tar.gz}.  To use it add the additional configure
-option @option{--enable-cloog-backend=isl}.  Even if CLooG 0.16
-does not use PPL, PPL is still required for Graphite.
-
-In both cases @option{--with-cloog} configure option should be used
-if CLooG is not installed in your default library search path.
+The configure option @option{--with-cloog} should be used if CLooG is
+not installed in your default library search path.
 
 @end table
 
-- 
1.7.4.1



Re: PATCH RFA: Correct toplevel configury

2011-07-26 Thread Paolo Bonzini

On 07/26/2011 08:00 PM, Ian Lance Taylor wrote:

Ping.

On Thu, Jul 21, 2011 at 11:20 PM, Ian Lance Taylor  wrote:

One of my recent patches broke the toplevel configury.  I moved a test
of $configdirs to a point before nonexistent directories have been
removed from configdirs.  The test was for whether gcc is being
configured.  The test is fine in the gcc repository, but not in the src
repository.

This patch fixes the problem.  If the gcc directory exists, we assume
that we are going to build it.  This only matters for setting the
default value for --enable-bootstrap.

Bootstrapped on x86_64-unknown-linux-gnu.  OK for mainline?

Ian


2011-07-21  Ian Lance Taylor

* configure.ac: Set have_compiler based on whether gcc directory
exists, rather than on whether gcc is in configdirs.





Ok.

Paolo


Re: Patch committed: Fix demangler crash

2011-07-26 Thread Ian Lance Taylor
"H.J. Lu"  writes:

> I checked in this as an obvious fix.

Thanks.  I wonder why I didn't see that.

Ian


Re: [C++0x] contiguous bitfields race implementation

2011-07-26 Thread Aldy Hernandez



+ bitnum -= bitregion_start;
+ bitregion_end -= bitregion_start;
+ bitregion_start = 0;


Why is this necessary/useful?


Do you mean: why am I resetting these values?  (Because the call to 
get_best_mode() following it needs the adjusted values.)  Or why am I 
adjusting the address to point to the beginning of the region?
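
For reference, the three quoted statements amount to re-expressing the
bit positions relative to the start of the region.  A minimal sketch,
assuming a hypothetical struct bit_access to hold the values (the real
code uses separate variables and also adjusts the address, which is not
modeled here):

```c
#include <assert.h>

/* Hypothetical container for the three values adjusted in the quoted
   hunk.  */
struct bit_access
{
  unsigned long bitnum;           /* bit position of the field */
  unsigned long bitregion_start;  /* start of the bit region */
  unsigned long bitregion_end;    /* end of the bit region */
};

/* Re-express all positions relative to the start of the region.  */
void
rebase_to_region (struct bit_access *a)
{
  assert (a->bitregion_start <= a->bitnum);
  assert (a->bitregion_start <= a->bitregion_end);

  a->bitnum -= a->bitregion_start;
  a->bitregion_end -= a->bitregion_start;
  a->bitregion_start = 0;
}
```

A field at bit 40 inside a region spanning bits 32 to 63 becomes a
field at bit 8 inside a region spanning bits 0 to 31.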


A


Re: [testsuite] Provide and use mmap effective-target keyword

2011-07-26 Thread Rainer Orth
Ulrich,

> Rainer Orth wrote:
>
>> +proc check_effective_target_mmap {} {
>> +return [check_no_compiler_messages mmap assembly {
>> +#include <sys/mman.h>
>> +}]
>> +}
>
> Unfortunately, this test breaks spu-elf; a lot of tests now fail with:
> loop-2f.c:(.text+0x100): undefined reference to `mmap'
>
> The problem is that we *do* have sys/mman.h on SPU, it just does not
> contain a mmap prototype (and there is no mmap routine).  Instead,
> sys/mman.h on SPU has prototypes and defines for "mmap_eaddr" and
> the like, which allow SPU programs to manipulate PPU-side memory
> mappings (to be accessed via DMA) ...
>
> Can the test be extended to actually check for the mmap routine
> itself?

sure, that's one of the reasons to centralize the mmap test.  Simply
replacing the body of the proc with

return [check_function_available "mmap"]

should work.  Could you give it a try?

Thanks.
Rainer

-- 
-
Rainer Orth, Center for Biotechnology, Bielefeld University


Re: [C++ Patch] PR 49776

2011-07-26 Thread Jason Merrill

OK.

Jason


Re: [patch] Fix PR tree-optimization/49471

2011-07-26 Thread Sebastian Pop
On Tue, Jul 26, 2011 at 08:30, Richard Guenther
 wrote:
> I suppose we also need to allow POINTER_TYPE_P here (but then
> treat it like an unsigned variable of the same width).

Updated patch.  Ok for trunk after regstrap?

Thanks,
Sebastian
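
The selection rule in the patch below can be sketched in isolation.
The struct and function names here are invented for the illustration,
the starting precision is passed in as a parameter (an assumption about
how the surrounding code initializes it), pointer types are assumed to
be pre-classified as unsigned, and the final rounding to a machine mode
is left out:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical summary of one induction variable's type.  */
struct iv_type
{
  unsigned precision;
  bool is_unsigned;   /* also true for pointer types */
};

/* Pick the precision and signedness for the canonical iv: the widest
   precision seen, unsigned only if a type of that widest precision is
   unsigned.  */
struct iv_type
pick_canonical_iv_type (const struct iv_type *ivs, size_t n,
			unsigned start_precision)
{
  struct iv_type best = { start_precision, false };

  for (size_t i = 0; i < n; i++)
    {
      if (ivs[i].precision < best.precision)
	continue;

      /* Strictly wider: take its signedness.  Same width: unsigned
	 wins.  */
      best.is_unsigned = (ivs[i].precision > best.precision
			  ? ivs[i].is_unsigned
			  : (best.is_unsigned || ivs[i].is_unsigned));
      best.precision = ivs[i].precision;
    }

  return best;
}
```

A narrower unsigned iv therefore no longer forces the canonical iv to
be unsigned when a wider signed iv is present.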
From 3e8f8cfd0c4298b6b5e88c8bc7ba81a01e7cd815 Mon Sep 17 00:00:00 2001
From: Sebastian Pop 
Date: Sun, 24 Jul 2011 01:52:52 -0500
Subject: [PATCH] Fix PR49471: canonicalize_loop_ivs should not generate unsigned types.

2011-07-23  Sebastian Pop  

	PR tree-optimization/49471
	* tree-ssa-loop-manip.c (canonicalize_loop_ivs): Build an unsigned
	iv only when the largest type is unsigned.  Do not call
	lang_hooks.types.type_for_size.

	* testsuite/libgomp.graphite/force-parallel-1.c: Un-xfail.
	* testsuite/libgomp.graphite/force-parallel-2.c: Adjust pattern.
---
 gcc/ChangeLog  |7 +++
 gcc/tree-ssa-loop-manip.c  |   19 ---
 libgomp/ChangeLog  |5 +
 .../testsuite/libgomp.graphite/force-parallel-1.c  |2 +-
 .../testsuite/libgomp.graphite/force-parallel-2.c  |2 +-
 5 files changed, 30 insertions(+), 5 deletions(-)

diff --git a/gcc/ChangeLog b/gcc/ChangeLog
index 27d4001..17358a8 100644
--- a/gcc/ChangeLog
+++ b/gcc/ChangeLog
@@ -1,5 +1,12 @@
 2011-07-23  Sebastian Pop  
 
+	PR tree-optimization/49471
+	* tree-ssa-loop-manip.c (canonicalize_loop_ivs): Build an unsigned
+	iv only when the largest type is unsigned.  Do not call
+	lang_hooks.types.type_for_size.
+
+2011-07-23  Sebastian Pop  
+
 	PR middle-end/47691
 	* graphite-clast-to-gimple.c (translate_clast_user): Update use of
 	copy_bb_and_scalar_dependences.
diff --git a/gcc/tree-ssa-loop-manip.c b/gcc/tree-ssa-loop-manip.c
index 8176ed8..f3392e6 100644
--- a/gcc/tree-ssa-loop-manip.c
+++ b/gcc/tree-ssa-loop-manip.c
@@ -1200,18 +1200,31 @@ canonicalize_loop_ivs (struct loop *loop, tree *nit, bool bump_in_latch)
   gimple stmt;
   edge exit = single_dom_exit (loop);
   gimple_seq stmts;
+  enum machine_mode mode;
+  bool unsigned_p = false;
 
   for (psi = gsi_start_phis (loop->header);
!gsi_end_p (psi); gsi_next (&psi))
 {
   gimple phi = gsi_stmt (psi);
   tree res = PHI_RESULT (phi);
+  bool uns;
 
-  if (is_gimple_reg (res) && TYPE_PRECISION (TREE_TYPE (res)) > precision)
-	precision = TYPE_PRECISION (TREE_TYPE (res));
+  type = TREE_TYPE (res);
+  if (!is_gimple_reg (res)
+	  || (!INTEGRAL_TYPE_P (type)
+	  && !POINTER_TYPE_P (type))
+	  || TYPE_PRECISION (type) < precision)
+	continue;
+
+  uns = POINTER_TYPE_P (type) | TYPE_UNSIGNED (type);
+  unsigned_p = TYPE_PRECISION (type) > precision ? uns : unsigned_p | uns;
+  precision = TYPE_PRECISION (type);
 }
 
-  type = lang_hooks.types.type_for_size (precision, 1);
+  mode = smallest_mode_for_size (precision, MODE_INT);
+  precision = GET_MODE_PRECISION (mode);
+  type = build_nonstandard_integer_type (precision, unsigned_p);
 
   if (original_precision != precision)
 {
diff --git a/libgomp/ChangeLog b/libgomp/ChangeLog
index 9225401..d5cd94d 100644
--- a/libgomp/ChangeLog
+++ b/libgomp/ChangeLog
@@ -1,3 +1,8 @@
+2011-07-23  Sebastian Pop  
+
+	* testsuite/libgomp.graphite/force-parallel-1.c: Un-xfail.
+	* testsuite/libgomp.graphite/force-parallel-2.c: Adjust pattern.
+
 2011-07-18  Rainer Orth  
 
 	PR target/49541
diff --git a/libgomp/testsuite/libgomp.graphite/force-parallel-1.c b/libgomp/testsuite/libgomp.graphite/force-parallel-1.c
index 71ed332..7f043d8 100644
--- a/libgomp/testsuite/libgomp.graphite/force-parallel-1.c
+++ b/libgomp/testsuite/libgomp.graphite/force-parallel-1.c
@@ -23,7 +23,7 @@ int main(void)
 }
 
 /* Check that parallel code generation part make the right answer.  */
-/* { dg-final { scan-tree-dump-times "1 loops carried no dependency" 2 "graphite" { xfail *-*-* } } } */
+/* { dg-final { scan-tree-dump-times "1 loops carried no dependency" 2 "graphite" } } */
 /* { dg-final { cleanup-tree-dump "graphite" } } */
 /* { dg-final { scan-tree-dump-times "loopfn" 5 "optimized" } } */
 /* { dg-final { cleanup-tree-dump "parloops" } } */
diff --git a/libgomp/testsuite/libgomp.graphite/force-parallel-2.c b/libgomp/testsuite/libgomp.graphite/force-parallel-2.c
index 1ce0feb..03d8236 100644
--- a/libgomp/testsuite/libgomp.graphite/force-parallel-2.c
+++ b/libgomp/testsuite/libgomp.graphite/force-parallel-2.c
@@ -23,7 +23,7 @@ int main(void)
 }
 
 /* Check that parallel code generation part make the right answer.  */
-/* { dg-final { scan-tree-dump-times "2 loops carried no dependency" 1 "graphite" } } */
+/* { dg-final { scan-tree-dump-times "2 loops carried no dependency" 2 "graphite" } } */
 /* { dg-final { cleanup-tree-dump "graphite" } } */
 /* { dg-final { scan-tree-dump-times "loopfn" 5 "optimized" } } */
 /* { dg-final { cleanup-tree-dump "parloops" } } */
-- 
1.7.4.1



Re: PATCH: PR target/49853: [x32] PIC doesn't work with external symbol

2011-07-26 Thread Uros Bizjak
On Tue, Jul 26, 2011 at 7:50 PM, H.J. Lu  wrote:

> This patch fixes PIC with external symbol and updates
> x86_64_immediate_operand/x86_64_zext_immediate_operand/x86_64_movabs_operand
> for x32.

> 2011-07-26  H.J. Lu  
>
>        PR target/49853
>        * config/i386/i386.c (ix86_expand_move): Call convert_to_mode
>        on legitimize_tls_address return if needed.  Allow ptr_mode for
>        symbolic operand with PIC.

 Eh... half of your patch is just an unnecessary rename of a temporary
 variable. See attached patch for a cleaned-up version.
>>>
>>> It looks good to me.  Can you check it in?
>>
>> Please, can you test it on x32 first? I will commit it after
>> bootstrap/regtest finish.
>>
>
> It may need other changes for TLS support.  I can update it
> after your change is checked in.

Committed with following ChangeLog:

2011-07-26  Uros Bizjak  
H.J. Lu  

PR target/47369
PR target/49853
* config/i386/i386.c (ix86_expand_move): Call convert_to_mode
if legitimize_tls_address returned operand in wrong mode. Allow
SImode and DImode symbolic operand for PIC.  Call convert_to_mode
if legitimize_pic_address returned operand in wrong mode.

Tested on x86_64-pc-linux-gnu {,-m32}.

Uros.


Re: PATCH RFA: Correct toplevel configury

2011-07-26 Thread Ian Lance Taylor
Ping.

On Thu, Jul 21, 2011 at 11:20 PM, Ian Lance Taylor  wrote:
> One of my recent patches broke the toplevel configury.  I moved a test
> of $configdirs to a point before nonexistent directories have been
> removed from configdirs.  The test was for whether gcc is being
> configured.  The test is fine in the gcc repository, but not in the src
> repository.
>
> This patch fixes the problem.  If the gcc directory exists, we assume
> that we are going to build it.  This only matters for setting the
> default value for --enable-bootstrap.
>
> Bootstrapped on x86_64-unknown-linux-gnu.  OK for mainline?
>
> Ian
>
>
> 2011-07-21  Ian Lance Taylor  
>
>        * configure.ac: Set have_compiler based on whether gcc directory
>        exists, rather than on whether gcc is in configdirs.
>
>
>
Index: configure.ac
===
--- configure.ac	(revision 176515)
+++ configure.ac	(working copy)
@@ -1139,10 +1139,11 @@ AC_ARG_ENABLE([bootstrap],
 enable_bootstrap=default)
 
 # Issue errors and warnings for invalid/strange bootstrap combinations.
-case "$configdirs" in
-  *gcc*) have_compiler=yes ;;
-  *) have_compiler=no ;;
-esac
+if test -r $srcdir/gcc/configure; then
+  have_compiler=yes
+else
+  have_compiler=no
+fi
 
 case "$have_compiler:$host:$target:$enable_bootstrap" in
   *:*:*:no) ;;


Re: [testsuite] Provide and use mmap effective-target keyword

2011-07-26 Thread Ulrich Weigand
Rainer Orth wrote:

> +proc check_effective_target_mmap {} {
> +return [check_no_compiler_messages mmap assembly {
> + #include <sys/mman.h>
> +}]
> +}

Unfortunately, this test breaks spu-elf; a lot of tests now fail with:
loop-2f.c:(.text+0x100): undefined reference to `mmap'

The problem is that we *do* have sys/mman.h on SPU, it just does not
contain a mmap prototype (and there is no mmap routine).  Instead,
sys/mman.h on SPU has prototypes and defines for "mmap_eaddr" and
the like, which allow SPU programs to manipulate PPU-side memory
mappings (to be accessed via DMA) ...

Can the test be extended to actually check for the mmap routine
itself?
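
To illustrate the difference between "the header exists" and "the
routine exists": a probe along the following lines only builds when
mmap can actually be linked.  This is a sketch for a POSIX-like host,
with probe_mmap as a made-up name; it is not the program the testsuite
helper compiles:

```c
#define _DEFAULT_SOURCE   /* for MAP_ANONYMOUS on glibc */
#include <assert.h>       /* only needed if the check in the note below is used */
#include <stddef.h>
#include <sys/mman.h>

/* Map and unmap one anonymous region.  Returns 0 on success.
   Because this calls mmap, a program containing it fails to link on
   a target whose sys/mman.h is present but provides no mmap routine,
   while a test that merely includes the header would still pass.  */
int
probe_mmap (void)
{
  size_t len = 4096;
  void *p = mmap (NULL, len, PROT_READ | PROT_WRITE,
		  MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);

  if (p == MAP_FAILED)
    return -1;

  ((char *) p)[0] = 1;   /* Touch the mapping.  */
  return munmap (p, len);
}
```

On a host with a working mmap, probe_mmap () returns 0; on spu-elf the
link step would be expected to fail with the same undefined reference
as above.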

Thanks,
Ulrich

-- 
  Dr. Ulrich Weigand
  GNU Toolchain for Linux on System z and Cell BE
  ulrich.weig...@de.ibm.com


Re: [PATCH] Fix PR47691: always run scev_const_prop before graphite

2011-07-26 Thread Sebastian Pop
On Tue, Jul 26, 2011 at 10:56, Sebastian Pop  wrote:
> On Tue, Jul 26, 2011 at 10:43, Richard Guenther  wrote:
>> Please make graphite more robust instead.
>
> Ok, in this case, what about setting gloog_error and stopping the code
> generation instead of failing on this gcc_assert.
>
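The idea proposed above, reduced to a toy, might look like the sketch below.  All names here (rename_one_use, copy_uses, and the use of a null pointer to stand for an undetermined scalar evolution) are assumptions for illustration only; the real functions in sese.c take GCC-internal types.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for rename_uses: a NULL SCEV plays the role of
   an undetermined chrec.  Instead of asserting, report the failure
   through GLOOG_ERROR and let the caller stop code generation.  */
bool
rename_one_use (const int *scev, int *out, bool *gloog_error)
{
  assert (out != NULL && gloog_error != NULL);
  if (scev == NULL)
    {
      *gloog_error = true;
      return false;
    }
  *out = *scev;
  return true;
}

/* Hypothetical caller: renames uses until one of them fails, and
   returns how many were renamed.  */
size_t
copy_uses (const int *const *scevs, size_t n, int *out, bool *gloog_error)
{
  size_t i, renamed = 0;

  for (i = 0; i < n && !*gloog_error; i++)
    if (rename_one_use (scevs[i], &out[i], gloog_error))
      renamed++;
  return renamed;
}
```

The flag has to be threaded through every caller, which is why the patch below touches the signatures of the intermediate functions as well.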

Here is the patch.  Ok for trunk after regstrap?

Thanks,
Sebastian
From 844201ee77000c40e7f842d066715217d3a95eac Mon Sep 17 00:00:00 2001
From: Sebastian Pop 
Date: Sat, 23 Jul 2011 23:29:30 -0500
Subject: [PATCH] Fix PR47691: do not abort compilation when code generation fails

2011-07-23  Sebastian Pop  

	PR middle-end/47691
	* graphite-clast-to-gimple.c (translate_clast_user): Update use of
	copy_bb_and_scalar_dependences.
	* sese.c (rename_uses): Do not call gcc_assert.  Set gloog_error.
	(graphite_copy_stmts_from_block): Update call to rename_uses.
	(copy_bb_and_scalar_dependences): Update call to
	graphite_copy_stmts_from_block.
	* sese.h (copy_bb_and_scalar_dependences): Update declaration.

	* gfortran.dg/graphite/id-pr47691.f: New.
---
 gcc/ChangeLog                                   |   11 ++
 gcc/graphite-clast-to-gimple.c                  |    2 +-
 gcc/sese.c                                      |   41 --
 gcc/sese.h                                      |    2 +-
 gcc/testsuite/ChangeLog                         |    5 +++
 gcc/testsuite/gfortran.dg/graphite/id-pr47691.f |    7
 6 files changed, 55 insertions(+), 13 deletions(-)
 create mode 100644 gcc/testsuite/gfortran.dg/graphite/id-pr47691.f

diff --git a/gcc/ChangeLog b/gcc/ChangeLog
index f3de4df..4c49a62 100644
--- a/gcc/ChangeLog
+++ b/gcc/ChangeLog
@@ -1,5 +1,16 @@
 2011-07-23  Sebastian Pop  
 
+	PR middle-end/47691
+	* graphite-clast-to-gimple.c (translate_clast_user): Update use of
+	copy_bb_and_scalar_dependences.
+	* sese.c (rename_uses): Do not call gcc_assert.  Set gloog_error.
+	(graphite_copy_stmts_from_block): Update call to rename_uses.
+	(copy_bb_and_scalar_dependences): Update call to
+	graphite_copy_stmts_from_block.
+	* sese.h (copy_bb_and_scalar_dependences): Update declaration.
+
+2011-07-23  Sebastian Pop  
+
 	PR middle-end/47594
 	* graphite-sese-to-poly.c (scan_tree_for_params_right_scev): Sign
 	extend constants.
diff --git a/gcc/graphite-clast-to-gimple.c b/gcc/graphite-clast-to-gimple.c
index a4668d3..bbfd847 100644
--- a/gcc/graphite-clast-to-gimple.c
+++ b/gcc/graphite-clast-to-gimple.c
@@ -1019,7 +1019,7 @@ translate_clast_user (struct clast_user_stmt *stmt, edge next_e,
 
   build_iv_mapping (iv_map, stmt, ip);
   next_e = copy_bb_and_scalar_dependences (GBB_BB (gbb), ip->region,
-	   next_e, iv_map);
+	   next_e, iv_map, &gloog_error);
   VEC_free (tree, heap, iv_map);
 
   new_bb = next_e->src;
diff --git a/gcc/sese.c b/gcc/sese.c
index a03cbc9..ec96dfb 100644
--- a/gcc/sese.c
+++ b/gcc/sese.c
@@ -458,11 +458,13 @@ set_rename (htab_t rename_map, tree old_name, tree expr)
substitution map RENAME_MAP, inserting the gimplification code at
GSI_TGT, for the translation REGION, with the original copied
statement in LOOP, and using the induction variable renaming map
-   IV_MAP.  Returns true when something has been renamed.  */
+   IV_MAP.  Returns true when something has been renamed.  GLOOG_ERROR
+   is set when the code generation cannot continue.  */
 
 static bool
 rename_uses (gimple copy, htab_t rename_map, gimple_stmt_iterator *gsi_tgt,
-	 sese region, loop_p loop, VEC (tree, heap) *iv_map)
+	 sese region, loop_p loop, VEC (tree, heap) *iv_map,
+	 bool *gloog_error)
 {
   use_operand_p use_p;
   ssa_op_iter op_iter;
@@ -522,7 +524,11 @@ rename_uses (gimple copy, htab_t rename_map, gimple_stmt_iterator *gsi_tgt,
 	 scalar SSA_NAME used in the scop: all the other scalar
 	 SSA_NAMEs should have been translated out of SSA using
 	 arrays with one element.  */
-  gcc_assert (!chrec_contains_undetermined (scev));
+  if (chrec_contains_undetermined (scev))
+	{
+	  *gloog_error = true;
+	  return false;
+	}
 
   new_expr = chrec_apply_map (scev, iv_map);
 
@@ -530,8 +536,12 @@ rename_uses (gimple copy, htab_t rename_map, gimple_stmt_iterator *gsi_tgt,
 	 the uses of the new induction variables.  We should be
 	 able to use new_expr instead of the old_name in the newly
 	 generated loop nest.  */
-  gcc_assert (!chrec_contains_undetermined (new_expr)
-		  && !tree_contains_chrecs (new_expr, NULL));
+  if (chrec_contains_undetermined (new_expr)
+	  || tree_contains_chrecs (new_expr, NULL))
+	{
+	  *gloog_error = true;
+	  return false;
+	}
 
   /* Replace the old_name with the new_expr.  */
   new_expr = force_gimple_operand (unshare_expr (new_expr), &stmts,
@@ -555,12 +565,14 @@ rename_uses (gimple copy, htab_t rename_map, gimple_stmt_iterator *gsi_tgt,
 }
 
 /* Duplicates the statements of basic block BB into basic block NEW_BB
-   and compute the new induction variables according to the IV_MAP.  */
+   and compute the new induction variable

Re: PATCH: PR target/49853: [x32] PIC doesn't work with external symbol

2011-07-26 Thread H.J. Lu
On Tue, Jul 26, 2011 at 10:36 AM, Uros Bizjak  wrote:
> On Tue, Jul 26, 2011 at 7:31 PM, H.J. Lu  wrote:
>
>>>> This patch fixes PIC with external symbol and updates
>>>> x86_64_immediate_operand/x86_64_zext_immediate_operand/x86_64_movabs_operand
>>>> for x32.
>>>
>>>> 2011-07-26  H.J. Lu  
>>>>
>>>>        PR target/49853
>>>>        * config/i386/i386.c (ix86_expand_move): Call convert_to_mode
>>>>        on legitimize_tls_address return if needed.  Allow ptr_mode for
>>>>        symbolic operand with PIC.
>>>
>>> Eh... half of your patch is just an unnecessary rename of a temporary
>>> variable. See attached patch for a cleaned-up version.
>>
>> It looks good to me.  Can you check it in?
>
> Please, can you test it on x32 first? I will commit it after
> bootstrap/regtest finish.
>

It may need other changes for TLS support.  I can update it
after your change is checked in.


-- 
H.J.


Re: [C++0x] contiguous bitfields race implementation

2011-07-26 Thread Jason Merrill

On 07/26/2011 10:32 AM, Aldy Hernandez wrote:



>> I think the adjustment above is intended to match the adjustment of the
>> address by bitregion_start/BITS_PER_UNIT, but the above seems to assume
>> that bitregion_start%BITS_PER_UNIT == 0.
>
> That was intentional. bitregion_start always falls on a byte boundary,
> does it not?


Ah, yes, of course, it's bitnum that might not.  The code changes look 
good, then.
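
For readers following the arithmetic, here is a small sketch of the adjustment being discussed, assuming BITS_PER_UNIT is 8.  The struct and function names are invented for illustration and do not appear in the patch.

```c
#include <assert.h>

#define BITS_PER_UNIT 8  /* assumption: 8-bit bytes */

/* Hypothetical bundle of the values discussed in this thread.  */
struct bit_region
{
  unsigned long byte_offset;  /* address adjustment, in bytes */
  unsigned long bitnum;       /* first bit of the field being stored */
  unsigned long start;        /* bitregion_start */
  unsigned long end;          /* bitregion_end */
};

/* Move the base address to the start of the region and make the bit
   positions relative to it.  Only the region start must be byte
   aligned; bitnum may land anywhere inside a byte.  */
struct bit_region
rebase_to_region_start (struct bit_region r)
{
  assert (r.start % BITS_PER_UNIT == 0);
  r.byte_offset += r.start / BITS_PER_UNIT;
  r.bitnum -= r.start;
  r.end -= r.start;
  r.start = 0;
  return r;
}
```

A region starting at bit 32 with a field at bit 35 becomes a 4-byte address adjustment with the field at bit 3, which is not on a byte boundary.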


Jason


Re: PATCH: PR target/49853: [x32] PIC doesn't work with external symbol

2011-07-26 Thread Uros Bizjak
On Tue, Jul 26, 2011 at 7:31 PM, H.J. Lu  wrote:

>>> This patch fixes PIC with external symbol and updates
>>> x86_64_immediate_operand/x86_64_zext_immediate_operand/x86_64_movabs_operand
>>> for x32.
>>
>>> 2011-07-26  H.J. Lu  
>>>
>>>        PR target/49853
>>>        * config/i386/i386.c (ix86_expand_move): Call convert_to_mode
>>>        on legitimize_tls_address return if needed.  Allow ptr_mode for
>>>        symbolic operand with PIC.
>>
>> Eh... half of your patch is just an unnecessary rename of a temporary
>> variable. See attached patch for a cleaned-up version.
>
> It looks good to me.  Can you check it in?

Please, can you test it on x32 first? I will commit it after
bootstrap/regtest finish.

Thanks,
Uros.


Re: [C++0x] contiguous bitfields race implementation

2011-07-26 Thread Jason Merrill

On 07/26/2011 09:36 AM, Aldy Hernandez wrote:



>>> + bitnum -= bitregion_start;
>>> + bitregion_end -= bitregion_start;
>>> + bitregion_start = 0;
>>
>> Why is this necessary/useful?
>
> You mean, why am I resetting these values (because the call to
> get_best_mode() following it needs the adjusted values). Or why am I
> adjusting the address to point to the beginning of the region?


I think the adjustment above is intended to match the adjustment of the 
address by bitregion_start/BITS_PER_UNIT, but the above seems to assume 
that bitregion_start%BITS_PER_UNIT == 0.


Jason


Re: [C++0x] contiguous bitfields race implementation

2011-07-26 Thread Aldy Hernandez



> I think the adjustment above is intended to match the adjustment of the
> address by bitregion_start/BITS_PER_UNIT, but the above seems to assume
> that bitregion_start%BITS_PER_UNIT == 0.


That was intentional.  bitregion_start always falls on a byte boundary, 
does it not?


struct {
stuff;
unsigned int b:3;
unsigned int other_bits:22;
other_stuff;
}

Does not "b" always start at a byte boundary?


Re: PATCH: PR target/49853: [x32] PIC doesn't work with external symbol

2011-07-26 Thread H.J. Lu
On Tue, Jul 26, 2011 at 10:26 AM, Uros Bizjak  wrote:
> On Tue, Jul 26, 2011 at 4:59 PM, H.J. Lu  wrote:
>
>> This patch fixes PIC with external symbol and updates
>> x86_64_immediate_operand/x86_64_zext_immediate_operand/x86_64_movabs_operand
>> for x32.
>
>> 2011-07-26  H.J. Lu  
>>
>>        PR target/49853
>>        * config/i386/i386.c (ix86_expand_move): Call convert_to_mode
>>        on legitimize_tls_address return if needed.  Allow ptr_mode for
>>        symbolic operand with PIC.
>
> Eh... half of your patch is just an unnecessary rename of a temporary
> variable. See attached patch for a cleaned-up version.

It looks good to me.  Can you check it in?

> Also, please use explicit DImode and SImode checks to match what
> ix86_legitimate_address_p does.
>
>>        * config/i386/predicates.md (x86_64_immediate_operand): Always
>>        allow the offsetted memory references for TARGET_X32.
>>        (x86_64_zext_immediate_operand): Likewise.
>>        (x86_64_movabs_operand): Don't allow nonmemory_operand for
>>        TARGET_X32.
>
> Why? It is certainly not needed for -fPIC. Please provide a separate
> patch and testcase for predicates.md change.
>

I will submit them separately with some testcases after your patch
is checked in.

Thanks.

-- 
H.J.


Re: [Patch,AVR]: Fix PR29560 (map 16-bit shift to 8-bit)

2011-07-26 Thread Richard Henderson
On 07/26/2011 10:26 AM, Georg-Johann Lay wrote:
> If -mint8 (word_mode = QImode) ever returns or is made
> functional again, then the QI version is undefined for
> offsets >= 8, whereas the HI version is only undefined for
> offsets >= 16.

It's undefined at the C level, not necessarily at the rtl level.


r~


Re: [Patch,AVR]: Fix PR29560 (map 16-bit shift to 8-bit)

2011-07-26 Thread Georg-Johann Lay
Richard Henderson wrote:
> On 07/26/2011 02:48 AM, Georg-Johann Lay wrote:
>> Moreover, the original peep2 is not fully correct because it
>> maps a 16-bit shift to a 8-bit one.  The correct mapping is
>>
>> (set (match_dup 2)
>>      (subreg:QI (ashift:HI (zero_extend:HI (match_dup 2))
>>                            (match_dup 1))
>>                 0))
>>
>> instead of
>>
>> (set (match_dup 2)
>>      (ashift:QI (match_dup 2)
>>                 (match_dup 1)))
>>
>> I don't think it makes a difference that late in the
>> compilation process, yet I prefer correct semantics.
> 
> Why do you think the semantics were wrong?  As long as you
> don't define SHIFT_COUNT_TRUNCATED, these are equivalent.
> 
> r~

If -mint8 (word_mode = QImode) ever returns or is made
functional again, then the QI version is undefined for
offsets >= 8, whereas the HI version is only undefined for
offsets >= 16.
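
A C-level analogue of the HI-mode mapping, offered only as a sketch: the function name is invented, and C shift rules are not the same as RTL semantics.  It zero-extends the byte, shifts in the wider type, and keeps the low byte, so counts from 8 to 15 are well defined and give zero.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper mirroring
     (subreg:QI (ashift:HI (zero_extend:HI x) count) 0)
   Zero-extend to 16 bits, shift there, then truncate to the low byte.  */
uint8_t
shift_via_hi (uint8_t value, unsigned count)
{
  uint16_t wide = value;  /* zero_extend:HI */

  assert (count < 16);    /* the wide shift is only defined below 16 */
  return (uint8_t) (uint16_t) (wide << count);
}
```

For counts below 8 the result matches a plain 8-bit shift; the two forms differ only in which counts are left undefined.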




Re: PATCH: PR target/49853: [x32] PIC doesn't work with external symbol

2011-07-26 Thread Uros Bizjak
On Tue, Jul 26, 2011 at 4:59 PM, H.J. Lu  wrote:

> This patch fixes PIC with external symbol and updates
> x86_64_immediate_operand/x86_64_zext_immediate_operand/x86_64_movabs_operand
> for x32.

> 2011-07-26  H.J. Lu  
>
>        PR target/49853
>        * config/i386/i386.c (ix86_expand_move): Call convert_to_mode
>        on legitimize_tls_address return if needed.  Allow ptr_mode for
>        symbolic operand with PIC.

Eh... half of your patch is just an unnecessary rename of a temporary
variable. See attached patch for a cleaned-up version.
Also, please use explicit DImode and SImode checks to match what
ix86_legitimate_address_p does.

>        * config/i386/predicates.md (x86_64_immediate_operand): Always
>        allow the offsetted memory references for TARGET_X32.
>        (x86_64_zext_immediate_operand): Likewise.
>        (x86_64_movabs_operand): Don't allow nonmemory_operand for
>        TARGET_X32.

Why? It is certainly not needed for -fPIC. Please provide a separate
patch and testcase for predicates.md change.

Uros.
Index: i386.c
===
--- i386.c  (revision 176794)
+++ i386.c  (working copy)
@@ -15028,11 +15028,14 @@ ix86_expand_move (enum machine_mode mode
 op0, 1, OPTAB_DIRECT);
  if (tmp == op0)
return;
+ if (GET_MODE (tmp) != mode)
+   op1 = convert_to_mode (mode, tmp, 1);
}
 }
 
   if ((flag_pic || MACHOPIC_INDIRECT) 
-   && mode == Pmode && symbolic_operand (op1, Pmode))
+  && (mode == SImode || mode == DImode)
+  && symbolic_operand (op1, mode))
 {
   if (TARGET_MACHO && !TARGET_64BIT)
{
@@ -15073,13 +15076,15 @@ ix86_expand_move (enum machine_mode mode
   else
{
  if (MEM_P (op0))
-   op1 = force_reg (Pmode, op1);
- else if (!TARGET_64BIT || !x86_64_movabs_operand (op1, Pmode))
+   op1 = force_reg (mode, op1);
+ else if (!TARGET_64BIT || !x86_64_movabs_operand (op1, mode))
{
  rtx reg = can_create_pseudo_p () ? NULL_RTX : op0;
  op1 = legitimize_pic_address (op1, reg);
  if (op0 == op1)
return;
+ if (GET_MODE (op1) != mode)
+   op1 = convert_to_mode (mode, op1, 1);
}
}
 }


Re: [PATCH, PR 49094] Refrain from creating misaligned accesses in SRA

2011-07-26 Thread Martin Jambor
On Tue, Jul 26, 2011 at 09:39:02AM +0200, Richard Guenther wrote:
> On Mon, Jul 25, 2011 at 7:52 PM, Martin Jambor  wrote:

...

> >
> > 2011-07-25  Martin Jambor  
> >
> >        * tree-sra.c (tree_non_mode_aligned_mem_p): Strip conversions and
> >        return false for invariants.
> >
> > Index: src/gcc/tree-sra.c
> > ===
> > --- src.orig/gcc/tree-sra.c
> > +++ src/gcc/tree-sra.c
> > @@ -1075,9 +1075,14 @@ tree_non_mode_aligned_mem_p (tree exp)
> >   enum machine_mode mode = TYPE_MODE (TREE_TYPE (exp));
> >   unsigned int align;
> >
> > +  while (CONVERT_EXPR_P (exp)
> 
> There can be no convert exprs here, and there can be at most one
> VIEW_CONVERT_EXPR.
> 
> > +        || TREE_CODE (exp) == VIEW_CONVERT_EXPR)
> > +    exp = TREE_OPERAND (exp, 0);
> > +
> >   if (TREE_CODE (exp) == SSA_NAME
> >       || TREE_CODE (exp) == MEM_REF
> >       || mode == BLKmode
> > +      || is_gimple_min_invariant (exp)
> >       || !STRICT_ALIGNMENT)
> >     return false;
> 
> Otherwise ok.
> 

OK, this is what I have just committed as revision 176797 after
re-testing.

Thanks,

Martin


2011-07-26  Martin Jambor  

* tree-sra.c (tree_non_mode_aligned_mem_p): Strip conversions and
return false for invariants.

Index: src/gcc/tree-sra.c
===
--- src.orig/gcc/tree-sra.c
+++ src/gcc/tree-sra.c
@@ -1075,9 +1075,13 @@ tree_non_mode_aligned_mem_p (tree exp)
   enum machine_mode mode = TYPE_MODE (TREE_TYPE (exp));
   unsigned int align;
 
+  if (TREE_CODE (exp) == VIEW_CONVERT_EXPR)
+exp = TREE_OPERAND (exp, 0);
+
   if (TREE_CODE (exp) == SSA_NAME
   || TREE_CODE (exp) == MEM_REF
   || mode == BLKmode
+  || is_gimple_min_invariant (exp)
   || !STRICT_ALIGNMENT)
 return false;
 


Re: Patch committed: Fix demangler crash

2011-07-26 Thread H.J. Lu
On Tue, Jul 26, 2011 at 9:46 AM, H.J. Lu  wrote:
> On Tue, Jul 26, 2011 at 7:30 AM, Ian Lance Taylor  wrote:
>> binutils PR 13030 reports a demangler crash on the symbol
>>    _ZSt10_ConstructI10CellBorderIS0_EEvPT_DpOT0_
>>
>> As far as I can tell, this symbol is invalid.  The final T0_ refers to
>> template argument 1, but this zero-based index has no referent since the
>> template only has one parameter.  This of course suggests a compiler
>> bug.  CC'ing Jason because this involves template packs which I haven't
>> looked into very much.
>>
>> I committed this patch to avoid the crash in the demangler.
>>
>> Ian
>>
>>
>> 2011-07-26  Ian Lance Taylor  
>>
>>        * cp-demangle.c (d_print_init): Initialize pack_index field.
>>        (d_print_comp): Check for NULL template argument.
>>        * testsuite/demangle-expected: Add test case.
>>
>>
>>
>
> I think it caused:
>
> FAIL at line 4023: unknown demangling style
> _ZSt10_ConstructI10CellBorderIS0_EEvPT_DpOT0_
> FAIL at line 4027: unknown demangling style yz.qrs
> FAIL at line 4031: unknown demangling style oper."+"
> FAIL at line 4035: unknown demangling style yz.qrs
> FAIL at line 4039: unknown demangling style yz.qrs.tuv
> FAIL at line 4042: unknown demangling style yz.qrs.tuv
> FAIL at line 4045: unknown demangling style yz.qrs.tuv
> FAIL at line 4049: unknown demangling style yz.qrs.tuv
> FAIL at line 4053: unknown demangling style 
> FAIL at line 4056: unknown demangling style x.m1
> FAIL at line 4059: unknown demangling style x.m3
> FAIL at line 4062: unknown demangling style x.y.m2
> FAIL at line 4066: unknown demangling style x.y.z.r
> FAIL at line 4070: unknown demangling style x.y.j
> FAIL at line 4074: unknown demangling style x.m3
> FAIL at line 4078: unknown demangling style p'Elab_Body
> FAIL at line 4082: unknown demangling style p'Elab_Spec
> FAIL at line 4086: unknown demangling style p.taskobj
> FAIL at line 4090: unknown demangling style p.taskobj.f1
> FAIL at line 4093: unknown demangling style prot.lock.get
> FAIL at line 4096: unknown demangling style prot.lock.get
> FAIL at line 4099: unknown demangling style prot.lock.get.sub
> FAIL at line 4102: unknown demangling style prot.lock.set
> FAIL at line 4106: unknown demangling style prot.lock.set
> FAIL at line 4109: unknown demangling style prot.lock.update
> FAIL at line 4113: unknown demangling style prot.lock.update
> FAIL at line 4116: unknown demangling style
> gnat.sockets.sockets_library_controller.Finalize
> FAIL at line 4120: unknown demangling style
> system.partition_interface.racw_stub_type.Adjust
> FAIL at line 4123: unknown demangling style
> gnat.wide_wide_string_split.slice_set'Read
> FAIL at line 4126: unknown demangling style
> ada.real_time.timing_events.events.list'Write
> FAIL at line 4129: unknown demangling style
> system.finalization_root.root_controlled'Input
> FAIL at line 4133: unknown demangling style
> ada.finalization.limited_controlled'Output
> FAIL at line 4136: unknown demangling style ada.synchronous_task_control'Size
> FAIL at line 4139: unknown demangling style
> ada.real_time.timing_events.events'Alignment
> FAIL at line 4144: unknown demangling style system.finalization_root.":="
> FAIL at line 4149: unknown demangling style DFA
> FAIL at line 4152: unknown demangling style
> Psi::VariantDetail::SelectVisitorResult const*)#2}&, VariantTest::TestVisit::test_method()::{lambda(char)#3}&,
> VariantTest::TestVisit::test_method()::{lambda(Psi::None)#1}&>::type
> Psi::Variant const*>::visit const*)#2}&,
> VariantTest::TestVisit::test_method()::{lambda(char)#3}&,
> VariantTest::TestVisit::test_method()::{lambda(Psi::None)#1}&>((VariantTest::TestVisit::test_method()::{lambda(Psi::None)#1}&)...)
>
> on Linux/ia32.
>

I checked this in as an obvious fix.

-- 
H.J.
---
diff --git a/libiberty/ChangeLog b/libiberty/ChangeLog
index 1ceb0ee..2c5b761 100644
--- a/libiberty/ChangeLog
+++ b/libiberty/ChangeLog
@@ -1,3 +1,7 @@
+2011-07-26  H.J. Lu  
+
+   * testsuite/demangle-expected: Remove an extra line.
+
 2011-07-26  Ian Lance Taylor  

* cp-demangle.c (d_print_init): Initialize pack_index field.
diff --git a/libiberty/testsuite/demangle-expected b/libiberty/testsuite/demangle-expected
index d3e7099..f9e8447 100644
--- a/libiberty/testsuite/demangle-expected
+++ b/libiberty/testsuite/demangle-expected
@@ -4014,7 +4014,6 @@ K<1, &S::m>::f()
 --format=gnu-v3
 _ZSt10_ConstructI10CellBorderIS0_EEvPT_DpOT0_
 _ZSt10_ConstructI10CellBorderIS0_EEvPT_DpOT0_
-_ZSt10_ConstructI10CellBorderIS0_EEvPT_DpOT0_
 #
 # Ada (GNAT) tests.
 #


Re: Merge alignments from coalesced SSA pointers

2011-07-26 Thread Ulrich Weigand
Michael Matz wrote:
> On Tue, 26 Jul 2011, Michael Matz wrote:
> > On Tue, 26 Jul 2011, Ulrich Weigand wrote:
> > 
> > > > Well, REG_ATTRS->decl is again a decl, not an SSA name.  I suppose
> > > > we'd need to pick a conservative REGNO_POINTER_ALIGN during
> > > > expansion of the SSA name partition - iterate over all of them in the
> > > > partition and pick the lowest alignment.  Or even adjust the 
> > > > partitioning
> > > > to avoid losing alignment information that way.
> > > 
> > > That would certainly be helpful.
> > 
> > I'm working on a patch for that, stay tuned.
> 
> Like so.  Currently in regstrapping on x86_64-linux.  Could you try if it 
> helps spu?

Well, it does help SPU in the sense that the wrong code generation goes away.

However, it does so by setting REGNO_POINTER_ALIGN to the minimum of 8 just
about every time -- not sure what the impact on generated code quality is.

Maybe get_pointer_alignment should default to the type's alignment if
nothing more specific is known, at least on STRICT_ALIGNMENT targets?
Just like MEM_ALIGN defaults to the mode's alignment ...
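
To make the "pick the lowest alignment" idea concrete, here is a sketch with invented names; it is not taken from Michael's patch.  It assumes alignments are given in bits and that 8 bits (one byte) is the weakest possible answer.

```c
#include <assert.h>
#include <stddef.h>

#define BITS_PER_UNIT 8  /* assumption: byte alignment is the floor */

/* Hypothetical helper: given the known alignment, in bits, of every
   SSA name coalesced into one partition, return the only alignment
   that is safe for the shared register, namely the smallest one.  */
unsigned
conservative_partition_align (const unsigned *align_bits, size_t count)
{
  unsigned lowest;
  size_t i;

  assert (align_bits != NULL || count == 0);
  if (count == 0)
    return BITS_PER_UNIT;
  lowest = align_bits[0];
  for (i = 1; i < count; i++)
    if (align_bits[i] < lowest)
      lowest = align_bits[i];
  return lowest;
}
```

A single member with unknown alignment, recorded as 8, drags the whole partition down to 8, which is the effect described above.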

Thanks,
Ulrich

-- 
  Dr. Ulrich Weigand
  GNU Toolchain for Linux on System z and Cell BE
  ulrich.weig...@de.ibm.com


Re: Patch committed: Fix demangler crash

2011-07-26 Thread H.J. Lu
On Tue, Jul 26, 2011 at 7:30 AM, Ian Lance Taylor  wrote:
> binutils PR 13030 reports a demangler crash on the symbol
>    _ZSt10_ConstructI10CellBorderIS0_EEvPT_DpOT0_
>
> As far as I can tell, this symbol is invalid.  The final T0_ refers to
> template argument 1, but this zero-based index has no referent since the
> template only has one parameter.  This of course suggests a compiler
> bug.  CC'ing Jason because this involves template packs which I haven't
> looked into very much.
>
> I committed this patch to avoid the crash in the demangler.
>
> Ian
>
>
> 2011-07-26  Ian Lance Taylor  
>
>        * cp-demangle.c (d_print_init): Initialize pack_index field.
>        (d_print_comp): Check for NULL template argument.
>        * testsuite/demangle-expected: Add test case.
>
>
>
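
For context, a sketch of the kind of guard Ian describes, with invented names and types; the real code in cp-demangle.c works on its own component structures.  In the mangling, T_ names template parameter 0 and T<n>_ names parameter n+1, so T0_ asks for index 1.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical flat view of a template argument list.  */
struct template_args
{
  const char *const *names;
  size_t count;
};

/* Decode "T_" as 0 and "T<n>_" as n + 1; return -1 if malformed.  */
long
template_param_index (const char *s)
{
  long n = 0;

  if (s == NULL || s[0] != 'T')
    return -1;
  s++;
  if (*s == '_')
    return 0;
  if (*s < '0' || *s > '9')
    return -1;
  while (*s >= '0' && *s <= '9')
    n = n * 10 + (*s++ - '0');
  return *s == '_' ? n + 1 : -1;
}

/* Return the argument at INDEX, or NULL when the reference has no
   referent.  Callers must check for NULL instead of dereferencing.  */
const char *
lookup_template_arg (const struct template_args *args, size_t index)
{
  assert (args != NULL);
  return index < args->count ? args->names[index] : NULL;
}
```

With a one-entry argument list, looking up the index decoded from T0_ yields NULL, which is the case the committed check is meant to catch.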

I think it caused:

FAIL at line 4023: unknown demangling style
_ZSt10_ConstructI10CellBorderIS0_EEvPT_DpOT0_
FAIL at line 4027: unknown demangling style yz.qrs
FAIL at line 4031: unknown demangling style oper."+"
FAIL at line 4035: unknown demangling style yz.qrs
FAIL at line 4039: unknown demangling style yz.qrs.tuv
FAIL at line 4042: unknown demangling style yz.qrs.tuv
FAIL at line 4045: unknown demangling style yz.qrs.tuv
FAIL at line 4049: unknown demangling style yz.qrs.tuv
FAIL at line 4053: unknown demangling style 
FAIL at line 4056: unknown demangling style x.m1
FAIL at line 4059: unknown demangling style x.m3
FAIL at line 4062: unknown demangling style x.y.m2
FAIL at line 4066: unknown demangling style x.y.z.r
FAIL at line 4070: unknown demangling style x.y.j
FAIL at line 4074: unknown demangling style x.m3
FAIL at line 4078: unknown demangling style p'Elab_Body
FAIL at line 4082: unknown demangling style p'Elab_Spec
FAIL at line 4086: unknown demangling style p.taskobj
FAIL at line 4090: unknown demangling style p.taskobj.f1
FAIL at line 4093: unknown demangling style prot.lock.get
FAIL at line 4096: unknown demangling style prot.lock.get
FAIL at line 4099: unknown demangling style prot.lock.get.sub
FAIL at line 4102: unknown demangling style prot.lock.set
FAIL at line 4106: unknown demangling style prot.lock.set
FAIL at line 4109: unknown demangling style prot.lock.update
FAIL at line 4113: unknown demangling style prot.lock.update
FAIL at line 4116: unknown demangling style
gnat.sockets.sockets_library_controller.Finalize
FAIL at line 4120: unknown demangling style
system.partition_interface.racw_stub_type.Adjust
FAIL at line 4123: unknown demangling style
gnat.wide_wide_string_split.slice_set'Read
FAIL at line 4126: unknown demangling style
ada.real_time.timing_events.events.list'Write
FAIL at line 4129: unknown demangling style
system.finalization_root.root_controlled'Input
FAIL at line 4133: unknown demangling style
ada.finalization.limited_controlled'Output
FAIL at line 4136: unknown demangling style ada.synchronous_task_control'Size
FAIL at line 4139: unknown demangling style
ada.real_time.timing_events.events'Alignment
FAIL at line 4144: unknown demangling style system.finalization_root.":="
FAIL at line 4149: unknown demangling style DFA
FAIL at line 4152: unknown demangling style
Psi::VariantDetail::SelectVisitorResult::type
Psi::Variant::visit((VariantTest::TestVisit::test_method()::{lambda(Psi::None)#1}&)...)

on Linux/ia32.

-- 
H.J.


Re: [PLUGIN] compile and install gengtype, install gtype.state

2011-07-26 Thread Romain Geissler
2011/7/25 Jakub Jelinek :
> On Mon, Jul 25, 2011 at 09:10:55PM +0200, Romain Geissler wrote:
>> > 2011-07-18  Romain Geissler  
>> >
>> >     * gengtype-state.c (#include "bconfig.h"): Include "bconfig.h"
>> >     if GENERATOR_FILE is defined, "config.h" otherwise.
>
> Still not right, this should have been
>        * gengtype-state.c: Include "bconfig.h" if GENERATOR_FILE is
>        defined, "config.h" otherwise.
>
>> >     * gengtype.c: Likewise.
>> >     * gengtype-lex.l: Likewise.
>> >     * gengtype-parse.c: Likewise.
>
>> >     * Makefile.in (gengtype): Compile and install for host when
>> >     $enable_plugins is set to "yes".
>> >     (gtype.state): Install when $enable_plugins is set to "yes".
>
> And this should list all the Makefile.in goals you've changed, added etc.
>
> Ok with those changes.
>
>        Jakub
>

Is it OK with this changelog?  If yes, can someone apply the patch,
as I don't have write access?

Romain Geissler.

2011-07-18  Romain Geissler  

* gengtype-state.c: Include "bconfig.h" if
GENERATOR_FILE is defined, "config.h" otherwise.
* gengtype.c: Likewise.
* gengtype-lex.l: Likewise.
* gengtype-parse.c: Likewise.
* Makefile.in (gengtype-lex.o-warn): New variable.
(plugindir): Likewise.
(plugin_bindir): Likewise.
(plugin_includedir): Use $(plugindir) as prefix base.
(MOSTLYCLEANFILES): Add gengtype$(exeext).
(native): Depend on gengtype$(exeext) if $enable_plugin
is set to "yes".
(gtype.state): Depend on s-gtype. Use temporary file.
(gengtype-lex.o): New rule.
(gengtype-parse.o): Likewise.
(gengtype-state.o): Likewise.
(gengtype$(exeext)): Likewise.
(install-gengtype): Likewise.
(gengtype.o): Likewise.
(build/gengtype.o): Depend on version.h.
(build/gengtype-state): Depend on double-int.h, version.h,
$(HASHTAB_H), $(OBSTACK_H), $(XREGEX_H) and build/errors.o.
(install-plugin): Depend on install-gengtype.


Index: gcc/gengtype-state.c
===
--- gcc/gengtype-state.c(revision 175907)
+++ gcc/gengtype-state.c(working copy)
@@ -23,7 +23,11 @@
and Basile Starynkevitch 
 */

+#ifdef GENERATOR_FILE
 #include "bconfig.h"
+#else
+#include "config.h"
+#endif
 #include "system.h"
 #include "errors.h"/* For fatal.  */
 #include "double-int.h"
Index: gcc/gengtype.c
===
--- gcc/gengtype.c  (revision 175907)
+++ gcc/gengtype.c  (working copy)
@@ -18,7 +18,11 @@
along with GCC; see the file COPYING3.  If not see
.  */

+#ifdef GENERATOR_FILE
 #include "bconfig.h"
+#else
+#include "config.h"
+#endif
 #include "system.h"
 #include "errors.h"/* for fatal */
 #include "getopt.h"
Index: gcc/gengtype-lex.l
===
--- gcc/gengtype-lex.l  (revision 175907)
+++ gcc/gengtype-lex.l  (working copy)
@@ -22,7 +22,11 @@ along with GCC; see the file COPYING3.
 %option noinput

 %{
+#ifdef GENERATOR_FILE
 #include "bconfig.h"
+#else
+#include "config.h"
+#endif
 #include "system.h"

 #define malloc xmalloc
Index: gcc/gengtype-parse.c
===
--- gcc/gengtype-parse.c(revision 175907)
+++ gcc/gengtype-parse.c(working copy)
@@ -17,7 +17,11 @@
along with GCC; see the file COPYING3.  If not see
.  */

+#ifdef GENERATOR_FILE
 #include "bconfig.h"
+#else
+#include "config.h"
+#endif
 #include "system.h"
 #include "gengtype.h"

Index: gcc/Makefile.in
===
--- gcc/Makefile.in (revision 175907)
+++ gcc/Makefile.in (working copy)
@@ -192,6 +192,7 @@ GCC_WARN_CXXFLAGS = $(LOOSE_WARN) $($(@D
 # be subject to -Werror:
 # flex output may yield harmless "no previous prototype" warnings
 build/gengtype-lex.o-warn = -Wno-error
+gengtype-lex.o-warn = -Wno-error
 # mips-tfile.c contains -Wcast-qual warnings.
 mips-tfile.o-warn = -Wno-error
 expmed.o-warn = -Wno-error
@@ -566,8 +567,12 @@ libexecdir = @libexecdir@
 libsubdir = $(libdir)/gcc/$(target_noncanonical)/$(version)
 # Directory in which the compiler finds executables
 libexecsubdir = $(libexecdir)/gcc/$(target_noncanonical)/$(version)
-# Directory in which plugin headers are installed
-plugin_includedir = $(libsubdir)/plugin/include
+# Directory in which all plugin resources are installed
+plugindir = $(libsubdir)/plugin
+ # Directory in which plugin headers are installed
+plugin_includedir = $(plugindir)/include
+# Directory in which plugin specific executables are installed
+plugin_bindir = $(plugindir)/bin
 # Used to produce a relative $(gcc_tooldir) in gcc.o
 unlibsubdir = ../../..
 # $(prefix), expressed as a path relative to $(libsubdir).
@@ -

Re: Merge alignments from coalesced SSA pointers

2011-07-26 Thread Michael Matz
Hi,

On Tue, 26 Jul 2011, Michael Matz wrote:

> On Tue, 26 Jul 2011, Michael Matz wrote:
> 
> > Hi,
> > 
> > On Tue, 26 Jul 2011, Ulrich Weigand wrote:
> > 
> > > > Well, REG_ATTRS->decl is again a decl, not an SSA name.  I suppose
> > > > we'd need to pick a conservative REGNO_POINTER_ALIGN during
> > > > expansion of the SSA name partition - iterate over all of them in the
> > > > partition and pick the lowest alignment.  Or even adjust the 
> > > > partitioning
> > > > to avoid losing alignment information that way.
> > > 
> > > That would certainly be helpful.
> > 
> > I'm working on a patch for that, stay tuned.
> 
> Like so.  Currently in regstrapping on x86_64-linux.  Could you try if it 
> helps spu?
> 
> Okay for trunk?

This patch exposes a problem in libada.  But I'd still be interested if it 
fixes the spu problem.


Ciao,
Michael.


[obvious] remove some dead df-scan code

2011-07-26 Thread Michael Matz
Hi,

jimis noticed this on IRC.  Since r160348 there's some dead code in 
df_update_entry_block_defs and df_update_exit_block_uses.  I'm currently 
regstrapping this together with the pointer alignment merging patch on 
x86_64-linux, and am going to commit it as obvious when that works.


Ciao,
Michael.

* df-scan.c (df_update_entry_block_defs, df_update_exit_block_uses):
Remove dead code.

Index: df-scan.c
===
*** df-scan.c   (revision 176790)
--- df-scan.c   (working copy)
*** void
*** 3849,3884 
  df_update_entry_block_defs (void)
  {
bitmap_head refs;
-   bool changed = false;
  
bitmap_initialize (&refs, &df_bitmap_obstack);
df_get_entry_block_def_set (&refs);
!   if (df->entry_block_defs)
! {
!   if (!bitmap_equal_p (df->entry_block_defs, &refs))
!   {
! struct df_scan_bb_info *bb_info = df_scan_get_bb_info (ENTRY_BLOCK);
! df_ref_chain_delete_du_chain (bb_info->artificial_defs);
! df_ref_chain_delete (bb_info->artificial_defs);
! bb_info->artificial_defs = NULL;
! changed = true;
!   }
! }
!   else
! {
!   struct df_scan_problem_data *problem_data
!   = (struct df_scan_problem_data *) df_scan->problem_data;
!   gcc_unreachable ();
!   df->entry_block_defs = BITMAP_ALLOC (&problem_data->reg_bitmaps);
!   changed = true;
! }
! 
!   if (changed)
  {
df_record_entry_block_defs (&refs);
bitmap_copy (df->entry_block_defs, &refs);
df_set_bb_dirty (BASIC_BLOCK (ENTRY_BLOCK));
  }
bitmap_clear (&refs);
  }
  
--- 3849,3868 
  df_update_entry_block_defs (void)
  {
bitmap_head refs;
  
bitmap_initialize (&refs, &df_bitmap_obstack);
df_get_entry_block_def_set (&refs);
!   if (!bitmap_equal_p (df->entry_block_defs, &refs))
  {
+   struct df_scan_bb_info *bb_info = df_scan_get_bb_info (ENTRY_BLOCK);
+   df_ref_chain_delete_du_chain (bb_info->artificial_defs);
+   df_ref_chain_delete (bb_info->artificial_defs);
+   bb_info->artificial_defs = NULL;
df_record_entry_block_defs (&refs);
bitmap_copy (df->entry_block_defs, &refs);
df_set_bb_dirty (BASIC_BLOCK (ENTRY_BLOCK));
  }
+ 
bitmap_clear (&refs);
  }
  
*** void
*** 4023,4058 
  df_update_exit_block_uses (void)
  {
bitmap_head refs;
-   bool changed = false;
  
bitmap_initialize (&refs, &df_bitmap_obstack);
df_get_exit_block_use_set (&refs);
!   if (df->exit_block_uses)
! {
!   if (!bitmap_equal_p (df->exit_block_uses, &refs))
!   {
! struct df_scan_bb_info *bb_info = df_scan_get_bb_info (EXIT_BLOCK);
! df_ref_chain_delete_du_chain (bb_info->artificial_uses);
! df_ref_chain_delete (bb_info->artificial_uses);
! bb_info->artificial_uses = NULL;
! changed = true;
!   }
! }
!   else
! {
!   struct df_scan_problem_data *problem_data
!   = (struct df_scan_problem_data *) df_scan->problem_data;
!   gcc_unreachable ();
!   df->exit_block_uses = BITMAP_ALLOC (&problem_data->reg_bitmaps);
!   changed = true;
! }
! 
!   if (changed)
  {
df_record_exit_block_uses (&refs);
bitmap_copy (df->exit_block_uses,& refs);
df_set_bb_dirty (BASIC_BLOCK (EXIT_BLOCK));
  }
bitmap_clear (&refs);
  }
  
--- 4007,4026 
  df_update_exit_block_uses (void)
  {
bitmap_head refs;
  
bitmap_initialize (&refs, &df_bitmap_obstack);
df_get_exit_block_use_set (&refs);
!   if (!bitmap_equal_p (df->exit_block_uses, &refs))
  {
+   struct df_scan_bb_info *bb_info = df_scan_get_bb_info (EXIT_BLOCK);
+   df_ref_chain_delete_du_chain (bb_info->artificial_uses);
+   df_ref_chain_delete (bb_info->artificial_uses);
+   bb_info->artificial_uses = NULL;
df_record_exit_block_uses (&refs);
bitmap_copy (df->exit_block_uses,& refs);
df_set_bb_dirty (BASIC_BLOCK (EXIT_BLOCK));
  }
+ 
bitmap_clear (&refs);
  }
  


[PATCH, i386]: Fix inconsistency in LEA splitters

2011-07-26 Thread Uros Bizjak
Hello!

Using a mode iterator, we can use x86_64_nonmemory_operand where
appropriate and remove the extra pattern (and its ??? comment) along the way.

2011-07-26  Uros Bizjak  

* config/i386/i386.md (add->lea splitter): Implement using SWI
mode iterator.  Change operand 2 predicate to .
(add->lea zext splitter): Change operand 2 predicate to
x86_64_nonmemory_operand.

Patch was bootstrapped and regression tested on x86_64-pc-linux-gnu {,
-m32}. Committed to mainline SVN.

Uros.
Index: config/i386/i386.md
===
--- config/i386/i386.md (revision 176790)
+++ config/i386/i386.md (working copy)
@@ -5805,17 +5805,14 @@
 
 ;; Convert add to the lea pattern to avoid flags dependency.
 (define_split
-  [(set (match_operand 0 "register_operand" "")
-   (plus (match_operand 1 "register_operand" "")
-  (match_operand 2 "nonmemory_operand" "")))
+  [(set (match_operand:SWI 0 "register_operand" "")
+   (plus (match_operand:SWI 1 "register_operand" "")
+  (match_operand:SWI 2 "" "")))
(clobber (reg:CC FLAGS_REG))]
-  "GET_MODE (operands[0]) == GET_MODE (operands[1])
-   && (GET_MODE (operands[0]) == GET_MODE (operands[2])
-   || GET_MODE (operands[2]) == VOIDmode)
-   && reload_completed && ix86_lea_for_add_ok (insn, operands)" 
+  "reload_completed && ix86_lea_for_add_ok (insn, operands)" 
   [(const_int 0)]
 {
-  enum machine_mode mode = GET_MODE (operands[0]);
+  enum machine_mode mode = mode;
   rtx pat;
 
   if (GET_MODE_SIZE (mode) < GET_MODE_SIZE (SImode))
@@ -5833,27 +5830,13 @@
 })
 
 ;; Convert add to the lea pattern to avoid flags dependency.
-;; ??? This pattern handles immediate operands that do not satisfy immediate
-;; operand predicate (TARGET_LEGITIMATE_CONSTANT_P) in the previous pattern.
 (define_split
   [(set (match_operand:DI 0 "register_operand" "")
-   (plus:DI (match_operand:DI 1 "register_operand" "")
-(match_operand:DI 2 "x86_64_immediate_operand" "")))
-   (clobber (reg:CC FLAGS_REG))]
-  "TARGET_64BIT && reload_completed 
-   && true_regnum (operands[0]) != true_regnum (operands[1])"
-  [(set (match_dup 0)
-   (plus:DI (match_dup 1) (match_dup 2)))])
-
-;; Convert add to the lea pattern to avoid flags dependency.
-(define_split
-  [(set (match_operand:DI 0 "register_operand" "")
(zero_extend:DI
  (plus:SI (match_operand:SI 1 "register_operand" "")
-  (match_operand:SI 2 "nonmemory_operand" ""
+  (match_operand:SI 2 "x86_64_nonmemory_operand" ""
(clobber (reg:CC FLAGS_REG))]
-  "TARGET_64BIT && reload_completed
-   && ix86_lea_for_add_ok (insn, operands)"
+  "TARGET_64BIT && reload_completed && ix86_lea_for_add_ok (insn, operands)"
   [(set (match_dup 0)
(zero_extend:DI (plus:SI (match_dup 1) (match_dup 2])
 


Re: [PATCH] Fix PR47691: always run scev_const_prop before graphite

2011-07-26 Thread Sebastian Pop
On Tue, Jul 26, 2011 at 10:43, Richard Guenther  wrote:
> Required?  Do you mean you generate wrong code in graphite if it is
> not run?

Graphite won't generate wrong code if scev_cprop is not run: it will
fail on an assert in the code generation part.  We used to know the
scev for the induction variable (in the scop detection), but after the
detection of reductions we decided that the IV was a reduction
(its value is used after the loop in the loop close phi), so we
rewrote the IV using arrays to expose this "reduction" to the
graphite representation.  Finally, when we arrive in code generation
and ask again for the scev of the IV, we get a "dont_know", and
so we are puzzled and abort.

> What if it just fails to handle things or does not handle
> them because things are too expensive?
>
> Please make graphite more robust instead.

Ok, in this case, what about setting gloog_error and stopping the code
generation instead of failing on this gcc_assert?
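
A minimal sketch of that idea in plain C, not Graphite's actual code: the generator records an error and returns instead of asserting. Only the flag name gloog_error comes from the message above; scev_kind, codegen_state, and translate_iv are hypothetical names chosen for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the result of a scalar-evolution query.  */
enum scev_kind { SCEV_KNOWN, SCEV_DONT_KNOW };

/* Hypothetical code generation state; only the flag name is borrowed
   from the discussion.  */
struct codegen_state
{
  bool gloog_error;
};

/* Translate one induction variable.  Rather than asserting that the
   scev is known, record the failure and let the caller give up.  */
static bool
translate_iv (struct codegen_state *state, enum scev_kind scev)
{
  assert (state != NULL);
  if (scev == SCEV_DONT_KNOW)
    {
      state->gloog_error = true;
      return false;
    }
  return true;
}
```

A caller would check the flag after translation and keep the original, untransformed code when it is set.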

Thanks for your insistence ;-)

Sebastian


Re: [Patch, i386, testsuite] Fix for PR49547, new testcases for lzcnt instruction

2011-07-26 Thread Uros Bizjak
On Tue, Jul 26, 2011 at 5:40 PM, Mike Stump  wrote:
> On Jul 26, 2011, at 7:50 AM, Kirill Yukhin  wrote:
>> I've also prepared a bunch of tests for lzcnt instruction generation.
>
>> /ChangeLog entry:
>> 2011-07-26  Kirill Yukhin  
>>
>>    * lib/target-supports.exp (check_lzcnt_hw_available): New.
>>    (check_effective_target_lzcnt_runtime): Likewise.
>>    (check_effective_target_lzcnt): Likewise.
>
> For target supports, could you add an x86 to the name somewhere...

Actually, please implement LZCNT detection in gcc.target/i386
directory, since it is used only in this directory. You will need only
check_effective_target_lzcnt in gcc.target/i386/i386.exp, runtime
detection should be handled in lzcnt-check.h. Oh, bonus points if you
implement avx-os-support.h (to check OS support with xgetbv insn) and
use it in all *-check.h that check for AVX runtime (including your new
lzcnt-check.h).

Other than that, your new option should be tested in
gcc.target/i386/sse-12.c, gcc.target/i386/sse-13.c,
gcc.target/i386/sse-14.c, g++.dg/other/i386-2.C and
g++.dg/other/i386-3.C. See these testcases for further details.

Uros.


Merge alignments from coalesced SSA pointers

2011-07-26 Thread Michael Matz
On Tue, 26 Jul 2011, Michael Matz wrote:

> Hi,
> 
> On Tue, 26 Jul 2011, Ulrich Weigand wrote:
> 
> > > Well, REG_ATTRS->decl is again a decl, not an SSA name.  I suppose
> > > we'd need to pick a conservative REGNO_POINTER_ALIGN during
> > > expansion of the SSA name partition - iterate over all of them in the
> > > partition and pick the lowest alignment.  Or even adjust the partitioning
> > > to avoid losing alignment information that way.
> > 
> > That would certainly be helpful.
> 
> I'm working on a patch for that, stay tuned.

Like so.  Currently regstrapping on x86_64-linux.  Could you check whether it
helps spu?
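
The merging idea discussed above (pick the lowest alignment among all pointers that share one partition) can be sketched in standalone C. This is a simplified model under stated assumptions, not the cfgexpand.c code: struct ptr_name and partition_align are hypothetical, and alignments are plain bit counts.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical model: each SSA pointer name has a partition number and
   a known alignment in bits.  Names in one partition share a pseudo.  */
struct ptr_name
{
  int partition;
  unsigned int align;
};

/* Return the alignment that is safe for every name coalesced into
   PARTITION, i.e. the smallest one, or 0 if the partition is empty.  */
static unsigned int
partition_align (const struct ptr_name *names, size_t n, int partition)
{
  unsigned int result = 0;
  size_t i;

  assert (names != NULL || n == 0);
  for (i = 0; i < n; i++)
    if (names[i].partition == partition
        && (result == 0 || names[i].align < result))
      result = names[i].align;
  return result;
}
```

Taking the minimum is the conservative choice: the shared register may hold any of the coalesced pointers, so only the weakest guarantee holds for all of them.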

Okay for trunk?

Ciao,
Michael.
* cfgexpand.c (expand_one_register_var): Use get_pointer_alignment.
(gimple_expand_cfg): Merge alignment info for coalesced pointer
SSA names.

Index: cfgexpand.c
===
--- cfgexpand.c (revision 176790)
+++ cfgexpand.c (working copy)
@@ -909,7 +909,7 @@ expand_one_register_var (tree var)
 mark_user_reg (x);
 
   if (POINTER_TYPE_P (type))
-mark_reg_pointer (x, TYPE_ALIGN (TREE_TYPE (type)));
+mark_reg_pointer (x, get_pointer_alignment (var, BIGGEST_ALIGNMENT));
 }
 
 /* A subroutine of expand_one_var.  Called to assign rtl to a VAR_DECL that
@@ -4265,6 +4265,25 @@ gimple_expand_cfg (void)
}
 }
 
+  /* If we have a class containing differently aligned pointers
+ we need to merge those into the corresponding RTL pointer
+ alignment.  */
+  for (i = 1; i < num_ssa_names; i++)
+{
+  tree name = ssa_name (i);
+  int part;
+  rtx r;
+
+  if (!name || !POINTER_TYPE_P (TREE_TYPE (name)))
+   continue;
+  part = var_to_partition (SA.map, name);
+  if (part == NO_PARTITION)
+   continue;
+  r = SA.partition_to_pseudo[part];
+  if (REG_P (r))
+   mark_reg_pointer (r, get_pointer_alignment (name, BIGGEST_ALIGNMENT));
+}
+
   /* If this function is `main', emit a call to `__main'
  to run global initializers, etc.  */
   if (DECL_NAME (current_function_decl)


Re: Patch committed: Fix demangler crash

2011-07-26 Thread Ian Lance Taylor
"H.J. Lu"  writes:

> On Tue, Jul 26, 2011 at 7:30 AM, Ian Lance Taylor  wrote:
>> binutils PR 13030 reports a demangler crash on the symbol
>>    _ZSt10_ConstructI10CellBorderIS0_EEvPT_DpOT0_
>>
>> As far as I can tell, this symbol is invalid.  The final T0_ refers to
>> template argument 1, but this zero-based index has no referent since the
>> template only has one parameter.  This of course suggests a compiler
>> bug.  CC'ing Jason because this involves template packs which I haven't
>> looked into very much.
>>
>> I committed this patch to avoid the crash in the demangler.
>>
>> Ian
>>
>>
>> 2011-07-26  Ian Lance Taylor  
>>
>>        * cp-demangle.c (d_print_init): Initialize pack_index field.
>>        (d_print_comp): Check for NULL template argument.
>>        * testsuite/demangle-expected: Add test case.
>>
>
> Could you please also check it into binutils?

It should be brought over automatically by DJ's libiberty merge.
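
For readers following the quoted analysis, here is a small standalone C sketch of the index rule it relies on: "T_" names template parameter 0 and "T<n>_" names parameter n + 1, so "T0_" needs a second argument. The helper names are hypothetical and this is not the cp-demangle.c code; very long digit strings are not guarded against overflow.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>

/* Parse a template parameter reference at S: "T_" is index 0 and
   "T<n>_" is index n + 1.  Return the index, or -1 if S is not a
   well-formed reference.  */
static long
template_param_index (const char *s)
{
  long n = 0;

  assert (s != NULL);
  if (*s++ != 'T')
    return -1;
  if (*s == '_')
    return 0;
  if (!isdigit ((unsigned char) *s))
    return -1;
  while (isdigit ((unsigned char) *s))
    n = n * 10 + (*s++ - '0');
  return *s == '_' ? n + 1 : -1;
}

/* Return nonzero if the reference at S names one of NARGS arguments.  */
static int
template_param_valid (const char *s, long nargs)
{
  long idx = template_param_index (s);
  return idx >= 0 && idx < nargs;
}
```

With one template argument, "T_" is accepted and "T0_" is rejected, which is the situation described for the reported symbol.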

Ian


Re: Patch committed: Fix demangler crash

2011-07-26 Thread H.J. Lu
On Tue, Jul 26, 2011 at 7:30 AM, Ian Lance Taylor  wrote:
> binutils PR 13030 reports a demangler crash on the symbol
>    _ZSt10_ConstructI10CellBorderIS0_EEvPT_DpOT0_
>
> As far as I can tell, this symbol is invalid.  The final T0_ refers to
> template argument 1, but this zero-based index has no referent since the
> template only has one parameter.  This of course suggests a compiler
> bug.  CC'ing Jason because this involves template packs which I haven't
> looked into very much.
>
> I committed this patch to avoid the crash in the demangler.
>
> Ian
>
>
> 2011-07-26  Ian Lance Taylor  
>
>        * cp-demangle.c (d_print_init): Initialize pack_index field.
>        (d_print_comp): Check for NULL template argument.
>        * testsuite/demangle-expected: Add test case.
>

Could you please also check it into binutils?

Thanks.


-- 
H.J.


Re: [RFC] Replace some bitmaps with HARD_REG_SETs - second version

2011-07-26 Thread Dimitrios Apostolou
Bug found at last, it's in the following hunk, the ampersand in 
&exit_block_uses is wrong... :-@




@@ -3951,7 +3949,7 @@ df_get_exit_block_use_set (bitmap exit_b
 {
   rtx tmp = EH_RETURN_STACKADJ_RTX;
   if (tmp && REG_P (tmp))
-   df_mark_reg (tmp, exit_block_uses);
+   df_mark_reg (tmp, &exit_block_uses);
 }
 #endif

@@ -3961,12 +3959,12 @@ df_get_exit_block_use_set (bitmap exit_b
 {
   rtx tmp = EH_RETURN_HANDLER_RTX;
   if (tmp && REG_P (tmp))
-   df_mark_reg (tmp, exit_block_uses);
+   df_mark_reg (tmp, &exit_block_uses);
 }
 #endif

   /* Mark function return value.  */
-  diddle_return_value (df_mark_reg, (void*) exit_block_uses);
+  diddle_return_value (df_mark_reg, (void*) &exit_block_uses);
 }



Thanks to everyone for looking at my code; it seems to be working now,
failing only on some mudflap tests that I've been told to ignore. Expect
a patch repost soon :-)
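
This kind of mistake is easy to make because the callback takes a void *. A minimal sketch in plain C, assuming the set already arrives as a pointer (as the hunk suggests); struct reg_set, mark_reg, and collect_uses are hypothetical stand-ins, not the df-scan.c code:

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical register set, standing in for the real bitmap type.  */
struct reg_set
{
  unsigned long bits;
};

/* Callback in the style of df_mark_reg: DATA must point at the set.  */
static void
mark_reg (unsigned int regno, void *data)
{
  struct reg_set *set = (struct reg_set *) data;

  assert (regno < 8 * sizeof set->bits);
  set->bits |= 1UL << regno;
}

/* SET is already a pointer, so it is passed on unchanged.  Writing
   "&set" here would hand the callback the address of the local pointer
   variable, and the conversion to void * hides the mistake from the
   compiler.  */
static void
collect_uses (struct reg_set *set)
{
  mark_reg (3, (void *) set);
}
```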


FWIW test failures in comparison to trunk are the following, but I'll 
ignore them:


FAIL: libmudflap.cth/pass39-frag.c (rerun 18) execution test
FAIL: libmudflap.cth/pass39-frag.c (rerun 18) output pattern test
FAIL: libmudflap.cth/pass39-frag.c (-static -DSTATIC) (rerun 10) execution test
FAIL: libmudflap.cth/pass39-frag.c (-static -DSTATIC) (rerun 10) output pattern test
FAIL: libmudflap.cth/pass39-frag.c (-O2) (rerun 7) execution test
FAIL: libmudflap.cth/pass39-frag.c (-O2) (rerun 7) output pattern test
FAIL: libmudflap.cth/pass39-frag.c (-O3) (rerun 5) execution test
FAIL: libmudflap.cth/pass39-frag.c (-O3) (rerun 5) output pattern test



Dimitris



Re: [PATCH] Fix PR47691: always run scev_const_prop before graphite

2011-07-26 Thread Richard Guenther
On Tue, 26 Jul 2011, Sebastian Pop wrote:

> On Tue, Jul 26, 2011 at 09:33, Richard Guenther  wrote:
> > On Tue, 26 Jul 2011, Sebastian Pop wrote:
> >
> >> Ping.
> >> Any opinions on this patch?
> >
> > Well, I don't think we should do this.  Why does the user disable
> > scev-const-prop when enabling graphite?  If he does so, fine - he
> > has to live with worse codegen.
> 
> Here is the patch.  Ok for trunk after regstrap?

Required?  Do you mean you generate wrong code in graphite if it is
not run?  What if it just fails to handle things or does not handle
them because things are too expensive?

Please make graphite more robust instead.

Thanks,
Richard.

Re: [PATCH] Fix PR47691: always run scev_const_prop before graphite

2011-07-26 Thread Sebastian Pop
On Tue, Jul 26, 2011 at 09:33, Richard Guenther  wrote:
> On Tue, 26 Jul 2011, Sebastian Pop wrote:
>
>> Ping.
>> Any opinions on this patch?
>
> Well, I don't think we should do this.  Why does the user disable
> scev-const-prop when enabling graphite?  If he does so, fine - he
> has to live with worse codegen.

Here is the patch.  Ok for trunk after regstrap?

Thanks,
Sebastian
From 5c1f59fa7e9c5fb5e5960975153cd040d51baab2 Mon Sep 17 00:00:00 2001
From: Sebastian Pop 
Date: Sat, 23 Jul 2011 23:29:30 -0500
Subject: [PATCH] Fix PR47691: do not run graphite if scev_const_prop has not run before

2011-07-23  Sebastian Pop  

	PR middle-end/47691
	* graphite.c (graphite_initialize): Return false when
	flag_tree_scev_cprop is not set.

	* gfortran.dg/graphite/id-pr47691.f: New.
---
 gcc/ChangeLog   |6 ++
 gcc/graphite.c  |4 
 gcc/testsuite/ChangeLog |5 +
 gcc/testsuite/gfortran.dg/graphite/id-pr47691.f |7 +++
 4 files changed, 22 insertions(+), 0 deletions(-)
 create mode 100644 gcc/testsuite/gfortran.dg/graphite/id-pr47691.f

diff --git a/gcc/ChangeLog b/gcc/ChangeLog
index 9cfa21b..abb5f77 100644
--- a/gcc/ChangeLog
+++ b/gcc/ChangeLog
@@ -1,3 +1,9 @@
+2011-07-23  Sebastian Pop  
+
+	PR middle-end/47691
+	* graphite.c (graphite_initialize): Return false when
+	flag_tree_scev_cprop is not set.
+
 2011-07-21  Sebastian Pop  
 
 	PR middle-end/47654
diff --git a/gcc/graphite.c b/gcc/graphite.c
index b013447..caba926 100644
--- a/gcc/graphite.c
+++ b/gcc/graphite.c
@@ -191,6 +191,10 @@ graphite_initialize (void)
   int ppl_initialized;
 
   if (number_of_loops () <= 1
+
+  /* scev constant propagation is required for Graphite.  */
+  || !flag_tree_scev_cprop
+
   /* FIXME: This limit on the number of basic blocks of a function
 	 should be removed when the SCOP detection is faster.  */
   || n_basic_blocks > PARAM_VALUE (PARAM_GRAPHITE_MAX_BBS_PER_FUNCTION))
diff --git a/gcc/testsuite/ChangeLog b/gcc/testsuite/ChangeLog
index a63b647..5f9b79d 100644
--- a/gcc/testsuite/ChangeLog
+++ b/gcc/testsuite/ChangeLog
@@ -1,3 +1,8 @@
+2011-07-23  Sebastian Pop  
+
+	PR middle-end/47691
+	* gfortran.dg/graphite/id-pr47691.f: New.
+
 2011-07-21  Sebastian Pop  
 
 	PR middle-end/47654
diff --git a/gcc/testsuite/gfortran.dg/graphite/id-pr47691.f b/gcc/testsuite/gfortran.dg/graphite/id-pr47691.f
new file mode 100644
index 000..0abbd55
--- /dev/null
+++ b/gcc/testsuite/gfortran.dg/graphite/id-pr47691.f
@@ -0,0 +1,7 @@
+! { dg-options "-O -fgraphite-identity -ffast-math -fno-tree-scev-cprop" }
+  dimension b(12,8)
+  do i=1,norb
+  end do
+  b(i,j) = 0
+  call rdrsym(b)
+  end
-- 
1.7.4.1



[PATCH] Fix profile estimation

2011-07-26 Thread Richard Guenther

This applies overflow heuristics to maybe_hot_frequency_p to make
sure not everything is thought to be cold (which happens when
building polyhedron AIR with -fwhole-program).

Bootstrapped and tested on x86_64-unknown-linux-gnu, approved by
Honza on IRC.
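
To see why the comparison matters, here is a simplified standalone model of the two frequency tests in the hunk below. It is a sketch, not predict.c: maybe_hot and its parameters are hypothetical, with entry_freq standing for the entry block frequency and fraction for the hot-BB parameter.

```c
#include <assert.h>
#include <stdbool.h>

/* Simplified model of the two frequency tests.  STRICT selects "<"
   (after the patch) or "<=" (before it) in the executed-once test.  */
static bool
maybe_hot (int freq, int entry_freq, int fraction, bool executed_once,
           bool strict)
{
  int limit = entry_freq * 2 / 3;

  assert (fraction > 0);
  if (executed_once && (strict ? freq < limit : freq <= limit))
    return false;
  if (freq < entry_freq / fraction)
    return false;
  return true;
}
```

With an entry frequency of zero, the "<=" form classifies a block of frequency zero as cold (0 <= 0), while the "<" form lets it fall through both tests and count as hot.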

Richard.

2011-07-26  Richard Guenther  

* predict.c (maybe_hot_frequency_p): Make sure a zero entry-block
frequency makes everything hot.

Index: gcc/predict.c
===
--- gcc/predict.c   (revision 176789)
+++ gcc/predict.c   (working copy)
@@ -124,7 +124,7 @@ maybe_hot_frequency_p (int freq)
   if (profile_status == PROFILE_ABSENT)
 return true;
   if (node->frequency == NODE_FREQUENCY_EXECUTED_ONCE
-  && freq <= (ENTRY_BLOCK_PTR->frequency * 2 / 3))
+  && freq < (ENTRY_BLOCK_PTR->frequency * 2 / 3))
 return false;
   if (freq < ENTRY_BLOCK_PTR->frequency / PARAM_VALUE (HOT_BB_FREQUENCY_FRACTION))
 return false;


Re: Use of vector instructions in memmov/memset expanding

2011-07-26 Thread Michael Zolotukhin
Any updates/questions on this?

On 18 July 2011 15:00, Michael Zolotukhin
 wrote:
> Here is a summary - probably, it doesn't cover every single piece in
> the patch, but I tried to describe the major changes. I hope this will
> help you a bit - and of course I'll answer your further questions if
> they appear.
>
> The changes could be logically divided into two parts (though, these
> parts have something in common).
> The first part is changes in target-independent part, in functions
> move_by_pieces() and store_by_pieces() - mostly located in expr.c.
> The second part touches ix86_expand_movmem() and ix86_expand_setmem()
> - mostly located in config/i386/i386.c.
>
> Changes in i386.c (target-dependent part):
> 1) Strategies for cases with known and unknown alignment are separated
> from each other.
> When alignment is known at compile time, we could generate optimized
> code without libcalls.
> When it's unknown, we sometimes could create runtime-checks to reach
> desired alignment, but not always.
> Strategies for atom and generic_32, generic_64 were chosen according
> to set of experiments, strategies in other
> cost models are unchanged (strategies for unknown alignment are copied
> from existing strategies).
> 2) unrolled_loop algorithm was modified - now it uses SSE move-modes,
> if they're available.
> 3) As the amount of data moved in one iteration greatly increased,
> epilogues became bigger, so some changes were needed in epilogue
> generation. In some cases a special loop (not unrolled) is generated
> in the epilogue to avoid slow byte-by-byte copying (the changes in
> expand_set_or_movmem_via_loop() and the introduction of
> expand_set_or_movmem_via_loop_with_iter() are made for these cases).
> 4) As bigger alignment might be needed than previously, prologue
> generation was also modified.
>
> Changes in expr.c (target-independent part):
> There are two possible strategies now: use of aligned and unaligned
> moves. For each of them a cost model was implemented and the choice is
> made according to the cost of each option. Move-mode choice is made by
> functions widest_mode_for_unaligned_mov() and
> widest_mode_for_aligned_mov().
> Cost estimation is implemented in functions compute_aligned_cost() and
> compute_unaligned_cost().
> Choice between these two strategies and the generation of moves
> themselves are in function move_by_pieces().
>
> Function store_by_pieces() calls set_by_pieces_1() instead of
> store_by_pieces_1(), if this is memset-case (I needed to introduce
> set_by_pieces_1 to separate memset-case from others -
> store_by_pieces_1 is sometimes called for strcpy and some other
> functions, not only for memset).
>
> Set_by_pieces_1() estimates costs of aligned and unaligned strategies
> (as in move_by_pieces() ) and generates moves for memset. Single move
> is generated via
> generate_move_with_mode(). If it's called first time, a promoted value
> (register, filled with one-byte value of memset argument) is generated
> - later calls reuse this value.
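
The "promoted value" described above can be illustrated in portable C. This is only a sketch of the general technique with a 64-bit word, assuming nothing about the patch's actual expanders; promote_byte and set_by_words are hypothetical names.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Replicate the low byte of C into every byte of a 64-bit word.  */
static uint64_t
promote_byte (int c)
{
  return (uint64_t) (unsigned char) c * UINT64_C (0x0101010101010101);
}

/* Fill N bytes at DST with C: whole words from the promoted value,
   then a byte loop for the tail that does not fill a word.  */
static void
set_by_words (unsigned char *dst, int c, size_t n)
{
  uint64_t word = promote_byte (c);   /* computed once, reused below */

  assert (dst != NULL || n == 0);
  while (n >= sizeof word)
    {
      memcpy (dst, &word, sizeof word);   /* store that tolerates any alignment */
      dst += sizeof word;
      n -= sizeof word;
    }
  while (n-- > 0)
    *dst++ = (unsigned char) c;
}
```

The byte loop at the end plays the role of the epilogue: the wider the main store, the more tail bytes it may have to handle.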
>
> Changes in MD-files:
> For generation of promoted values, I made some changes in
> promote_duplicated_reg() and promote_duplicated_reg_to_size(). Expands
> for vec_dup4si and vec_dupv2di were introduced for this too (these
> expands differ from corresponding define_insns - existing define_insn
> work only with registers, while new expands could process memory
> operand as well).
>
> Some code was added to allow generation of MOVQ (with SSE-registers)
> - such moves aren't usual ones, because they use only half of
> xmm-register.
> There was a need to generate such moves explicitly, so I added a
> simple expand to sse.md.
>
>
> On 16 July 2011 03:24, Jan Hubicka  wrote:
>>> > New algorithm for move-mode selection is implemented for move_by_pieces,
>>> > store_by_pieces.
>>> > x86-specific ix86_expand_movmem and ix86_expand_setmem are also changed in
>>> > similar way, x86 cost-models parameters are slightly changed to support
>>> > this. This implementation checks if array's alignment is known at compile
>>> > time and chooses expanding algorithm and move-mode according to it.
>>
>> Can you give a summary of the changes you made?  It would make it a lot
>> easier to review if it was broken up into the generic changes (with a
>> rationale for why they are needed) and i386 backend changes that I could
>> review then.
>>
>> From a first pass through the patch I don't quite see the need for e.g.
>> adding new move patterns when we can output all kinds of SSE moves already.
>> I will look more into the patch to see if I can come up with useful comments.
>>
>> Honza
>>
>


Re: [Patch, i386, testsuite] Fix for PR49547, new testcases for lzcnt instruction

2011-07-26 Thread Mike Stump
On Jul 26, 2011, at 7:50 AM, Kirill Yukhin  wrote:
> I've also prepared a bunch of tests for lzcnt instruction generation.

> /ChangeLog entry:
> 2011-07-26  Kirill Yukhin  
> 
>* lib/target-supports.exp (check_lzcnt_hw_available): New.
>(check_effective_target_lzcnt_runtime): Likewise.
>(check_effective_target_lzcnt): Likewise.

For target supports, could you add an x86 to the name somewhere...



Re: [Patch,AVR]: Fix PR29560 (map 16-bit shift to 8-bit)

2011-07-26 Thread Richard Henderson
On 07/26/2011 02:48 AM, Georg-Johann Lay wrote:
> Moreover, the original peep2 is not fully correct because it
> maps a 16-bit shift to a 8-bit one.  The correct mapping is
> 
> (set (match_dup 2)
>  (subreg:QI (ashift:HI (zero_extend:HI (match_dup 2))
>(match_dup 1))
> 0))
> 
> instead of
> 
> (set (match_dup 2)
>  (ashift:QI (match_dup 2)
> (match_dup 1)))
> 
> 
> I don't think it makes a difference that late in the
> compilation process, yet I prefer correct semantics.

Why do you think the semantics were wrong?  As long as you
don't define SHIFT_COUNT_TRUNCATED, these are equivalent.
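
One way to check that claim is an exhaustive comparison in plain C. The sketch below is a model under stated assumptions, not RTL semantics: shl8 treats counts of 8 or more as shifting every bit out, shl8_truncating models a hypothetical machine that masks the count to three bits, and shl16_low is the low byte of the 16-bit shift of the zero-extended value.

```c
#include <assert.h>
#include <stdint.h>

/* 8-bit shift as a shifter that does not truncate its count: counts
   of 8 or more shift every bit out.  */
static uint8_t
shl8 (uint8_t x, unsigned int count)
{
  assert (count < 16);
  return count >= 8 ? 0 : (uint8_t) (x << count);
}

/* 8-bit shift on a hypothetical machine that truncates the count to
   three bits first.  */
static uint8_t
shl8_truncating (uint8_t x, unsigned int count)
{
  assert (count < 16);
  return (uint8_t) (x << (count & 7));
}

/* Low byte of the 16-bit shift of the zero-extended value.  */
static uint8_t
shl16_low (uint8_t x, unsigned int count)
{
  assert (count < 16);
  return (uint8_t) ((uint16_t) ((uint16_t) x << count) & 0xff);
}

/* Count the (x, count) pairs on which SHIFT disagrees with shl16_low.  */
static unsigned int
count_mismatches (uint8_t (*shift) (uint8_t, unsigned int))
{
  unsigned int x, count, bad = 0;

  for (x = 0; x < 256; x++)
    for (count = 0; count < 16; count++)
      if (shift ((uint8_t) x, count) != shl16_low ((uint8_t) x, count))
        bad++;
  return bad;
}
```

In this model the non-truncating 8-bit shift never disagrees with the low byte of the 16-bit shift, while the truncating variant does, for example at a count of 8.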


r~


Re: cp-demangle.c regression

2011-07-26 Thread Ian Lance Taylor
"H.J. Lu"  writes:

> One of cp-demangle.c changes in June/July caused:
>
> http://gcc.gnu.org/bugzilla/show_bug.cgi?id=49852

Already fixed.  However, there may be a compiler bug in that the symbol
was generated in the first place.

Ian


Re: PR 45819 - possible fix?

2011-07-26 Thread DJ Delorie

> So don't lie to GCC then?  You specify
> 
> struct X { char c; int i; } __attribute__((packed)) x;
> 
> and expect that GCC knows x.i is aligned to 4 bytes!?

The actual header is much more complex than this trivial example.

It also fails with this example, where the port_status[] array *is*
obviously aligned, but the "packed" attribute *also* makes gcc think
the *structure* is misaligned, which is not the case:

struct ehci_regs {  
char x; 

short y;
 
char z; 

unsigned int port_status[0];

} __attribute__ ((packed)); 

The line that fails is the next one:

return *(volatile unsigned int *)status_reg;
   

The user has explicitly told gcc that the pointer is a valid
pointer-to-int.  How else can the user tell gcc it's wrong about
alignment?  I mean, without changing zillions of released kernel
header files?
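
The layout in question can be inspected with a few lines of C. This sketch assumes a GNU-compatible compiler (for the packed attribute and __alignof__) and a 4-byte-aligned unsigned int; it uses a C99 flexible array member in place of the zero-length array, and the helper functions are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Same member order as the example above.  */
struct ehci_regs
{
  char x;
  short y;
  char z;
  unsigned int port_status[];
} __attribute__ ((packed));

/* Byte offset of port_status from the start of the structure.  */
static size_t
port_status_offset (void)
{
  return offsetof (struct ehci_regs, port_status);
}

/* Alignment the compiler assumes for the structure as a whole.  */
static size_t
ehci_regs_alignment (void)
{
  return __alignof__ (struct ehci_regs);
}

/* Return nonzero if an object of this type placed at address ADDR has
   a naturally aligned port_status array.  */
static int
port_status_aligned_at (size_t addr)
{
  assert (addr % __alignof__ (struct ehci_regs) == 0);
  return (addr + port_status_offset ()) % __alignof__ (unsigned int) == 0;
}
```

The array sits at offset 4, so it is aligned whenever the structure itself is, but the packed attribute lowers the structure's assumed alignment to 1, which means the compiler cannot rely on that.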

> Or declare it a bug in volatile handling (which is, after all, not
> very well defined)

I've been working on "fixing" volatile to mean "do what I tell you"
but historically, "volatile" has had lots of leeway in gcc.

> and simply throw away any alignment information we have in that case
> (which would make it an expander bug I guess, unless the arm target does
> something special here).

Various targets choose to honor volatile over gcc's natural bitfield
access rules, and yes, we're doing it in the expander.  Other volatile
access rules could be applied there too.


PATCH: PR target/49853: [x32] PIC doesn't work with external symbol

2011-07-26 Thread H.J. Lu
Hi,

This patch fixes PIC with external symbol and updates
x86_64_immediate_operand/x86_64_zext_immediate_operand/x86_64_movabs_operand
for x32.

H.J.

2011-07-26  H.J. Lu  

PR target/49853
* config/i386/i386.c (ix86_expand_move): Call convert_to_mode
on legitimize_tls_address return if needed.  Allow ptr_mode for
symbolic operand with PIC.

* config/i386/predicates.md (x86_64_immediate_operand): Always
allow the offsetted memory references for TARGET_X32.
(x86_64_zext_immediate_operand): Likewise.
(x86_64_movabs_operand): Don't allow nonmemory_operand for
TARGET_X32.

diff --git a/gcc/config/i386/i386.c b/gcc/config/i386/i386.c
index 3668357..fd00667 100644
--- a/gcc/config/i386/i386.c
+++ b/gcc/config/i386/i386.c
@@ -14985,6 +14985,7 @@ void
 ix86_expand_move (enum machine_mode mode, rtx operands[])
 {
   rtx op0, op1;
+  rtx symbol1 = NULL;
   enum tls_model model;
 
   op0 = operands[0];
@@ -15012,27 +15013,35 @@ ix86_expand_move (enum machine_mode mode, rtx operands[])
 {
   rtx addend = XEXP (XEXP (op1, 0), 1);
   rtx symbol = XEXP (XEXP (op1, 0), 0);
-  rtx tmp = NULL;
 
   model = SYMBOL_REF_TLS_MODEL (symbol);
   if (model)
-   tmp = legitimize_tls_address (symbol, model, true);
+   symbol1 = legitimize_tls_address (symbol, model, true);
   else if (TARGET_DLLIMPORT_DECL_ATTRIBUTES
   && SYMBOL_REF_DLLIMPORT_P (symbol))
-   tmp = legitimize_dllimport_symbol (symbol, true);
+   symbol1 = legitimize_dllimport_symbol (symbol, true);
 
-  if (tmp)
+  if (symbol1)
{
- tmp = force_operand (tmp, NULL);
- tmp = expand_simple_binop (Pmode, PLUS, tmp, addend,
-op0, 1, OPTAB_DIRECT);
- if (tmp == op0)
+ symbol1 = force_operand (symbol1, NULL);
+ symbol1 = expand_simple_binop (Pmode, PLUS, symbol1, addend,
+op0, 1, OPTAB_DIRECT);
+ if (symbol1 == op0)
return;
}
 }
 
+  if (symbol1)
+{
+  if (GET_MODE (symbol1) != mode)
+   symbol1 = convert_to_mode (mode, symbol1, 1);
+  emit_insn (gen_rtx_SET (VOIDmode, op0, symbol1));
+  return;
+}
+
   if ((flag_pic || MACHOPIC_INDIRECT) 
-   && mode == Pmode && symbolic_operand (op1, Pmode))
+   && (mode == Pmode || mode == ptr_mode)
+   && symbolic_operand (op1, mode))
 {
   if (TARGET_MACHO && !TARGET_64BIT)
{
@@ -15073,13 +15082,15 @@ ix86_expand_move (enum machine_mode mode, rtx operands[])
   else
{
  if (MEM_P (op0))
-   op1 = force_reg (Pmode, op1);
- else if (!TARGET_64BIT || !x86_64_movabs_operand (op1, Pmode))
+   op1 = force_reg (mode, op1);
+ else if (!TARGET_64BIT || !x86_64_movabs_operand (op1, mode))
{
  rtx reg = can_create_pseudo_p () ? NULL_RTX : op0;
  op1 = legitimize_pic_address (op1, reg);
  if (op0 == op1)
return;
+ if (GET_MODE (op1) != mode)
+   op1 = convert_to_mode (mode, op1, 1);
}
}
 }
diff --git a/gcc/config/i386/predicates.md b/gcc/config/i386/predicates.md
index 0515519..7dc690a 100644
--- a/gcc/config/i386/predicates.md
+++ b/gcc/config/i386/predicates.md
@@ -197,8 +197,10 @@
  if ((ix86_cmodel == CM_SMALL
   || (ix86_cmodel == CM_MEDIUM
   && !SYMBOL_REF_FAR_ADDR_P (op1)))
- && offset < 16*1024*1024
- && trunc_int_for_mode (offset, SImode) == offset)
+ && (TARGET_X32
+ || (offset < 16*1024*1024
+ && (trunc_int_for_mode (offset, SImode)
+ == offset
return true;
  /* For CM_KERNEL we know that all object resist in the
 negative half of 32bits address space.  We may not
@@ -302,8 +304,11 @@
   || (ix86_cmodel == CM_MEDIUM
   && !SYMBOL_REF_FAR_ADDR_P (op1)))
  && CONST_INT_P (op2)
- && trunc_int_for_mode (INTVAL (op2), DImode) > -0x1
- && trunc_int_for_mode (INTVAL (op2), SImode) == INTVAL (op2))
+ && (TARGET_X32
+ || ((trunc_int_for_mode (INTVAL (op2), DImode)
+  > -0x1)
+ && (trunc_int_for_mode (INTVAL (op2), SImode)
+ == INTVAL (op2)
return true;
  /* ??? For the kernel, we may accept adjustment of
 -0x1000, since we know that it will just convert
@@ -393,7 +398,8 @@
 ;; Return true if OP is nonmemory operand acceptable by movabs patterns.
 (define_predicate "x86_64_movabs_operand"
   (if_then_else (not (and (match_test "TARGET_64BIT")
- (match_te

[Patch, i386, testsuite] Fix for PR49547, new testcases for lzcnt instruction

2011-07-26 Thread Kirill Yukhin
Hi,
I've prepared a patch for http://gcc.gnu.org/bugzilla/show_bug.cgi?id=49547

I've also prepared a bunch of tests for lzcnt instruction generation.
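
For reference, the behaviour such tests would compare against can be written as a portable C function. This is a hedged sketch, not one of the submitted testcases: lzcnt32_ref is a hypothetical name, and it returns the operand width (32) for a zero input, which is what the LZCNT instruction is defined to do.

```c
#include <assert.h>
#include <stdint.h>

/* Portable reference for a 32-bit leading-zero count that is defined
   for zero input, returning the operand width (32) in that case.  */
static unsigned int
lzcnt32_ref (uint32_t x)
{
  unsigned int n = 0;
  uint32_t mask = UINT32_C (1) << 31;

  /* Walk from the top bit down until a set bit is found.  */
  while (mask != 0 && (x & mask) == 0)
    {
      n++;
      mask >>= 1;
    }
  assert (n <= 32);
  return n;
}
```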

ChangeLog entry:
2011-07-26  Kirill Yukhin  

PR target/49547
* config/i386/abmintrin.h (head): Added check if __LZCNT__ is defined.
(__lzcnt32): Fixed name according to Spec.
* config/i386/bmiintrin.h (head): Updated year for Copyright.
(__lzcnt_u16): Removed.
(__lzcnt_u32): Removed.
(__lzcnt_u64): Likewise.
* config/i386/cpuid.h: New bit defined.
* config/i386/driver-i386.c (host_detect_local_cpu): Detect
LZCNT feature.
* config/i386/i386-c.c (ix86_target_macros_internal): Define
__LZCNT__ if needed.
* config/i386/i386.c (ix86_target_string): New entry to array.
(ix86_option_override_internal): Handling LZCNT option.
(ix86_valid_target_attribute_inner_p): Likewise.
(bdesc_args): built-in for LZCNT is extended to work under
another flag.
* config/i386/i386.h (TARGET_LZCNT): New.
(CLZ_DEFINED_VALUE_AT_ZERO): Updated flag name.
* config/i386/i386.md (clz2): Target fixed.
(clz2_lzcnt): Likewise.
* doc/invoke.texi: Added mention of -mlzcnt option.
* doc/extend.texi: Likewise.


testsuite/ChangeLog entry:
2011-07-26  Kirill Yukhin  

* lib/target-supports.exp (check_lzcnt_hw_available): New.
(check_effective_target_lzcnt_runtime): Likewise.
(check_effective_target_lzcnt): Likewise.
* gcc.target/i386/lzcnt-1.c: New test.
* gcc.target/i386/lzcnt-2.c: Likewise.
* gcc.target/i386/lzcnt-2a.c: Likewise.
* gcc.target/i386/lzcnt-3.c: New test.
* gcc.target/i386/lzcnt-4.c: Likewise.
* gcc.target/i386/lzcnt-4a.c: Likewise.
* gcc.target/i386/lzcnt-5.c: Likewise.
* gcc.target/i386/lzcnt-6.c: Likewise.
* gcc.target/i386/lzcnt-6a.c: Likewise.
* gcc.target/i386/lzcnt-check.h: New driver to run LZCNT-*
tests only if HW available.

Bootstrapped, make-check-ed. No new fails.
OK for trunk?

Thanks, K


lzcnt.gcc.patch
Description: Binary data


Re: [PATCH] Fix PR47594: Sign extend constants while translating to Graphite

2011-07-26 Thread Richard Guenther
On Tue, 26 Jul 2011, Sebastian Pop wrote:

> On Tue, Jul 26, 2011 at 09:07, Richard Guenther  wrote:
> >> > Randomly sign-extending stuff looks bogus to me.
> >> > Does graphite operate on infinite precision signed integers?  Or
> >> > does it operate on twos-complement fixed precision integers?
> >>
> >> Graphite represents constants using mpz_t.
> >
> > Not exactly an answer but I guess all mpz_t do have a sign and are
> > of arbitrary precision.  Thus it's wrong to change unsigned + -1U
> > to mpz_t + -1 unless you truncate to unsigneds precision after
> > doing that operation.  Do we properly handle this?
> 
> Graphite is not truncating after conversion of an unsigned expression to 
> mpz_t.
> 
> I still don't see how truncating -1U to its precision changes anything,
> could you explain?

Truncating -1 doesn't matter - it matters that if you perform any
unsigned arithmetic in arbitrary precision signed arithmetic that
you properly truncate after each operation to simulate unsigned
twos-complement wrapping semantics.  And if you did that you wouldn't
need to sign-extend -1U either.
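
The point can be made concrete with a small C model in which int64_t plays the role of the arbitrary-precision signed type. This is a sketch under that assumption, not Graphite code; truncate_unsigned and add_unsigned are hypothetical names.

```c
#include <assert.h>
#include <stdint.h>

/* Reduce a wide signed value modulo 2^BITS, which is what wrapping
   unsigned arithmetic of that precision does implicitly.  */
static int64_t
truncate_unsigned (int64_t value, unsigned int bits)
{
  int64_t modulus;

  assert (bits > 0 && bits < 62);
  modulus = (int64_t) 1 << bits;
  value %= modulus;
  return value < 0 ? value + modulus : value;
}

/* Add two BITS-wide unsigned values in wide signed arithmetic and
   truncate afterwards.  */
static int64_t
add_unsigned (int64_t a, int64_t b, unsigned int bits)
{
  return truncate_unsigned (a + b, bits);
}
```

With truncation after the operation, adding 4294967295 (that is, -1U) and adding -1 to a 32-bit unsigned value give the same result, so the sign extension is no longer needed; without truncation, neither form wraps correctly in all cases.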

Richard.

Re: [PATCH] Fix PR47691: always run scev_const_prop before graphite

2011-07-26 Thread Richard Guenther
On Tue, 26 Jul 2011, Sebastian Pop wrote:

> Ping.
> Any opinions on this patch?

Well, I don't think we should do this.  Why does the user disable
scev-const-prop when enabling graphite?  If he does so, fine - he
has to live with worse codegen.

Richard.

> Thanks,
> Sebastian
> 
> On Sat, Jul 23, 2011 at 23:40, Sebastian Pop  wrote:
> > This patch makes graphite run the scev_const_prop systematically even
> > when using -fno-tree-scev-cprop.  When scev_const_prop is not applied,
> > there exist close_phi nodes for the main induction variable, making it
> > impossible for graphite to distinguish between reductions and the IVs.
> > So the main IV is translated as if it were a reduction, i.e., using copies
> > into temporary arrays, and that makes the scev analysis impossible
> > during code generation.
> >
> > Bootstrapped and tested on amd64-linux.
> >
> > 2011-07-23  Sebastian Pop  
> >
> >        PR middle-end/47691
> >        * graphite.c (graphite_initialize): Call scev_const_prop when
> >        flag_tree_scev_cprop is not set.
> >
> >        * gfortran.dg/graphite/id-pr47691.f: New.
> > ---
> >  gcc/ChangeLog                                   |    6 ++
> >  gcc/graphite.c                                  |    3 +++
> >  gcc/testsuite/ChangeLog                         |    5 +
> >  gcc/testsuite/gfortran.dg/graphite/id-pr47691.f |    7 +++
> >  4 files changed, 21 insertions(+), 0 deletions(-)
> >  create mode 100644 gcc/testsuite/gfortran.dg/graphite/id-pr47691.f
> >
> > diff --git a/gcc/ChangeLog b/gcc/ChangeLog
> > index 9cfa21b..36347d6 100644
> > --- a/gcc/ChangeLog
> > +++ b/gcc/ChangeLog
> > @@ -1,3 +1,9 @@
> > +2011-07-23  Sebastian Pop  
> > +
> > +       PR middle-end/47691
> > +       * graphite.c (graphite_initialize): Call scev_const_prop when
> > +       flag_tree_scev_cprop is not set.
> > +
> >  2011-07-21  Sebastian Pop  
> >
> >        PR middle-end/47654
> > diff --git a/gcc/graphite.c b/gcc/graphite.c
> > index b013447..dfe9ca7 100644
> > --- a/gcc/graphite.c
> > +++ b/gcc/graphite.c
> > @@ -201,6 +201,9 @@ graphite_initialize (void)
> >       return false;
> >     }
> >
> > +  if (!flag_tree_scev_cprop)
> > +    scev_const_prop ();
> > +
> >   scev_reset ();
> >   recompute_all_dominators ();
> >   initialize_original_copy_tables ();
> > diff --git a/gcc/testsuite/ChangeLog b/gcc/testsuite/ChangeLog
> > index a63b647..5f9b79d 100644
> > --- a/gcc/testsuite/ChangeLog
> > +++ b/gcc/testsuite/ChangeLog
> > @@ -1,3 +1,8 @@
> > +2011-07-23  Sebastian Pop  
> > +
> > +       PR middle-end/47691
> > +       * gfortran.dg/graphite/id-pr47691.f: New.
> > +
> >  2011-07-21  Sebastian Pop  
> >
> >        PR middle-end/47654
> > diff --git a/gcc/testsuite/gfortran.dg/graphite/id-pr47691.f 
> > b/gcc/testsuite/gfortran.dg/graphite/id-pr47691.f
> > new file mode 100644
> > index 000..0abbd55
> > --- /dev/null
> > +++ b/gcc/testsuite/gfortran.dg/graphite/id-pr47691.f
> > @@ -0,0 +1,7 @@
> > +! { dg-options "-O -fgraphite-identity -ffast-math -fno-tree-scev-cprop" }
> > +      dimension b(12,8)
> > +      do i=1,norb
> > +      end do
> > +      b(i,j) = 0
> > +      call rdrsym(b)
> > +      end
> > --
> > 1.7.4.1
> >
> >
> 
> 

-- 
Richard Guenther 
Novell / SUSE Labs
SUSE LINUX Products GmbH - Nuernberg - AG Nuernberg - HRB 16746
GF: Jeff Hawn, Jennifer Guild, Felix Imendörffer

Patch committed: Fix demangler crash

2011-07-26 Thread Ian Lance Taylor
binutils PR 13030 reports a demangler crash on the symbol
_ZSt10_ConstructI10CellBorderIS0_EEvPT_DpOT0_

As far as I can tell, this symbol is invalid.  The final T0_ refers to
template argument 1, but this zero-based index has no referent since the
template only has one parameter.  This of course suggests a compiler
bug.  CC'ing Jason because this involves template packs which I haven't
looked into very much.

I committed this patch to avoid the crash in the demangler.

Ian
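
A minimal sketch of the failure mode and the guard, assuming a simplified
singly linked argument list.  The struct and function names below are
hypothetical stand-ins, not the libiberty data structures: looking up a
zero-based index past the end of the list yields NULL, and the caller has to
report an error instead of dereferencing it.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a template argument list; this is not
   the demangler's own representation.  */
struct arg_node
{
  const char *name;
  struct arg_node *next;
};

/* Return the zero-based INDEX-th argument, or NULL when the list
   has fewer than INDEX + 1 entries.  */
static const struct arg_node *
index_argument (const struct arg_node *list, int index)
{
  assert (index >= 0);
  while (list != NULL && index > 0)
    {
      list = list->next;
      index--;
    }
  return list;
}

/* Print-side helper: return the argument's name, or NULL to signal
   an error when the reference has no referent.  */
static const char *
argument_name (const struct arg_node *list, int index)
{
  const struct arg_node *a = index_argument (list, index);
  if (a == NULL)
    return NULL;   /* the reference points past the argument list */
  return a->name;
}
```

With a one-element list, index 0 resolves and index 1 (the case T0_ asks
for) is reported as an error rather than crashing.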


2011-07-26  Ian Lance Taylor  

* cp-demangle.c (d_print_init): Initialize pack_index field.
(d_print_comp): Check for NULL template argument.
* testsuite/demangle-expected: Add test case.


Index: testsuite/demangle-expected
===
--- testsuite/demangle-expected	(revision 176790)
+++ testsuite/demangle-expected	(working copy)
@@ -4010,6 +4010,12 @@ K<1, &S::m>::f()
 _ZN1KILi1EXadL_ZN1S1m1fEv
 K<1, &S::m>::f()
 #
+# Used to crash -- binutils PR 13030.
+--format=gnu-v3
+_ZSt10_ConstructI10CellBorderIS0_EEvPT_DpOT0_
+_ZSt10_ConstructI10CellBorderIS0_EEvPT_DpOT0_
+_ZSt10_ConstructI10CellBorderIS0_EEvPT_DpOT0_
+#
 # Ada (GNAT) tests.
 #
 # Simple test.
Index: cp-demangle.c
===
--- cp-demangle.c	(revision 176790)
+++ cp-demangle.c	(working copy)
@@ -1,5 +1,5 @@
 /* Demangler for g++ V3 ABI.
-   Copyright (C) 2003, 2004, 2005, 2006, 2007, 2008, 2009, 2010
+   Copyright (C) 2003, 2004, 2005, 2006, 2007, 2008, 2009, 2010, 2011
Free Software Foundation, Inc.
Written by Ian Lance Taylor .
 
@@ -3306,6 +3306,7 @@ d_print_init (struct d_print_info *dpi, 
   dpi->last_char = '\0';
   dpi->templates = NULL;
   dpi->modifiers = NULL;
+  dpi->pack_index = 0;
   dpi->flush_count = 0;
 
   dpi->callback = callback;
@@ -3893,6 +3894,13 @@ d_print_comp (struct d_print_info *dpi, 
 	struct demangle_component *a = d_lookup_template_argument (dpi, sub);
 	if (a && a->type == DEMANGLE_COMPONENT_TEMPLATE_ARGLIST)
 	  a = d_index_template_argument (a, dpi->pack_index);
+
+	if (a == NULL)
+	  {
+		d_print_error (dpi);
+		return;
+	  }
+
 	sub = a;
 	  }
 


Re: [PATCH] Fix PR47691: always run scev_const_prop before graphite

2011-07-26 Thread Sebastian Pop
Ping.
Any opinions on this patch?

Thanks,
Sebastian

On Sat, Jul 23, 2011 at 23:40, Sebastian Pop  wrote:
> This patch makes graphite run the scev_const_prop systematically even
> when using -fno-tree-scev-cprop.  When scev_const_prop is not applied,
> there exist close_phi nodes for the main induction variable, making it
> impossible for graphite to distinguish between reductions and the IVs.
> So the main IV is translated as if it were a reduction, i.e., using copies
> into temporary arrays, and that makes the scev analysis impossible
> during code generation.
>
> Bootstrapped and tested on amd64-linux.
>
> 2011-07-23  Sebastian Pop  
>
>        PR middle-end/47691
>        * graphite.c (graphite_initialize): Call scev_const_prop when
>        flag_tree_scev_cprop is not set.
>
>        * gfortran.dg/graphite/id-pr47691.f: New.
> ---
>  gcc/ChangeLog                                   |    6 ++
>  gcc/graphite.c                                  |    3 +++
>  gcc/testsuite/ChangeLog                         |    5 +
>  gcc/testsuite/gfortran.dg/graphite/id-pr47691.f |    7 +++
>  4 files changed, 21 insertions(+), 0 deletions(-)
>  create mode 100644 gcc/testsuite/gfortran.dg/graphite/id-pr47691.f
>
> diff --git a/gcc/ChangeLog b/gcc/ChangeLog
> index 9cfa21b..36347d6 100644
> --- a/gcc/ChangeLog
> +++ b/gcc/ChangeLog
> @@ -1,3 +1,9 @@
> +2011-07-23  Sebastian Pop  
> +
> +       PR middle-end/47691
> +       * graphite.c (graphite_initialize): Call scev_const_prop when
> +       flag_tree_scev_cprop is not set.
> +
>  2011-07-21  Sebastian Pop  
>
>        PR middle-end/47654
> diff --git a/gcc/graphite.c b/gcc/graphite.c
> index b013447..dfe9ca7 100644
> --- a/gcc/graphite.c
> +++ b/gcc/graphite.c
> @@ -201,6 +201,9 @@ graphite_initialize (void)
>       return false;
>     }
>
> +  if (!flag_tree_scev_cprop)
> +    scev_const_prop ();
> +
>   scev_reset ();
>   recompute_all_dominators ();
>   initialize_original_copy_tables ();
> diff --git a/gcc/testsuite/ChangeLog b/gcc/testsuite/ChangeLog
> index a63b647..5f9b79d 100644
> --- a/gcc/testsuite/ChangeLog
> +++ b/gcc/testsuite/ChangeLog
> @@ -1,3 +1,8 @@
> +2011-07-23  Sebastian Pop  
> +
> +       PR middle-end/47691
> +       * gfortran.dg/graphite/id-pr47691.f: New.
> +
>  2011-07-21  Sebastian Pop  
>
>        PR middle-end/47654
> diff --git a/gcc/testsuite/gfortran.dg/graphite/id-pr47691.f 
> b/gcc/testsuite/gfortran.dg/graphite/id-pr47691.f
> new file mode 100644
> index 000..0abbd55
> --- /dev/null
> +++ b/gcc/testsuite/gfortran.dg/graphite/id-pr47691.f
> @@ -0,0 +1,7 @@
> +! { dg-options "-O -fgraphite-identity -ffast-math -fno-tree-scev-cprop" }
> +      dimension b(12,8)
> +      do i=1,norb
> +      end do
> +      b(i,j) = 0
> +      call rdrsym(b)
> +      end
> --
> 1.7.4.1
>
>


Re: [PATCH] Fix PR47594: Sign extend constants while translating to Graphite

2011-07-26 Thread Sebastian Pop
On Tue, Jul 26, 2011 at 09:07, Richard Guenther  wrote:
>> > Randomly sign-extending stuff looks bogus to me.
>> > Does graphite operate on infinite precision signed integers?  Or
>> > does it operate on twos-complement fixed precision integers?
>>
>> Graphite represents constants using mpz_t.
>
> Not exactly an answer but I guess all mpz_t do have a sign and are
> of arbitrary precision.  Thus it's wrong to change unsigned + -1U
> to mpz_t + -1 unless you truncate to unsigneds precision after
> doing that operation.  Do we properly handle this?

Graphite is not truncating after conversion of an unsigned expression to mpz_t.

I still don't see how truncating -1U to its precision changes anything,
could you explain?

Thanks,
Sebastian


Re: [PATCH] Fix PR47594: Sign extend constants while translating to Graphite

2011-07-26 Thread Richard Guenther
On Tue, 26 Jul 2011, Sebastian Pop wrote:

> On Tue, Jul 26, 2011 at 03:22, Richard Guenther  wrote:
> > On Mon, 25 Jul 2011, Sebastian Pop wrote:
> >
> >> "Bug 47594 - gfortran.dg/vect/vect-5.f90 execution test fails when
> >> compiled with -O2 -fgraphite-identity"
> >>
> >> The problem is due to the fact that Graphite generates this loop:
> >>
> >>     for (scat_3=0;scat_3<=4294967295*scat_1+T_51-1;scat_3++) {
> >>       S6(scat_1,scat_3);
> >>     }
> >>
> >> that has a "-1" encoded as an unsigned "4294967295".  This constant
> >> comes from the computation of the number of iterations "M - I" of
> >> the inner loop:
> >>
> >>         do I = 1, N
> >>           do J = I, M
> >>             A(J,2) = B(J)
> >>           end do
> >>         end do
> >>
> >> The patch fixes the problem by sign-extending the constants for the
> >> step of a chain of recurrence in scan_tree_for_params_right_scev.
> >>
> >> The same patter could occur for multiplication by a scalar, like in
> >> "-1 * N" and so the patch also fixes these cases in
> >> scan_tree_for_params.
> >
> > That certainly feels odd (again).  How does it end up being unsigned
> > in the first place?
> 
> We got this expression from niter.  niter analysis turns all expressions
> into unsigned types before starting computations.  I tried to see if we
> could improve niter, but that would be a major work.  I also thought
> about using PPL or ISL to implement niter for graphite.

Hmm, I see (I suppose to avoid introducing undefined overflow).

> > Randomly sign-extending stuff looks bogus to me.
> > Does graphite operate on infinite precision signed integers?  Or
> > does it operate on twos-complement fixed precision integers?
> 
> Graphite represents constants using mpz_t.

Not exactly an answer but I guess all mpz_t do have a sign and are
of arbitrary precision.  Thus it's wrong to change unsigned + -1U
to mpz_t + -1 unless you truncate to unsigneds precision after
doing that operation.  Do we properly handle this?

Richard.

Re: [PATCH] Fix PR47594: Sign extend constants while translating to Graphite

2011-07-26 Thread Sebastian Pop
On Tue, Jul 26, 2011 at 03:22, Richard Guenther  wrote:
> On Mon, 25 Jul 2011, Sebastian Pop wrote:
>
>> "Bug 47594 - gfortran.dg/vect/vect-5.f90 execution test fails when
>> compiled with -O2 -fgraphite-identity"
>>
>> The problem is due to the fact that Graphite generates this loop:
>>
>>     for (scat_3=0;scat_3<=4294967295*scat_1+T_51-1;scat_3++) {
>>       S6(scat_1,scat_3);
>>     }
>>
>> that has a "-1" encoded as an unsigned "4294967295".  This constant
>> comes from the computation of the number of iterations "M - I" of
>> the inner loop:
>>
>>         do I = 1, N
>>           do J = I, M
>>             A(J,2) = B(J)
>>           end do
>>         end do
>>
>> The patch fixes the problem by sign-extending the constants for the
>> step of a chain of recurrence in scan_tree_for_params_right_scev.
>>
>> The same patter could occur for multiplication by a scalar, like in
>> "-1 * N" and so the patch also fixes these cases in
>> scan_tree_for_params.
>
> That certainly feels odd (again).  How does it end up being unsigned
> in the first place?

We got this expression from niter.  niter analysis turns all expressions
into unsigned types before starting computations.  I tried to see if we
could improve niter, but that would be a major work.  I also thought
about using PPL or ISL to implement niter for graphite.

> Randomly sign-extending stuff looks bogus to me.
> Does graphite operate on infinite precision signed integers?  Or
> does it operate on twos-complement fixed precision integers?

Graphite represents constants using mpz_t.

Sebastian
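
To make the symptom concrete, a small C sketch, assuming a 32-bit unsigned
source type and int64_t in place of mpz_t.  The function names are
hypothetical.  It only shows how the loop bound changes with the reading of
the constant; it does not settle whether sign-extending is the right fix.

```c
#include <assert.h>
#include <stdint.h>

/* Reinterpret a 32-bit unsigned constant as the signed number with
   the same bit pattern, widened to 64 bits.  */
static int64_t
sign_extend_u32 (uint32_t c)
{
  if (c & UINT32_C (0x80000000))
    return (int64_t) c - ((int64_t) 1 << 32);
  return (int64_t) c;
}

/* Upper bound of the generated inner loop, COEFF*scat_1 + T - 1,
   evaluated in wide arithmetic without any truncation.  */
static int64_t
inner_bound (int64_t coeff, int64_t scat_1, int64_t t)
{
  assert (scat_1 >= 0);
  return coeff * scat_1 + t - 1;
}
```

For scat_1 = 3 and T = 10, the coefficient 4294967295 gives a bound of
12884901894, while the sign-extended coefficient -1 gives the intended 6.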


cp-demangle.c regression

2011-07-26 Thread H.J. Lu
Hi,

One of the cp-demangle.c changes in June/July caused:

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=49852

-- 
H.J.


Re: [patch] Fix PR tree-optimization/49771

2011-07-26 Thread Michael Matz
Hi,

On Tue, 26 Jul 2011, Ulrich Weigand wrote:

> > Well, REG_ATTRS->decl is again a decl, not an SSA name.  I suppose
> > we'd need to pick a conservative REGNO_POINTER_ALIGN during
> > expansion of the SSA name partition - iterate over all of them in the
> > partition and pick the lowest alignment.  Or even adjust the partitioning
> > to avoid losing alignment information that way.
> 
> That would certainly be helpful.

I'm working on a patch for that, stay tuned.


Ciao,
Michael.
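
The conservative choice quoted above amounts to taking the minimum over the
partition.  A minimal sketch, assuming the known alignments of the
partition's members have already been collected into an array; the function
name is hypothetical and this is not the expander's code.

```c
#include <assert.h>
#include <stddef.h>

/* Return the lowest alignment (in bits) among the N members of a
   partition.  Only this value is safe to record for the shared
   register, since any member may end up providing the pointer.  */
static unsigned
partition_alignment (const unsigned *align, size_t n)
{
  assert (align != NULL && n > 0);
  unsigned lowest = align[0];
  for (size_t i = 1; i < n; i++)
    if (align[i] < lowest)
      lowest = align[i];
  return lowest;
}
```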


Re: [patch] Fix PR tree-optimization/49771

2011-07-26 Thread Ulrich Weigand
Richard Guenther wrote:
> On Mon, Jul 25, 2011 at 5:25 PM, Ulrich Weigand  wrote:
> > When would that be?  The expansion does happen in the initial expand
> > stage, but I'm getting called from the middle-end via emit_move_insn etc.
> > which already provides me with a MEM ...
> 
> Hmm.  I suppose we'd need to see at the initial expand stage that the
> move is going to be handled specially.  For other strict-align targets
> we end up with store/load-bit-field for unaligned accesses, so I suppose
> SPU doesn't want to go down that path (via insv/extv)?

One issue here is that accesses aren't necessarily "unaligned" as far as
the middle-end is concerned:  in the current example, we in fact have an
access to a 32-bit integer that is aligned on a 32-bit boundary (which is
the default alignment for integers).  It's just that even so, the address
is not *128-bit* aligned, and all SPU load instructions require this level
of alignment ...

The other issue is that as Andrew mentioned, all this means that just about
every single memory access needs to be handled this way, and attempts to
have everything go through insv/extv in the past have resulted in less efficient
code generation.
 
> > Can I use REG_ATTRS->decl to get at the register's DECL and use
> > get_pointer_alignment on that?  [ On the other hand, don't we have
> > the same problems with reliability of REG_ATTRS that we have with
> > REGNO_POINTER_ALIGN, given e.g. the coalescing you mentioned? ]
> 
> Well, REG_ATTRS->decl is again a decl, not an SSA name.  I suppose
> we'd need to pick a conservative REGNO_POINTER_ALIGN during
> expansion of the SSA name partition - iterate over all of them in the
> partition and pick the lowest alignment.  Or even adjust the partitioning
> to avoid losing alignment information that way.

That would certainly be helpful.

> I suppose the RTL code transforms are careful to update REGNO_POINTER_ALIGN
> conservatively.

They're supposed to, yes.  In practice, REGNO_POINTER_ALIGN is mostly used
for pseudos allocated to hold pointer types (reflecting the type's alignment
requirement) and for virtual/hard registers pointing into the stack (stack,
frame, virtual args, ...), reflecting the various ABI alignment guarantees
about the stack.

Bye,
Ulrich

-- 
  Dr. Ulrich Weigand
  GNU Toolchain for Linux on System z and Cell BE
  ulrich.weig...@de.ibm.com
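
A portable C sketch of the access pattern described above: a 32-bit value
that is 4-byte aligned but not 16-byte aligned is read by fetching the
enclosing 16-byte block and extracting the word.  It assumes the buffer
itself starts on a 16-byte boundary and has a length that is a multiple of
16; the function name is hypothetical and this is not the SPU back end's
expansion.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Load the 32-bit value at byte OFFSET in MEM using only a fetch of
   the whole 16-byte block that contains it.  */
static uint32_t
load_u32_via_quadword (const unsigned char *mem, size_t offset)
{
  assert (offset % 4 == 0);                /* natural 32-bit alignment */
  size_t block = offset & ~(size_t) 15;    /* start of enclosing block */
  size_t within = offset & 15;             /* position inside the block */
  unsigned char quad[16];
  uint32_t value;

  memcpy (quad, mem + block, sizeof quad);       /* the aligned load */
  memcpy (&value, quad + within, sizeof value);  /* the extraction */
  return value;
}
```

The extraction step is the extra work that every such access has to pay
unless the address is known to be 128-bit aligned.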


Re: [patch] Fix PR tree-optimization/49471

2011-07-26 Thread Richard Guenther
On Tue, Jul 26, 2011 at 2:59 PM, Razya Ladelsky  wrote:
> Richard Guenther  wrote on 25/07/2011 05:54:28
> PM:
>
>> From: Richard Guenther 
>> To: Razya Ladelsky/Haifa/IBM@IBMIL
>> Cc: gcc-patches@gcc.gnu.org, Zdenek Dvorak
>> , Sebastian Pop 
>> Date: 25/07/2011 05:54 PM
>> Subject: Re: [patch] Fix PR tree-optimization/49471
>>
>> On Mon, Jul 25, 2011 at 4:47 PM, Razya Ladelsky 
> wrote:
>> > Hi,
>> >
>> > This patch fixes the build failure of cactusADM and dealII spec2006
>> > benchmarks when autopar is enabled.
>> > (for powerpc they fail only when -m32 is additionally enabled)
>> >
>> > The problem originated in canonicalize_loop_ivs, where we iterate the
>> > header's phis in order to base all
>> > the induction variables on a single control variable.
>> > We use the largest precision of the loop's ivs in order to determine
> the
>> > type of the control variable.
>> >
>> > Since iterating the loop's phis takes into account not only the loop's
>> > ivs, but also reduction variables,
>> > we got precision values like 80 for x86, or 128 for ppc.
>> > The compilers failed to create proper types for these sizes
>> > (respectively).
>> >
>> > The proper behavior for determining the control variable's type is to
> take
>> > into account only the loop's ivs,
>> > which is what this patch does.
>> >
>> > Bootstrap and testsuite pass successfully (as autopar is not enabled
> by
>> > default).
>> > No new regressions when the testsuite is run with autopar enabled.
>> > No new regressions for the run of spec2006 with autopar enabled,
>> >
>> > cactusADM and dealII benchmarks now pass successfully with autopar on
>> > powerpc and x86.
>> >
>> > Thanks to Zdenek who helped me figure out the failure/fix.
>> > OK for trunk?
>>
>> It'll collide with Sebastians patch in that area.  I suggested a
>> INTEGRAL_TYPE_P check instead of the simple_iv one, it
>> should be cheaper.  Zdenek, do you think it will be "incorrect"
>> in some cases?
>>
>
> The INTEGRAL_TYPE_P check does work for cactusADM and dealII, but
> I'm not sure about the general case.

I suppose we also need to allow POINTER_TYPE_P here (but then
treat it like an unsigned variable of the same width).

Richard.
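
A sketch of the selection being discussed, assuming each header phi is
summarized by a type kind and a precision.  The enum, struct and function
are hypothetical simplifications, not the canonicalize_loop_ivs code: only
integral and pointer phis contribute to the control variable's width, so a
floating-point reduction with precision 80 or 128 is skipped.

```c
#include <assert.h>
#include <stddef.h>

enum type_kind { KIND_INTEGER, KIND_POINTER, KIND_FLOAT };

/* Hypothetical summary of one phi node in the loop header.  */
struct phi_type
{
  enum type_kind kind;
  unsigned precision;   /* in bits */
};

/* Return the widest precision among phis that can serve as induction
   variables, or 0 if there is none.  Pointers are treated like
   unsigned integers of the same width.  */
static unsigned
control_var_precision (const struct phi_type *phis, size_t n)
{
  assert (phis != NULL || n == 0);
  unsigned widest = 0;
  for (size_t i = 0; i < n; i++)
    {
      if (phis[i].kind != KIND_INTEGER && phis[i].kind != KIND_POINTER)
        continue;   /* e.g. a floating-point reduction variable */
      if (phis[i].precision > widest)
        widest = phis[i].precision;
    }
  return widest;
}
```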

> Razya
>
>
>> Thanks,
>> Richard.
>>
>> > Thanks,
>> > Razya
>> >
>> > ChangeLog:
>> >
>> >   PR tree-optimization/49471
>> >   * tree-vect-loop-manip.c (canonicalize_loop_ivs): Add condition to
>> >   ignore reduction variables when iterating the loop header's phis.
>> >
>> >
>> >
>
>


[MELT] Add some utils closures.

2011-07-26 Thread Romain Geissler
Hi

I added some new closures I needed for my plugin that I think are worth
sharing with all MELT users.
I also removed the "string=" and "string!=" primitives, as "==s" and "!=s"
do the same thing.

Romain Geissler


melt-utils2.Changelog
Description: Binary data


melt-utils2.diff
Description: Binary data


Re: [PATCH] GNU/kFreeBSD systems running on MIPS

2011-07-26 Thread Robert Millan
2011/7/26 Rainer Orth :
> I'm in the middle of moving shlib support (another day), need to rebase
> crtstuff and libgcc1, and finish libgcc2.
>
> I hope to be ready within two or three weeks.

Ok then.  I'd appreciate if you can send me a reminder via private
mail when you've finished.

Best regards

-- 
Robert Millan


Re: ARM: Clear icache when creating a closure

2011-07-26 Thread Andrew Haley
On 07/25/2011 10:33 AM, Andrew Haley wrote:
> On 21/07/11 16:33, Joseph S. Myers wrote:
>> My suggestion would be putting the instruction sequence in a .s file, 
>> rather than hardcoding the instruction encodings here, and writing the 
>> code to read from the sequence as assembled by the assembler.  That way it 
>> will have the appropriate mapping symbols to mark it as ARM-mode code and 
>> the linker will deal with adjusting endianness, so you don't need to test 
>> for BE-8 at all.

I did this, and it passes the testsuite.  Is it OK?

Andrew.
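
For reference, a host-independent sketch of the 20-byte trampoline layout
the macro builds: three instruction words copied from a template, followed
by the context and function words that the two PC-relative loads pick up.
The array below is only a stand-in for the words the assembler emits for
ffi_arm_trampoline, the function name is hypothetical, and the cache flush
that real code needs is left out.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Stand-in for the assembled template.  In the patch these words
   come from the ffi_arm_trampoline symbol in sysv.S.  */
static const uint32_t trampoline_template[3] = {
  0xe92d000f,   /* stmfd sp!, {r0-r3} */
  0xe59f0000,   /* ldr r0, [pc]   reads the word at offset 12 */
  0xe59ff000    /* ldr pc, [pc]   reads the word at offset 16 */
};

/* Fill TRAMP with the template followed by CTX and FUN.  */
static void
init_trampoline (unsigned char tramp[20], uint32_t ctx, uint32_t fun)
{
  assert (tramp != NULL);
  memcpy (tramp, trampoline_template, sizeof trampoline_template);
  memcpy (tramp + 12, &ctx, sizeof ctx);
  memcpy (tramp + 16, &fun, sizeof fun);
}
```

In ARM state the PC reads as the current instruction's address plus 8, which
is why the loads at offsets 4 and 8 fetch the words at offsets 12 and 16.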


2011-07-26  Andrew Haley  

* src/arm/ffi.c (FFI_INIT_TRAMPOLINE): Remove hard-coded assembly
instructions.
* src/arm/sysv.S (ffi_arm_trampoline): Put them here instead.

Index: src/arm/ffi.c
===
--- src/arm/ffi.c   (revision 176744)
+++ src/arm/ffi.c   (working copy)
@@ -337,14 +337,14 @@

 /* How to make a trampoline.  */

+extern unsigned int ffi_arm_trampoline[3];
+
 #define FFI_INIT_TRAMPOLINE(TRAMP,FUN,CTX) \
 ({ unsigned char *__tramp = (unsigned char*)(TRAMP);   \
unsigned int  __fun = (unsigned int)(FUN);  \
unsigned int  __ctx = (unsigned int)(CTX);  \
unsigned char *insns = (unsigned char *)(CTX);   \
-   *(unsigned int*) &__tramp[0] = 0xe92d000f; /* stmfd sp!, {r0-r3} */ \
-   *(unsigned int*) &__tramp[4] = 0xe59f; /* ldr r0, [pc] */   \
-   *(unsigned int*) &__tramp[8] = 0xe59ff000; /* ldr pc, [pc] */   \
+   memcpy (__tramp, ffi_arm_trampoline, sizeof ffi_arm_trampoline); \
*(unsigned int*) &__tramp[12] = __ctx;  \
*(unsigned int*) &__tramp[16] = __fun;  \
__clear_cache((&__tramp[0]), (&__tramp[19])); /* Clear data mapping.  */ \
Index: src/arm/sysv.S
===
--- src/arm/sysv.S  (revision 176744)
+++ src/arm/sysv.S  (working copy)
@@ -461,6 +461,11 @@
UNWIND .fnend
 .size
CNAME(ffi_closure_VFP),.ffi_closure_VFP_end-CNAME(ffi_closure_VFP)

+ENTRY(ffi_arm_trampoline)
+   stmfd sp!, {r0-r3}
+   ldr r0, [pc]
+   ldr pc, [pc]
+
 #if defined __ELF__ && defined __linux__
.section.note.GNU-stack,"",%progbits
 #endif


Re: [patch] Fix PR tree-optimization/49471

2011-07-26 Thread Razya Ladelsky
Richard Guenther  wrote on 25/07/2011 05:54:28 
PM:

> From: Richard Guenther 
> To: Razya Ladelsky/Haifa/IBM@IBMIL
> Cc: gcc-patches@gcc.gnu.org, Zdenek Dvorak 
> , Sebastian Pop 
> Date: 25/07/2011 05:54 PM
> Subject: Re: [patch] Fix PR tree-optimization/49471
> 
> On Mon, Jul 25, 2011 at 4:47 PM, Razya Ladelsky  
wrote:
> > Hi,
> >
> > This patch fixes the build failure of cactusADM and dealII spec2006
> > benchmarks when autopar is enabled.
> > (for powerpc they fail only when -m32 is additionally enabled)
> >
> > The problem originated in canonicalize_loop_ivs, where we iterate the
> > header's phis in order to base all
> > the induction variables on a single control variable.
> > We use the largest precision of the loop's ivs in order to determine 
the
> > type of the control variable.
> >
> > Since iterating the loop's phis takes into account not only the loop's
> > ivs, but also reduction variables,
> > we got precision values like 80 for x86, or 128 for ppc.
> > The compilers failed to create proper types for these sizes
> > (respectively).
> >
> > The proper behavior for determining the control variable's type is to 
take
> > into account only the loop's ivs,
> > which is what this patch does.
> >
> > Bootstrap and testsuite pass successfully (as autopar is not enabled 
by
> > default).
> > No new regressions when the testsuite is run with autopar enabled.
> > No new regressions for the run of spec2006 with autopar enabled,
> >
> > cactusADM and dealII benchmarks now pass successfully with autopar on
> > powerpc and x86.
> >
> > Thanks to Zdenek who helped me figure out the failure/fix.
> > OK for trunk?
> 
> It'll collide with Sebastians patch in that area.  I suggested a
> INTEGRAL_TYPE_P check instead of the simple_iv one, it
> should be cheaper.  Zdenek, do you think it will be "incorrect"
> in some cases?
> 

The INTEGRAL_TYPE_P check does work for cactusADM and dealII, but
I'm not sure about the general case.

Razya


> Thanks,
> Richard.
> 
> > Thanks,
> > Razya
> >
> > ChangeLog:
> >
> >   PR tree-optimization/49471
> >   * tree-vect-loop-manip.c (canonicalize_loop_ivs): Add condition to
> >   ignore reduction variables when iterating the loop header's phis.
> >
> >
> >



[PATCH] Fix PR49840

2011-07-26 Thread Richard Guenther

This fixes PR49840 - make sure range_fits_type_p correctly handles
integers that are as wide as a double-int.

Bootstrapped and tested on i686-linux-gnu, applied to trunk.

Richard.

2011-07-26  Richard Guenther  

PR tree-optimization/49840
* tree-vrp.c (range_fits_type_p): Properly handle full
double-int precision.

Index: gcc/tree-vrp.c
===
*** gcc/tree-vrp.c  (revision 176786)
--- gcc/tree-vrp.c  (working copy)
*** simplify_conversion_using_ranges (gimple
*** 7423,7440 
  static bool
  range_fits_type_p (value_range_t *vr, unsigned precision, bool unsigned_p)
  {
double_int tem;
  
!   /* We can only handle constant ranges.  */
if (vr->type != VR_RANGE
|| TREE_CODE (vr->min) != INTEGER_CST
|| TREE_CODE (vr->max) != INTEGER_CST)
  return false;
  
tem = double_int_ext (tree_to_double_int (vr->min), precision, unsigned_p);
if (!double_int_equal_p (tree_to_double_int (vr->min), tem))
  return false;
- 
tem = double_int_ext (tree_to_double_int (vr->max), precision, unsigned_p);
if (!double_int_equal_p (tree_to_double_int (vr->max), tem))
  return false;
--- 7456,7495 
  static bool
  range_fits_type_p (value_range_t *vr, unsigned precision, bool unsigned_p)
  {
+   tree src_type;
+   unsigned src_precision;
double_int tem;
  
!   /* We can only handle integral and pointer types.  */
!   src_type = TREE_TYPE (vr->min);
!   if (!INTEGRAL_TYPE_P (src_type)
!   && !POINTER_TYPE_P (src_type))
! return false;
! 
!   /* An extension is always fine, so is an identity transform.  */
!   src_precision = TYPE_PRECISION (TREE_TYPE (vr->min));
!   if (src_precision < precision
!   || (src_precision == precision
! && TYPE_UNSIGNED (src_type) == unsigned_p))
! return true;
! 
!   /* Now we can only handle ranges with constant bounds.  */
if (vr->type != VR_RANGE
|| TREE_CODE (vr->min) != INTEGER_CST
|| TREE_CODE (vr->max) != INTEGER_CST)
  return false;
  
+   /* For precision-preserving sign-changes the MSB of the double-int
+  has to be clear.  */
+   if (src_precision == precision
+   && (TREE_INT_CST_HIGH (vr->min) | TREE_INT_CST_HIGH (vr->max)) < 0)
+ return false;
+ 
+   /* Then we can perform the conversion on both ends and compare
+  the result for equality.  */
tem = double_int_ext (tree_to_double_int (vr->min), precision, unsigned_p);
if (!double_int_equal_p (tree_to_double_int (vr->min), tem))
  return false;
tem = double_int_ext (tree_to_double_int (vr->max), precision, unsigned_p);
if (!double_int_equal_p (tree_to_double_int (vr->max), tem))
  return false;
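
The final extend-and-compare step of the function can be sketched in plain
C, assuming bounds that fit in int64_t and a target precision of at most 62
bits.  The names are hypothetical, and the sketch deliberately leaves out
the full-width case that the patch above is about.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Keep the low PRECISION bits of V and zero- or sign-extend them
   back to 64 bits.  */
static int64_t
extend (int64_t v, unsigned precision, bool unsigned_p)
{
  assert (precision >= 1 && precision <= 62);
  uint64_t mask = ((uint64_t) 1 << precision) - 1;
  uint64_t low = (uint64_t) v & mask;
  if (!unsigned_p && (low >> (precision - 1)) != 0)
    return (int64_t) low - ((int64_t) 1 << precision);   /* negative */
  return (int64_t) low;
}

/* A range [MIN, MAX] fits the target type if converting each bound
   and extending it back leaves the bound unchanged.  */
static bool
range_fits (int64_t min, int64_t max, unsigned precision, bool unsigned_p)
{
  return extend (min, precision, unsigned_p) == min
         && extend (max, precision, unsigned_p) == max;
}
```

For example, [0, 255] fits an unsigned 8-bit type, while [0, 128] does not
fit a signed one because 128 comes back as -128.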


Re: [PATCH, i386, take 2]: Rewrite LEA handling (was:Re: PATCH [10/n] X32: Support x32 LEA insns)

2011-07-26 Thread Uros Bizjak
On Mon, Jul 25, 2011 at 11:09 PM, Uros Bizjak  wrote:

> Attached patch implements -fpic handling for x32. In x32 mode, we now
> use x86_64_general_operand and corresponding "e" constraints for adds
> in SImode, since it looks like invalid addresses can only be generated
> through adds. This avoids a whole bunch of new predicates and
> constraints.
>>>
 X32 glibc is miscompiled:

 CPP='/export/build/gnu/gcc-x32/release/usr/gcc-4.7.0-x32/bin/gcc -mx32
  -E -x c-header'
 /export/build/gnu/glibc-x32/build-x86_64-linux/elf/ld-linux-x32.so.2
 --library-path 
 /export/build/gnu/glibc-x32/build-x86_64-linux:/export/build/gnu/glibc-x32/build-x86_64-linux/math:/export/build/gnu/glibc-x32/build-x86_64-linux/elf:/export/build/gnu/glibc-x32/build-x86_64-linux/dlfcn:/export/build/gnu/glibc-x32/build-x86_64-linux/nss:/export/build/gnu/glibc-x32/build-x86_64-linux/nis:/export/build/gnu/glibc-x32/build-x86_64-linux/rt:/export/build/gnu/glibc-x32/build-x86_64-linux/resolv:/export/build/gnu/glibc-x32/build-x86_64-linux/crypt:/export/build/gnu/glibc-x32/build-x86_64-linux/nptl
 /export/build/gnu/glibc-x32/build-x86_64-linux/sunrpc/rpcgen -Y
 ../scripts -h rpcsvc/yppasswd.x -o
 /export/build/gnu/glibc-x32/build-x86_64-linux/sunrpc/rpcsvc/yppasswd.T
 make[5]: *** 
 [/export/build/gnu/glibc-x32/build-x86_64-linux/sunrpc/rpcsvc/yppasswd.stmp]
 Segmentation fault (core dumped)

 Some LEA patterns are wrong for x32.  I will investigate.
>>>
>>> We have to prevent symbols from entering general_operand predicated
>>> SImode operands. Fortunately, x86_64_general_operand works OK for
>>> x32, while both i686 and x86_64 are unaffected due to early bypass
>>> (i686) and due to the fact that all symbols are DImode (x86_64).
>
>> GCC and glibc testsuites are clean on x32.  Can you check it in?
>
> I will do this tomorrow, if anybody has some comment on the patch.

Committed slightly updated version:

2011-07-26  Uros Bizjak  
H.J. Lu  

PR target/47381
PR target/49832
PR target/49833
* config/i386/i386.md (i): Change SImode attribute to "e".
(g): Change SImode attribute to "rme".
(di): Change SImode attribute to "nF".
(general_operand): Change SImode attribute to x86_64_general_operand.
(general_szext_operand): Change SImode attribute to
x86_64_szext_general_operand.
(immediate_operand): Change SImode attribute to
x86_64_immediate_operand.
(nonmemory_operand): Change SImode attribute to
x86_64_nonmemory_operand.
(*movdi_internal_rex64): Remove mode from pic_32bit_operand check.
(*movsi_internal): Ditto.  Use "e" constraint in alternative 2.
(*lea_1): Use SWI48 mode iterator.
(*lea_1_zext): New insn pattern.
(testsi_ccno_1): Use x86_64_nonmemory_operand predicate for operand 2.
(*bt): Ditto.
(*add1): Use x86_64_general_operand predicate for operand 2.
Update operand constraints.
(addsi_1_zext): Ditto.
(*add2): Ditto.
(*addsi_3_zext): Ditto.
(*subsi_1_zext): Ditto.
(*subsi_2_zext): Ditto.
(*subsi_3_zext): Ditto.
(*addsi3_carry_zext): Ditto.
(*si3_zext_cc_overflow): Ditto.
(*mulsi3_1_zext): Ditto.
(*andsi_1): Ditto.
(*andsi_1_zext): Ditto.
(*andsi_2_zext): Ditto.
(*si_1_zext): Ditto.
(*si_2_zext): Ditto.
(*test_1): Use  predicate for operand 1.
(*and_2): Ditto.
(movcc): Use   predicate for operands 1 and 2.
(add->lea splitter): Check operand modes in insn constraint.  Extend
operands less than SImode wide to SImode.
(add->lea zext splitter): Do not extend input operands to DImode.
(*lea_general_1): Handle only QImode and HImode operands.
(*lea_general_2): Ditto.
(*lea_general_3): Ditto.
(*lea_general_1_zext): Remove.
(*lea_general_2_zext): Ditto.
(*lea_general_3_zext): Ditto.
(*lea_general_4): Check operand modes in insn constraint.  Extend
operands less than SImode wide to SImode.
(ashift->lea splitter): Ditto.
* config/i386/i386.c (ix86_print_operand_address): Print address
registers with 'q' modifier on 64bit targets.
* config/i386/predicates.md (pic_32bit_operand): Define as special
predicate.  Reject non-SI and non-DI modes.

Tested on x86_64-pc-linux-gnu {,-m32}, committed to mainline SVN.

Uros.
Index: i386.md
===
--- i386.md (revision 176782)
+++ i386.md (working copy)
@@ -861,13 +861,13 @@
 (define_mode_attr r [(QI "q") (HI "r") (SI "r") (DI "r")])
 
 ;; Immediate operand constraint for integer modes.
-(define_mode_attr i [(QI "n") (HI "n") (SI "i") (DI "e")])
+(define_mode_attr i [(QI "n") (HI "n") (SI "e") (DI "e")])
 
 ;; General operand co

Re: [PATCH, PR 49786] Avoid overflow when updating counts in IPA-CP

2011-07-26 Thread Jan Hubicka
> Hi,
> 
> the issue in PR 49786 has been well summarized in comment #12.
> Basically, the multiplication we do when updating counts of edges
> overflows.  I looked at how this was handled previously in IPA-CP and
> the following should be an equivalent solution.
> 
> This patch fixes the LTO profiled bootstrap and I have tested it by
> doing a usual non-LTO and non-profiled bootstrap and testsuite
> (without any issues) run as well as running the testsuite on the
> compiler produced by the LTO profiled bootstrap.  I compared the
> second set of testsuite results with both the results obtained from a
> revision before the IPA-CP submission and the current normal bootstrap
> ones and it seemed fine (although there were a few dump scan
> failures).  (All of the above was done on x86_64-linux.)
> 
> So, OK for trunk?

OK, thanks.
As a general rule we can't compute the square of a count in a 64-bit
value, so my preferred solution is this REG_BR_PROB_BASE fixed-point math.

Honza
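
A self-contained sketch of the fixed-point scaling, assuming int64_t in
place of gcov_type and a local PROB_BASE with the same value as GCC's
REG_BR_PROB_BASE.  The function names are hypothetical.  The ratio
part/whole is first reduced to a value in [0, PROB_BASE], so the later
multiplication involves only one large operand.

```c
#include <assert.h>
#include <stdint.h>

#define PROB_BASE 10000   /* same value as GCC's REG_BR_PROB_BASE */

/* Scale COUNT by PART/WHOLE without forming COUNT * PART.  */
static int64_t
scale_count (int64_t count, int64_t part, int64_t whole)
{
  assert (whole > 0 && part >= 0 && part <= whole && count >= 0);
  assert (part <= INT64_MAX / PROB_BASE);
  assert (count <= INT64_MAX / PROB_BASE);
  int64_t ratio = part * PROB_BASE / whole;   /* in [0, PROB_BASE] */
  return count * ratio / PROB_BASE;
}

/* Nonzero if A * B would not fit in int64_t (A >= 0, B > 0).  */
static int
product_overflows (int64_t a, int64_t b)
{
  assert (a >= 0 && b > 0);
  return a > INT64_MAX / b;
}
```

With count = 4000000000, part = 3000000000 and whole = 6000000000, the
direct product count * part does not fit in 64 bits, while the fixed-point
form yields 2000000000.  The price is that the ratio is quantized to steps
of 1/10000.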


[PATCH, PR 49786] Avoid overflow when updating counts in IPA-CP

2011-07-26 Thread Martin Jambor
Hi,

the issue in PR 49786 has been well summarized in comment #12.
Basically, the multiplication we do when updating counts of edges
overflows.  I looked at how this was handled previously in IPA-CP and
the following should be an equivalent solution.

This patch fixes the LTO profiled bootstrap and I have tested it by
doing a usual non-LTO and non-profiled bootstrap and testsuite
(without any issues) run as well as running the testsuite on the
compiler produced by the LTO profiled bootstrap.  I compared the
second set of testsuite results with both the results obtained from a
revision before the IPA-CP submission and the current normal bootstrap
ones and it seemed fine (although there were a few dump scan
failures).  (All of the above was done on x86_64-linux.)

So, OK for trunk?

Thanks,

Martin


2011-07-25  Martin Jambor  

PR bootstrap/49786
* ipa-cp.c (update_profiling_info): Avoid overflow when updating
counts.
(update_specialized_profile): Likewise.

Index: src/gcc/ipa-cp.c
===
--- src.orig/gcc/ipa-cp.c
+++ src/gcc/ipa-cp.c
@@ -1877,7 +1877,6 @@ dump_profile_updates (struct cgraph_node
 cgraph_node_name (cs->callee), (HOST_WIDE_INT) cs->count);
 }
 
-
 /* After a specialized NEW_NODE version of ORIG_NODE has been created, update
their profile information to reflect this.  */
 
@@ -1923,12 +1922,14 @@ update_profiling_info (struct cgraph_nod
 
   for (cs = new_node->callees; cs ; cs = cs->next_callee)
 if (cs->frequency)
-  cs->count = cs->count * new_sum / orig_node_count;
+  cs->count = cs->count * (new_sum * REG_BR_PROB_BASE
+  / orig_node_count) / REG_BR_PROB_BASE;
 else
   cs->count = 0;
 
   for (cs = orig_node->callees; cs ; cs = cs->next_callee)
-cs->count = cs->count * remainder / orig_node_count;
+cs->count = cs->count * (remainder * REG_BR_PROB_BASE
+/ orig_node_count) / REG_BR_PROB_BASE;
 
   if (dump_file)
 dump_profile_updates (orig_node, new_node);
@@ -1966,7 +1967,8 @@ update_specialized_profile (struct cgrap
 
   for (cs = orig_node->callees; cs ; cs = cs->next_callee)
 {
-  gcov_type dec = cs->count * redirected_sum / orig_node_count;
+  gcov_type dec = cs->count * (redirected_sum * REG_BR_PROB_BASE
+  / orig_node_count) / REG_BR_PROB_BASE;
   if (dec < cs->count)
cs->count -= dec;
   else


Re: [Patch,AVR]: PR49687 (better widening 32-bit mul)

2011-07-26 Thread Georg-Johann Lay
Georg-Johann Lay wrote:

> I found that too painful, and on devices with >= 8k flash the
> self-tail-call will just save 4 bytes.

That's not correct: even on devices >= 8k an rcall will always reach
the destination, so that the self tail-call always saves 6 bytes.

Johann



[C++ Patch] PR 49776

2011-07-26 Thread Paolo Carlini

Hi,

another simple fix for an ICE on invalid. Tested x86_64-linux.

Ok?

Thanks,
Paolo.


/cp
2011-07-26  Paolo Carlini  

PR c++/49776
* typeck.c (cp_build_modify_expr): Check digest_init return value
for error_mark_node.

/testsuite
2011-07-26  Paolo Carlini  

PR c++/49776
* g++.dg/cpp0x/constexpr-49776.C: New.

Index: testsuite/g++.dg/cpp0x/constexpr-49776.C
===
--- testsuite/g++.dg/cpp0x/constexpr-49776.C(revision 0)
+++ testsuite/g++.dg/cpp0x/constexpr-49776.C(revision 0)
@@ -0,0 +1,17 @@
+// PR c++/49776
+// { dg-options -std=c++0x }
+
+struct s
+{
+  int i[1];
+
+  template<class... Types>
+constexpr s(Types... args)
+: i{args...}  // { dg-error "cannot convert" }
+{ }
+};
+
+int main()
+{
+  s test = nullptr;
+}
Index: cp/typeck.c
===
--- cp/typeck.c (revision 176786)
+++ cp/typeck.c (working copy)
@@ -6753,6 +6753,8 @@ cp_build_modify_expr (tree lhs, enum tree_code mod
  if (check_array_initializer (lhs, lhstype, newrhs))
return error_mark_node;
  newrhs = digest_init (lhstype, newrhs, complain);
+ if (newrhs == error_mark_node)
+   return error_mark_node;
}
 
   else if (!same_or_base_type_p (TYPE_MAIN_VARIANT (lhstype),

