Re: fix rs6000 break

2011-03-30 Thread Jakub Jelinek
On Thu, Mar 31, 2011 at 08:30:42AM +0200, Andreas Tobler wrote:
> Hello,
> 
> Is this ok to apply?

That patch is obvious, just commit it.

> 2011-03-31  Andreas Tobler  
> 
>   * config/rs6000/rs6000.c (rs6000_handle_option): Remove unused isel
>   var.
> 
> --- config/rs6000/rs6000.c(revision 171769)
> +++ config/rs6000/rs6000.c(working copy)
> @@ -4192,7 +4192,6 @@
> location_t loc ATTRIBUTE_UNUSED)
>  {
>enum fpu_type_t fpu_type = FPU_NONE;
> -  int isel;
>char *p, *q;
>size_t code = decoded->opt_index;
>const char *arg = decoded->arg;

Jakub


fix rs6000 break

2011-03-30 Thread Andreas Tobler
Hello,

Is this ok to apply?


2011-03-31  Andreas Tobler  

* config/rs6000/rs6000.c (rs6000_handle_option): Remove unused isel
var.

Index: config/rs6000/rs6000.c
===
--- config/rs6000/rs6000.c  (revision 171769)
+++ config/rs6000/rs6000.c  (working copy)
@@ -4192,7 +4192,6 @@
  location_t loc ATTRIBUTE_UNUSED)
 {
   enum fpu_type_t fpu_type = FPU_NONE;
-  int isel;
   char *p, *q;
   size_t code = decoded->opt_index;
   const char *arg = decoded->arg;




[H8300] Hookize GO_IF_MODE_DEPENDENT_ADDRESS

2011-03-30 Thread Anatoly Sokolov
Hello.

  This patch removes obsolete GO_IF_MODE_DEPENDENT_ADDRESS macros from H8300 
back end in the GCC and introduces equivalent TARGET_MODE_DEPENDENT_ADDRESS_P 
target hook.


  Regression tested on h8300-unknown-elf with no new failure.

  OK to install? 

* config/h8300/h8300.h (GO_IF_MODE_DEPENDENT_ADDRESS): Remove macro.
* config/h8300/h8300-protos.h (h8300_get_index): Remove.
* config/h8300/h8300.c (TARGET_MODE_DEPENDENT_ADDRESS_P): Define.
(h8300_mode_dependent_address_p): New function.
(h8300_get_index): Make static.

Index: gcc/config/h8300/h8300.c
===
--- gcc/config/h8300/h8300.c(revision 171675)
+++ gcc/config/h8300/h8300.c(working copy)
@@ -113,6 +113,7 @@
 static bool  h8300_short_move_mem_p   (rtx, enum rtx_code);
 static unsigned int  h8300_move_length(rtx *, const 
h8300_length_table *);
 static bool h8300_hard_regno_scratch_ok  (unsigned int);
+static rtx  h8300_get_index (rtx, enum machine_mode mode, int *);
 
 /* CPU_TYPE, says what cpu we're compiling for.  */
 int cpu_type;
@@ -2094,7 +2097,7 @@
MODE is the mode of the value being accessed.  It can be VOIDmode
if the address is known to be valid, but its mode is unknown.  */
 
-rtx
+static rtx
 h8300_get_index (rtx x, enum machine_mode mode, int *size)
 {
   int dummy, factor;
@@ -2156,6 +2159,21 @@
   return x;
 }
 
+/* Worker function for TARGET_MODE_DEPENDENT_ADDRESS_P.
+
+   On the H8/300, the predecrement and postincrement address depend thus
+   (the amount of decrement or increment being the length of the operand).  */
+
+static bool
+h8300_mode_dependent_address_p (const_rtx addr)
+{
+  if (GET_CODE (addr) == PLUS
+  && h8300_get_index (XEXP (addr, 0), VOIDmode, 0) != XEXP (addr, 0))
+return true;
+
+  return false;
+}
+
 static const h8300_length_table addb_length_table =
 {
   /* #xx  Rs   @aa  @Rs  @xx  */
@@ -6026,4 +6044,7 @@
 #undef TARGET_EXCEPT_UNWIND_INFO
 #define TARGET_EXCEPT_UNWIND_INFO sjlj_except_unwind_info
 
+#undef TARGET_MODE_DEPENDENT_ADDRESS_P
+#define TARGET_MODE_DEPENDENT_ADDRESS_P h8300_mode_dependent_address_p
+
 struct gcc_target targetm = TARGET_INITIALIZER;
Index: gcc/config/h8300/h8300.h
===
--- gcc/config/h8300/h8300.h(revision 171675)
+++ gcc/config/h8300/h8300.h(working copy)
@@ -771,11 +771,6 @@
 
On the H8/300, the predecrement and postincrement address depend thus
(the amount of decrement or increment being the length of the operand).  */
-
-#define GO_IF_MODE_DEPENDENT_ADDRESS(ADDR, LABEL) \
-  if (GET_CODE (ADDR) == PLUS \
-  && h8300_get_index (XEXP (ADDR, 0), VOIDmode, 0) != XEXP (ADDR, 0)) \
-goto LABEL;
 
 /* Specify the machine mode that this machine uses
for the index in the tablejump instruction.  */
Index: gcc/config/h8300/h8300-protos.h
===
--- gcc/config/h8300/h8300-protos.h (revision 171675)
+++ gcc/config/h8300/h8300-protos.h (working copy)
@@ -110,7 +110,6 @@
 extern void h8300_pr_interrupt (struct cpp_reader *);
 extern void h8300_pr_saveall (struct cpp_reader *);
 extern enum reg_class  h8300_reg_class_from_letter (int);
-extern rtx h8300_get_index (rtx, enum machine_mode mode, int *);
 extern unsigned inth8300_insn_length_from_table (rtx, rtx *);
 extern const char *output_h8sx_shift (rtx *, int, int);
 extern boolh8300_operands_match_p (rtx *);


Anatoly.



Re: [patch, ARM] Fix PR48250, adjust DImode reload address legitimizing

2011-03-30 Thread Chung-Lin Tang
On 2011/3/30 05:28 PM, Richard Earnshaw wrote:
> 
> On Wed, 2011-03-30 at 15:35 +0800, Chung-Lin Tang wrote:
>> On 2011/3/30 上午 12:23, Richard Earnshaw wrote:
>>>
>>> On Tue, 2011-03-29 at 22:53 +0800, Chung-Lin Tang wrote:
 On 2011/3/29 下午 10:26, Richard Earnshaw wrote:
> On Tue, 2011-03-29 at 18:25 +0800, Chung-Lin Tang wrote:
>> On 2011/3/24 06:51 PM, Richard Earnshaw wrote:
>>>
>>> On Thu, 2011-03-24 at 12:56 +0900, Chung-Lin Tang wrote:
 Hi,
 PR48250 happens under TARGET_NEON, where DImode is included within the
 valid NEON modes. This turns the range of legitimate constant indexes 
 to
 step-4 (coproc load/store), thus arm_legitimize_reload_address() when
 trying to decompose the [reg+index] reload address into
 [(reg+index_high)+index_low], can cause an ICE later when 'index_low'
 part is not aligned to 4.

 I'm not sure why the current DImode index is computed as:
 low = ((val & 0xf) ^ 0x8) - 0x8;  the sign-extending into negative
 values, then subtracting back, actually creates further off indexes.
 e.g. in the supplied testcase, [sp+13] was turned into [(sp+16)-3].

>>>
>>> Hysterical Raisins... the code there was clearly written for the days
>>> before we had LDRD in the architecture.  At that time the most efficient
>>> way to load a 64-bit object was to use the LDM{ia,ib,da,db}
>>> instructions.  The computation here was (I think), intended to try and
>>> make the most efficient use of an add/sub instruction followed by
>>> LDM/STM offsetting.  At that time the architecture had no unaligned
>>> access either, so dealing with 64-bit that were less than 32-bit aligned
>>> (in those days 32-bit was the maximum alignment) probably wasn't
>>> considered, or couldn't even get through to reload.
>>>
>>
>> I see it now. The code in output_move_double() returning assembly for
>> ldm/stm(db/da/ib) for offsets -8/-4/+4 seems to confirm this.
>>
>> I have changed the patch to let the new code handle the TARGET_LDRD case
>> only.  The pre-LDRD case is still handled by the original code, with an
>> additional & ~0x3 for aligning the offset to 4.
>>
>> I've also added a comment for the pre-TARGET_LDRD case. Please see if
>> the description is accurate enough.
>>
 My patch changes the index decomposing to a more straightforward way; 
 it
 also sort of outlines the way the other reload address indexes are
 broken by using and-masks, is not the most effective.  The address is
 computed by addition, subtracting away the parts to obtain low+high
 should be the optimal way of giving the largest computable index range.

 I have included a few Thumb-2 bits in the patch; I know currently
 arm_legitimize_reload_address() is only used under TARGET_ARM, but I
 guess it might eventually be turned into TARGET_32BIT.

>>>
>>> I think this needs to be looked at carefully on ARMv4/ARMv4T to check
>>> that it doesn't cause regressions there when we don't have LDRD in the
>>> instruction set.
>>
>> I'll be testing the modified patch under an ARMv4/ARMv4T configuration.
>> Okay for trunk if no regressions?
>>
>> Thanks,
>> Chung-Lin
>>
>>  PR target/48250
>>  * config/arm/arm.c (arm_legitimize_reload_address): Adjust
>>  DImode constant index decomposing under TARGET_LDRD. Clear
>>  lower two bits for NEON, Thumb-2, and !TARGET_LDRD. Add
>>  comment for !TARGET_LDRD case.
>
> This looks technically correct, but I can't help feeling that trying to
> deal with the bottom two bits being non-zero is not really worthwhile.
> This hook is an optimization that's intended to generate better code
> than the default mechanisms that reload provides.  It is allowed to
> fail, but it must say so if it does (by returning false).
>
> What this hook is trying to do for DImode is to take an address of the
> form (reg + TOO_BIG_CONST) and turn it into two instructions that are
> legitimate:
>
>   tmp = (REG + LEGAL_BIG_CONST)
>   some_use_of (mem (tmp + SMALL_LEGAL_CONST))
>
> The idea behind the optimization is that LEGAL_BIG_CONST will need fewer
> instructions to generate than TOO_BIG_CONST.  It's unlikely (probably
> impossible in ARM state) that pushing the bottom two bits of the address
> into LEGAL_BIG_CONST part of the offset is going to lead to a better
> code sequence.  
>
> So I think it would be more sensible to just return false if we have a
> DImode address with the bottom 2 bits non-zero and then let the normal
> reload recovery mechanism take over.

 It is supposed to provide better utilization of the full range of
 LEGAL_BIG_CONST+SMALL_LEGAL_CONST.  I am not su

Re: [PATCH] PR debug/47471 (set prologue_end in .debug_line)

2011-03-30 Thread Richard Henderson
On 03/30/2011 11:19 AM, Dodji Seketeli wrote:
> First, it avoids emitting two consecutive .loc that are identical.
> Strictly speaking that should fix this issue in this particular case.

What's the compatibility strategy?  I.e. how does gdb tell that we're
not using the double-loc mechanism?  Does it scan the line ops and if
you see any prologue_end marks assume that double-loc is not in effect?

Are you actually going to be able to delete any code within gdb, due
to the need to interpret older debug info?

> I have noticed that the end_prologue debug hook was being called, but
> the dwarf back-end hasn't implemented it (on non-vms platforms).  Is
> there a particular reason for not implementing this?  Also, I have
> noticed that the patch causes the prologue_end directive to be emitted
> even when we compile with optimizations.  I am not sure how right that
> would be.

Well, the main problem with the existing end_prologue hook is the
definition of the "end of the prologue".

The hook you see is called *after* the last insn that was emitted
by gen_prologue.

I believe that the hook you want would be called *before* the first
insn that was originally emitted after NOTE_INSN_FUNCTION_BEG.
(Note that under certain conditions, the parameters need to be spilled
to their canonical locations, and other similar "prologue-like" stuff
happens after gen_prologue, and before the "real" function begins.)

Stop me now if I'm mistaken in this understanding of what gdb needs.

Without instruction scheduling, and without the need for parameter
spills, etc, these two definitions are (nearly) the same.  I.e. the
existing hook would likely work for -O0 -g, for many parameter lists.

On the other end, the definition the compiler uses for "begin the
epilogue" is also *before* the first insn that was emitted by
gen_epilogue.  And again I believe that what you want is *after*
the last insn that had been part of the main function body.

One possibility for noticing the debugger-definition of end of
prologue is simply to mark the first line number of the function.
None of the gen_prologue or parameter spill instructions will have
location information attached (I believe).  Thus the first insn 
that does generate a line number ought to be the first "active"
insn of the function.

Presumably one could do something similar at the ends of the
function; though that scan may be a tad more complicated.

Another problem is that we're not supporting these other features
without assembler support.  Which is silly.  I'm going to go ahead
and re-vamp how we're recording line number tables for direct output.
That should allow us to generate more-or-less equivalently capable
.debug_line output by hand.


r~


Re: [PATCH, ARM] PR47855 Compute attr "length" for some thumb2 insns

2011-03-30 Thread Carrot Wei
Hi Ramana

On Wed, Mar 30, 2011 at 6:35 AM, Ramana Radhakrishnan
 wrote:
> Hi Carrot,
>
>
>        How about adding an alternative only enabled for T2 that uses the `l'
> constraint and inventing new constraints for some of the constant values
> that are valid for 16 bit instructions since the `I' and `L' constraints
> have different meanings depending on whether TARGET_32BIT is valid or not ?
> We could then set the value of the length attribute accordingly. I don't
> think we can change the meaning of the I and L constraints very easily given
> the amount of churn that might be needed
>
You are right. Now the logic is much clearer by splitting the constraints.
>
>        I don't think this and the other conditional branch instruction
> changes are correct. This could end up being the last instruction in an IT
> block and will automatically end up getting the 32 bit encoding and end up
> causing trouble with the length calculations. Remember the 16 bit encoding
> for the conditional instruction can't be used as the last instruction in an
> IT block.
>
According to arm architecture reference manual, chapter A8.6.16, neither 16 bit
nor 32 bit conditional branch can be used in IT block. Only unconditional branch
can be used as the last instruction in IT block, and it isn't related
to instruction
length. So we don't need to care about branch instruction encoding in IT block.

>
>        (i < num_saves && (hi_reg == 0)) - what's the point of going through
> the register list when hi_reg != 0 in this case ? A comment to explain that
> the length calculation *must* be kept in sync with the template above might
> be useful.
done.

The following patch has been tested on arm qemu.


ChangeLog:
2011-03-31  Wei Guozhi  

PR target/47855
* config/arm/arm.md (arm_cmpsi_insn): Compute attr "length".
(arm_cond_branch): Likewise.
(arm_cond_branch_reversed): Likewise.
(arm_jump): Likewise.
(push_multi): Likewise.
* config/arm/constraints.md (Py): New constraint.


Index: constraints.md
===
--- constraints.md  (revision 171337)
+++ constraints.md  (working copy)
@@ -31,7 +31,7 @@
 ;; The following multi-letter normal constraints have been used:
 ;; in ARM/Thumb-2 state: Da, Db, Dc, Dn, Dl, DL, Dv, Dy, Di, Dz
 ;; in Thumb-1 state: Pa, Pb, Pc, Pd
-;; in Thumb-2 state: Ps, Pt, Pu, Pv, Pw, Px
+;; in Thumb-2 state: Ps, Pt, Pu, Pv, Pw, Px, Py

 ;; The following memory constraints have been used:
 ;; in ARM/Thumb-2 state: Q, Ut, Uv, Uy, Un, Um, Us
@@ -189,6 +189,11 @@
   (and (match_code "const_int")
(match_test "TARGET_THUMB2 && ival >= -7 && ival <= -1")))

+(define_constraint "Py"
+  "@internal In Thumb-2 state a constant in the range 0 to 255"
+  (and (match_code "const_int")
+   (match_test "TARGET_THUMB2 && ival >= 0 && ival <= 255")))
+
 (define_constraint "G"
  "In ARM/Thumb-2 state a valid FPA immediate constant."
  (and (match_code "const_double")
Index: arm.md
===
--- arm.md  (revision 171337)
+++ arm.md  (working copy)
@@ -7109,13 +7109,21 @@

 (define_insn "*arm_cmpsi_insn"
   [(set (reg:CC CC_REGNUM)
-   (compare:CC (match_operand:SI 0 "s_register_operand" "r,r")
-   (match_operand:SI 1 "arm_add_operand""rI,L")))]
+   (compare:CC (match_operand:SI 0 "s_register_operand" "l,r,r,r")
+   (match_operand:SI 1 "arm_add_operand""Py,r,I,L")))]
   "TARGET_32BIT"
   "@
cmp%?\\t%0, %1
+   cmp%?\\t%0, %1
+   cmp%?\\t%0, %1
cmn%?\\t%0, #%n1"
-  [(set_attr "conds" "set")]
+  [(set_attr "conds" "set")
+   (set (attr "length")
+ (if_then_else
+   (and (ne (symbol_ref "TARGET_THUMB2") (const_int 0))
+   (lt (symbol_ref "which_alternative") (const_int 2)))
+   (const_int 2)
+   (const_int 4)))]
 )

 (define_insn "*cmpsi_shiftsi"
@@ -7286,7 +7294,14 @@
   return \"b%d1\\t%l0\";
   "
   [(set_attr "conds" "use")
-   (set_attr "type" "branch")]
+   (set_attr "type" "branch")
+   (set (attr "length")
+   (if_then_else
+  (and (ne (symbol_ref "TARGET_THUMB2") (const_int 0))
+   (and (ge (minus (match_dup 0) (pc)) (const_int -250))
+(le (minus (match_dup 0) (pc)) (const_int 256
+  (const_int 2)
+  (const_int 4)))]
 )

 (define_insn "*arm_cond_branch_reversed"
@@ -7305,7 +7320,14 @@
   return \"b%D1\\t%l0\";
   "
   [(set_attr "conds" "use")
-   (set_attr "type" "branch")]
+   (set_attr "type" "branch")
+   (set (attr "length")
+   (if_then_else
+  (and (ne (symbol_ref "TARGET_THUMB2") (const_int 0))
+   (and (ge (minus (match_dup 0) (pc)) (const_int -250))
+(le (minus (match_dup 0) (pc)) (const_int 256
+  (const_int 2)
+  (const_int 4)))]
 )



@@ -7757,7 +7779,14 @@
 return \"b%?\\t%l0\";
   }
   "
-  [(set_attr "predicable

Random cleanups [4/4]: Streamlining streamer

2011-03-30 Thread Michael Matz
Hi,

I fear I wasn't as thorough in also splitting this one into several 
patches, but the different cleanups are at least mostly in different 
files.  They are:

* lto-lang remembers all builtin decls in a local list, to be returned
  by the getdecls langhook.  But as we have our own write_globals langhook
  this isn't actually called (except by dbxout.c), so there's no point in 
  remembering.

* lto.c:lto_materialize_function has code to read in the function body 
  sections, do something with them in non-wpa mode, and discard them then.
  There's no point in even reading them in in non-wpa mode (except for a 
  dubious error message that rather is worth an assert).

* gimple.c:gimple_type_leader_entry is a fixed length cache for speeding 
  up our type merging machinery.  It can hold references to many meanwhile 
  merged trees, interferring with the wish of free up much memory with a 
  ggc_collect with early-merging LTO.  We can simply make it deletable.

* ipa-inline.c: some tidying in not calling a macro with function call 
  arguments, and calling a costly function only after early-outs.

* lto-streamer-out.c : it writes out and compares strings character by 
  character.  memcmp and output_data_stream work just as well

* lto-streamer: output_unreferenced_globals writes out all global varpool 
  decls.  The reading side simply reads over all of them, and ignores 
  them.  This was supposed to help symbol resolution, and it probably once 
  did.  But since some time we properly emit varpool and cgraph nodes, and 
  references between them, and a proper symtab.  There's no need for 
  emitting these trees again.

* lto-streamer: the following changes the bytecode:
  1: all indices  into the cache are unsigned, hence we should say 
 so, instead of casting casts back and forth
  2: trees are only appended to the cache, when writing out.  When reading
 in we read in all trees in the stream one after the other, also 
 appending to the cache.  References to existing trees _always_ are to 
 - well - existing trees, hence to those already emitted earlier in 
 the stream, i.e. with a smaller offset, and more importantly with a 
 known index even at reader side.

 So, the offset never is used, so remove that and all associated 
 tracking and params.
  3: for the same reason we also don't need to stream the index that new
 trees get in the cache.  They will get exactly the ones they also had 
 when writing out.  We could use it as consistency check, but we 
 stream the expected tree-node for back-references for that already.

 Obviously we do need to stream the index in back references (aka 
 pickled references).

 (the index could change if there's a different set of nodes preloaded 
 into the cache between writing out and reading in.  But that would 
 have much worse problems already, silently overwriting slots with 
 trees from the stream; we should do away with the preloaded nodes,
 and instead rely on type merging to get canonical versions of the 
 builtin trees)

Not streaming offset and index for most trees obviously shortens the 
bytecode somewhat but I don't have statistics on how much.  Not much would 
be my guess.

Regstrapped on x86_64-linux with the other three cleanups.  Okay for 
trunk?


Ciao,
Michael.
-- 
* lto-streamer.h (struct lto_streamer_cache_d): Remove offsets
and next_slot members.
(lto_streamer_cache_insert, lto_streamer_cache_insert_at,
lto_streamer_cache_lookup, lto_streamer_cache_get): Adjust prototypes.
(lto_streamer_cache_append): Declare.
* lto-streamer.c (lto_streamer_cache_add_to_node_array): Use
unsigned index, remove offset parameter, ensure that we append
or update existing entries.
(lto_streamer_cache_insert_1): Use unsigned index, remove offset_p
parameter, update next_slot for append.
(lto_streamer_cache_insert): Use unsigned index, remove offset_p
parameter.
(lto_streamer_cache_insert_at): Likewise.
(lto_streamer_cache_append): New function.
(lto_streamer_cache_lookup): Use unsigned index.
(lto_streamer_cache_get): Likewise.
(lto_record_common_node): Don't test tree_node_can_be_shared.
(preload_common_node): Adjust call to lto_streamer_cache_insert.
(lto_streamer_cache_delete): Don't free offsets member.
* lto-streamer-out.c (eq_string_slot_node): Use memcmp.
(lto_output_string_with_length): Use lto_output_data_stream.
(lto_output_tree_header): Remove ix parameter, don't write it.
(lto_output_builtin_tree): Likewise.
(lto_write_tree): Adjust callers to above, don't track and write
offset, write unsigned index.
(output_unreferenced_globals): Don't emit all global vars.
(write_global_references): Use unsigned indices.
(lto_output_decl_state_refs): Likewise.
(write_sym

[pph] Split streamer into pph-sreamer-out.c and pph-streamer-in.c (issue4333047)

2011-03-30 Thread Diego Novillo

No functional changes.  Just some refactoring to stop mixing up
reading and writing routines in the same file.

Tested on x86_64.

cp/ChangeLog.pph
2011-03-30  Diego Novillo  

* Make-lang.in (CXX_AND_OBJCXX_OBJS): Add cp/pph-streamer-out.o
and cp/pph-streamer-in.o.
(cp/pph-streamer-out.o): New.
(cp/pph-streamer-in.o): New.
* pph-streamer.c (current_pph_file): Move to cp/pph-streamer-out.c.
(pph_stream_init_write): Likewise.
(pph_stream_write_ld_base): Likewise.
(pph_stream_write_ld_min): Likewise.
(pph_stream_write_tree_vec): Likewise.
(pph_stream_write_cxx_binding_1): Likewise.
(pph_stream_write_cxx_binding): Likewise.
(pph_stream_write_class_binding): Likewise.
(pph_stream_write_label_binding): Likewise.
(pph_stream_write_binding_level): Likewise.
(pph_stream_write_c_language_function): Likewise.
(pph_stream_write_language_function): Likewise.
(pph_stream_write_ld_fn): Likewise.
(pph_stream_write_ld_ns): Likewise.
(pph_stream_write_ld_parm): Likewise.
(pph_stream_write_lang_specific_data): Likewise.
(pph_stream_write_tree): Likewise.
(pph_stream_pack_value_fields): Likewise.
(pph_stream_begin_section): Likewise.
(pph_stream_write): Likewise.
(pph_stream_end_section): Likewise.
(pph_stream_write_header): Likewise.
(pph_stream_write_body): Likewise.
(pp_stream_flush_buffers): New.  Factor out of pph_stream_close.
(pph_get_section_data): Move to cp/pph-streamer-in.c.
(pph_free_section_data): Likewise.
(pph_stream_init_read): Likewise.
(pph_stream_read_tree): Likewise.
(pph_stream_unpack_value_fields): Likewise.
* pph-streamer.h (pph_stream_flush_buffers): Declare.
(pph_stream_init_write): Declare.
(pph_stream_write_tree): Declare.
(pph_stream_pack_value_fields): Declare.
(pph_stream_init_read): Declare.
(pph_stream_read_tree): Declare.
(pph_stream_unpack_value_fields): Declare.
* pph-streamer-out.c: New file.
* pph-streamer-in.c: New file.

diff --git a/gcc/cp/Make-lang.in b/gcc/cp/Make-lang.in
index 70548c6..143df46 100644
--- a/gcc/cp/Make-lang.in
+++ b/gcc/cp/Make-lang.in
@@ -81,7 +81,8 @@ CXX_AND_OBJCXX_OBJS = cp/call.o cp/decl.o cp/expr.o cp/pt.o 
cp/typeck2.o \
  cp/typeck.o cp/cvt.o cp/except.o cp/friend.o cp/init.o cp/method.o \
  cp/search.o cp/semantics.o cp/tree.o cp/repo.o cp/dump.o cp/optimize.o \
  cp/mangle.o cp/cp-objcp-common.o cp/name-lookup.o cp/cxx-pretty-print.o \
- cp/cp-gimplify.o cp/pph.o cp/pph-streamer.o tree-mudflap.o $(CXX_C_OBJS)
+ cp/cp-gimplify.o cp/pph.o cp/pph-streamer.o cp/pph-streamer-out.o \
+ cp/pph-streamer-in.o tree-mudflap.o $(CXX_C_OBJS)
 
 # Language-specific object files for C++.
 CXX_OBJS = cp/cp-lang.o c-family/stub-objc.o $(CXX_AND_OBJCXX_OBJS)
@@ -337,3 +338,11 @@ cp/pph.o: cp/pph.c $(CONFIG_H) $(SYSTEM_H) coretypes.h 
$(CPPLIB_H) \
 cp/pph-streamer.o: cp/pph-streamer.c $(CONFIG_H) $(SYSTEM_H) coretypes.h \
$(TREE_H) tree-pretty-print.h $(LTO_STREAMER_H) $(CXX_PPH_STREAMER_H) \
$(CXX_PPH_H) $(TREE_PASS_H) version.h cppbuiltin.h tree-iterator.h
+cp/pph-streamer-out.o: cp/pph-streamer-out.c $(CONFIG_H) $(SYSTEM_H) \
+   coretypes.h $(TREE_H) tree-pretty-print.h $(LTO_STREAMER_H) \
+   $(CXX_PPH_STREAMER_H) $(CXX_PPH_H) $(TREE_PASS_H) version.h \
+   cppbuiltin.h tree-iterator.h
+cp/pph-streamer-in.o: cp/pph-streamer-in.c $(CONFIG_H) $(SYSTEM_H) \
+   coretypes.h $(TREE_H) tree-pretty-print.h $(LTO_STREAMER_H) \
+   $(CXX_PPH_STREAMER_H) $(CXX_PPH_H) $(TREE_PASS_H) version.h \
+   cppbuiltin.h tree-iterator.h
diff --git a/gcc/cp/pph-streamer-in.c b/gcc/cp/pph-streamer-in.c
new file mode 100644
index 000..aaa0a6c
--- /dev/null
+++ b/gcc/cp/pph-streamer-in.c
@@ -0,0 +1,155 @@
+/* Routines for reading PPH data.
+   Copyright (C) 2011 Free Software Foundation, Inc.
+   Contributed by Diego Novillo .
+
+   This file is part of GCC.
+
+   GCC is free software; you can redistribute it and/or modify it
+   under the terms of the GNU General Public License as published by
+   the Free Software Foundation; either version 3, or (at your option)
+   any later version.
+
+   GCC is distributed in the hope that it will be useful, but
+   WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+   General Public License for more details.
+
+You should have received a copy of the GNU General Public License
+along with GCC; see the file COPYING3.  If not see
+.  */
+
+#include "config.h"
+#include "system.h"
+#include "coretypes.h"
+#include "tree.h"
+#include "langhooks.h"
+#include "tree-iterator.h"
+#include "tree-pretty-print.h"
+#include "lto-streamer.h"
+#include "pph-streamer.h"
+#include "pph.h"
+

Re: Random cleanups [3/4]: zero out DECL_VINDEX field

2011-03-30 Thread Diego Novillo
On Wed, Mar 30, 2011 at 21:32, Michael Matz  wrote:

>        * tree.c (free_lang_data_in_decl): Zero DECL_VINDEX if it's not
>        an integer.
>        * tree.h (tree_decl_non_common.vindex): Adjust comment.

OK.


Diego.


Random cleanups [3/4]: zero out DECL_VINDEX field

2011-03-30 Thread Michael Matz
Hi,

I noticed this while working on early-merging LTO.  The DECL_VINDEX slot 
of FUNCTION_DECLs is supposed to hold the numeric index of the vtable 
slot if it's a virtual function.  During parsing the C++ frontend uses it 
to hold a reference to itself, which then later is supposed to be 
transformed into the real index.  Sometimes the C++ frontend copies 
function_decls, leaving the old one reachable via various means (but not 
as a reachable function, merely in scopes and the like), but fixes up 
only the copy.

Instead of diddling with the frontend we can just as well zero out 
meaningless DECL_VINDEX slots (i.e. when they're not an integer).  It's 
used for debug info generation (when it's an integer) and as a flag ala 
"this function is a virtual method" by the middle-end.

Regstrapped on x86_64-linux with the other three cleanups.  Okay for 
trunk?


Ciao,
Michael.
-- 
* tree.c (free_lang_data_in_decl): Zero DECL_VINDEX if it's not
an integer.
* tree.h (tree_decl_non_common.vindex): Adjust comment.

Index: tree.c
===
--- tree.c  (revision 171537)
+++ tree.c  (working copy)
@@ -4647,6 +4647,13 @@ free_lang_data_in_decl (tree decl)
  && RECORD_OR_UNION_TYPE_P
   (DECL_CONTEXT (DECL_ABSTRACT_ORIGIN (decl
DECL_ABSTRACT_ORIGIN (decl) = NULL_TREE;
+
+  /* Sometimes the C++ frontend doesn't manage to transform a temporary
+ DECL_VINDEX referring to itself into a vtable slot number as it
+should.  Happens with functions that are copied and then forgotten
+about.  Just clear it, it won't matter anymore.  */
+  if (DECL_VINDEX (decl) && !host_integerp (DECL_VINDEX (decl), 0))
+   DECL_VINDEX (decl) = NULL_TREE;
 }
   else if (TREE_CODE (decl) == VAR_DECL)
 {
Index: tree.h
===
--- tree.h  (revision 171537)
+++ tree.h  (working copy)
@@ -3228,7 +3233,7 @@ struct GTY(())
   tree arguments;
   /* Almost all FE's use this.  */
   tree result;
-  /* C++ uses this in namespaces.  */
+  /* C++ uses this in namespaces and function_decls.  */
   tree vindex;
 };
 


Re: Random cleanups [1/4]

2011-03-30 Thread Diego Novillo
On Wed, Mar 30, 2011 at 21:16, Michael Matz  wrote:

>        * tree.c (decl_init_priority_insert): Don't create entry for
>        default priority.
>        (decl_fini_priority_insert): Ditto.
>        (fields_compatible_p, find_compatible_field): Remove.
>        * tree.h (fields_compatible_p, find_compatible_field): Remove.
>        * gimple.c (gimple_compare_field_offset): Adjust block comment.

OK.


Diego.


Random cleanups [2/4]: canonicalize ctor values

2011-03-30 Thread Michael Matz
Hi,

this came up when looking into why the static ctors contain useless trees 
(like casts).  We can simply canonicalize them while varpool analyzes 
pending decls.  It'll look at initialzers once, where we can "gimplify" 
them.  This requires making canonicalize_constructor_val be able to be 
called outside of functions.  And it requires the java frontend not 
leaving a dangling function decl as current (for the static ctor function 
it generates).

Regstrapped on x86_64-linux with the other three cleanups.  Okay for 
trunk?


Ciao,
Michael.
-- 
* cgraphbuild.c (record_reference): Canonicalize constructor
values.
* gimple-fold.c (canonicalize_constructor_val): Accept being called
without function context.

java/
* jcf-parse.c (java_emit_static_constructor): Clear cfun and
current_function_decl.

Index: cgraphbuild.c
===
--- cgraphbuild.c   (revision 171537)
+++ cgraphbuild.c   (working copy)
@@ -53,6 +53,8 @@ record_reference (tree *tp, int *walk_su
   tree decl;
   struct record_reference_ctx *ctx = (struct record_reference_ctx *)data;
 
+restart:
+
   switch (TREE_CODE (t))
 {
 case VAR_DECL:
@@ -98,6 +100,15 @@ record_reference (tree *tp, int *walk_su
  break;
}
 
+  t = canonicalize_constructor_val (t);
+  if (t && t != *tp)
+   {
+ *tp = t;
+ goto restart;
+   }
+  else
+   t = *tp;
+
   if ((unsigned int) TREE_CODE (t) >= LAST_AND_UNUSED_TREE_CODE)
return lang_hooks.callgraph.analyze_expr (tp, walk_subtrees);
   break;
Index: gimple-fold.c
===
--- gimple-fold.c   (revision 171537)
+++ gimple-fold.c   (working copy)
@@ -106,7 +106,7 @@ can_refer_decl_in_current_unit_p (tree d
   return true;
 }
 
-/* CVAL is value taken from DECL_INITIAL of variable.  Try to transorm it into
+/* CVAL is value taken from DECL_INITIAL of variable.  Try to transform it into
acceptable form for is_gimple_min_invariant.   */
 
 tree
@@ -131,7 +131,7 @@ canonicalize_constructor_val (tree cval)
  || TREE_CODE (base) == FUNCTION_DECL)
  && !can_refer_decl_in_current_unit_p (base))
return NULL_TREE;
-  if (base && TREE_CODE (base) == VAR_DECL)
+  if (cfun && base && TREE_CODE (base) == VAR_DECL)
add_referenced_var (base);
   /* We never have the chance to fixup types in global initializers
  during gimplification.  Do so here.  */
Index: java/jcf-parse.c
===
--- java/jcf-parse.c.orig   2011-03-26 02:19:03.0 +0100
+++ java/jcf-parse.c2011-03-28 06:16:43.0 +0200
@@ -1725,6 +1725,8 @@ java_emit_static_constructor (void)
   DECL_STATIC_CONSTRUCTOR (decl) = 1;
   java_genericize (decl);
   cgraph_finalize_function (decl, false);
+  current_function_decl = NULL;
+  set_cfun (NULL);
 }
 }
 


Random cleanups [1/4]

2011-03-30 Thread Michael Matz
Hi,

in order to reduce the size of my LTO early-merging patch I'll post some 
cleanups separately.  This one removes two functions that are not called 
from anywhere and makes not allocate priority structs for all decls that 
set them only to the DEFAULT_INIT_PRIORITY.

Regstrapped on x86_64-linux together with the next three cleanup patches.  
Okay for trunk?


Ciao,
Michael.
-- 
* tree.c (decl_init_priority_insert): Don't create entry for
default priority.
(decl_fini_priority_insert): Ditto.
(fields_compatible_p, find_compatible_field): Remove.
* tree.h (fields_compatible_p, find_compatible_field): Remove.
* gimple.c (gimple_compare_field_offset): Adjust block comment.

Index: tree.c
===
--- tree.c  (revision 171537)
+++ tree.c  (working copy)
@@ -5913,6 +5920,8 @@ decl_init_priority_insert (tree decl, pr
   struct tree_priority_map *h;
 
   gcc_assert (VAR_OR_FUNCTION_DECL_P (decl));
+  if (priority == DEFAULT_INIT_PRIORITY)
+return;
   h = decl_priority_info (decl);
   h->init = priority;
 }
@@ -5925,6 +5934,8 @@ decl_fini_priority_insert (tree decl, pr
   struct tree_priority_map *h;
 
   gcc_assert (TREE_CODE (decl) == FUNCTION_DECL);
+  if (priority == DEFAULT_INIT_PRIORITY)
+return;
   h = decl_priority_info (decl);
   h->fini = priority;
 }
@@ -9893,50 +9904,6 @@ needs_to_live_in_memory (const_tree t)
  && aggregate_value_p (t, current_function_decl)));
 }
 
-/* There are situations in which a language considers record types
-   compatible which have different field lists.  Decide if two fields
-   are compatible.  It is assumed that the parent records are compatible.  */
-
-bool
-fields_compatible_p (const_tree f1, const_tree f2)
-{
-  if (!operand_equal_p (DECL_FIELD_BIT_OFFSET (f1),
-   DECL_FIELD_BIT_OFFSET (f2), OEP_ONLY_CONST))
-return false;
-
-  if (!operand_equal_p (DECL_FIELD_OFFSET (f1),
-DECL_FIELD_OFFSET (f2), OEP_ONLY_CONST))
-return false;
-
-  if (!types_compatible_p (TREE_TYPE (f1), TREE_TYPE (f2)))
-return false;
-
-  return true;
-}
-
-/* Locate within RECORD a field that is compatible with ORIG_FIELD.  */
-
-tree
-find_compatible_field (tree record, tree orig_field)
-{
-  tree f;
-
-  for (f = TYPE_FIELDS (record); f ; f = TREE_CHAIN (f))
-if (TREE_CODE (f) == FIELD_DECL
-   && fields_compatible_p (f, orig_field))
-  return f;
-
-  /* ??? Why isn't this on the main fields list?  */
-  f = TYPE_VFIELD (record);
-  if (f && TREE_CODE (f) == FIELD_DECL
-  && fields_compatible_p (f, orig_field))
-return f;
-
-  /* ??? We should abort here, but Java appears to do Bad Things
- with inherited fields.  */
-  return orig_field;
-}
-
 /* Return value of a constant X and sign-extend it.  */
 
 HOST_WIDE_INT
Index: tree.h
===
--- tree.h  (revision 171537)
+++ tree.h  (working copy)
@@ -5227,9 +5232,6 @@ extern bool subrange_type_for_debug_p (c
 extern HOST_WIDE_INT int_cst_value (const_tree);
 extern HOST_WIDEST_INT widest_int_cst_value (const_tree);
 
-extern bool fields_compatible_p (const_tree, const_tree);
-extern tree find_compatible_field (tree, tree);
-
 extern tree *tree_block (tree);
 extern location_t *block_nonartificial_location (tree);
 extern location_t tree_nonartificial_location (tree);
Index: gimple.c
===
--- gimple.c.orig   2011-03-28 03:55:39.0 +0200
+++ gimple.c2011-03-28 03:56:13.0 +0200
@@ -3306,8 +3306,7 @@ compare_type_names_p (tree t1, tree t2,
 
 /* Return true if the field decls F1 and F2 are at the same offset.
 
-   This is intended to be used on GIMPLE types only.  In order to
-   compare GENERIC types, use fields_compatible_p instead.  */
+   This is intended to be used on GIMPLE types only.  */
 
 bool
 gimple_compare_field_offset (tree f1, tree f2)


a patch to fix PR 48367

2011-03-30 Thread Vladimir Makarov
The following patch solves the problem.  The reason for the problem is 
described on


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=48367

The patch has been committed as obvious.

2011-03-30  Vladimir Makarov 

PR middle-end/48367
* ira-costs.c (find_costs_and_classes): Fix a typo in i_mem_cost
calculation.

Index: ira-costs.c
===
--- ira-costs.c (revision 171649)
+++ ira-costs.c (working copy)
@@ -1652,7 +1652,7 @@ find_costs_and_classes (FILE *dump_file)
i_costs[k] += add_cost;
}
  add_cost = COSTS (costs, a_num)->mem_cost;
- if (add_cost && INT_MAX - add_cost < i_mem_cost)
+ if (add_cost > 0 && INT_MAX - add_cost < i_mem_cost)
i_mem_cost = INT_MAX;
  else
i_mem_cost += add_cost;
@@ -1887,7 +1887,7 @@ process_bb_node_for_hard_reg_moves (ira_
   ALLOCNO_HARD_REG_COSTS (a)[i] -= cost;
   ALLOCNO_CONFLICT_HARD_REG_COSTS (a)[i] -= cost;
   ALLOCNO_CLASS_COST (a) = MIN (ALLOCNO_CLASS_COST (a),
- ALLOCNO_HARD_REG_COSTS (a)[i]);
+   ALLOCNO_HARD_REG_COSTS (a)[i]);
 }
 }




Re: Continue toplevel cleanup (GCC library handling for unsupported targets etc.)

2011-03-30 Thread Hans-Peter Nilsson
On Tue, 29 Mar 2011, Joseph S. Myers wrote:
> Other cleanups here:

> cris*-*-none acts just like cris*-*-elf in
> config.gcc so it's appropriate to make the "*" subcase of cris*-*-* act
> like the -elf case;

'k.

> mmix-*-* disabled "libgloss", i.e. libgloss for the
> host, which is never built anyway.

Whoops, my bad.  Odd that it worked anyway, but I'm not certain
I added that due to actual breakage.

All bits fine by me.

brgds, H-P


Re: Move reg_equiv* arrays into a single VEC structure

2011-03-30 Thread Jeff Law
-BEGIN PGP SIGNED MESSAGE-
Hash: SHA1

On 03/30/11 10:51, Richard Sandiford wrote:
> Nice cleanup thanks.  Just noticed a couple of things:
> 
> Jeff Law  writes:
>> *** struct reload
>> *** 100,106 
>> int inc;
>> /* A reg for which reload_in is the equivalent.
>>If reload_in is a symbol_ref which came from
>> !  reg_equiv_constant, then this is the pseudo
>>which has that symbol_ref as equivalent.  */
>> rtx in_reg;
>> rtx out_reg;
>> --- 100,106 
>> int inc;
>> /* A reg for which reload_in is the equivalent.
>>If reload_in is a symbol_ref which came from
>> !  reg_equiv_consant, then this is the pseudo
>>which has that symbol_ref as equivalent.  */
>> rtx in_reg;
>> rtx out_reg;
> 
> Adds typo.
Yea.  I had changed the comment when I was using VEC_blah directly, then
introduced the typo when I went to using accessor macros and wanted to
change the comment back to its original form :(

I just checked in a fix for the typo.

> 
>> *** elimination_effects (rtx x, enum machine
>> *** 3002,3011 
>>}
>>   
>>  }
>> !   else if (reg_renumber[regno] < 0 && reg_equiv_constant
>> !   && reg_equiv_constant[regno]
>> !   && ! function_invariant_p (reg_equiv_constant[regno]))
>> !elimination_effects (reg_equiv_constant[regno], mem_mode);
>> return;
>>   
>>   case PRE_INC:
>> --- 2996,3006 
>>}
>>   
>>  }
>> !   else if (reg_renumber[regno] < 0
>> !   && reg_equiv_constant (0)
>> !   && reg_equiv_constant (regno)
>> !   && ! function_invariant_p (reg_equiv_constant (regno)))
>> !elimination_effects (reg_equiv_constant (regno), mem_mode);
>> return;
>>   
>>   case PRE_INC:
> 
> Looks like this should be s/reg_equiv_constant (0)/reg_equivs != 0/.
I thought I'd fixed these.  I certainly remember looking at them and
thinking that can't be right at some point.  I'll look at it again more
closely tomorrow and take appropriate corrective action.

Thanks for the feedback,
Jeff
-BEGIN PGP SIGNATURE-
Version: GnuPG v1.4.11 (GNU/Linux)
Comment: Using GnuPG with Fedora - http://enigmail.mozdev.org/

iQEcBAEBAgAGBQJNk8ZSAAoJEBRtltQi2kC7+qYH/R0i1/YC3efnLuQjZ0uieuCU
0b/sMDvP+0xngPbKKn9YviBuNhUv2/poPG1OOOQtolHeZ5c8rftecZKMRgjtmX9W
jYxhuY2OLBBTLLPo4eFDdBbEP/m90RGEBtumeIx1isPTOjQ3LVuBF+9GB+wbnr/W
OLox1MSfPT7GV3pbBeSMiHiKkw5VOeFKd4vBIbefWAPgjO0G8LFexBYWkw04j1F5
tXbuLn/cnSb4PLoRgrCxDv4XS8Wzx1YJcFtcnjn+a2t1HPUARkSYCSn8o94Yytub
k263YKwgRsqwxnrXmGght4+K2kuVjitANa6iX24DWTa+eJBruQ9ObA/f3Xs1P5E=
=SgX4
-END PGP SIGNATURE-


PR bootstrap/48371

2011-03-30 Thread Jeff Law
-BEGIN PGP SIGNED MESSAGE-
Hash: SHA1


Sigh.  I changed a reg_equiv_address to reg_equiv_mem when I changed
from explicit VEC_index references to macros.  That caused all kinds of
interesting problems for ia32 :-)

I'm also fixing a comment typo that was introduced around the same time.

Installed as obvious.  Fixes 48371.

x86_64-unknown-linux-gnu bootstrapped and regression tested.  i686
bootstrap in progress (and probably won't finish until tomorrow...)


-BEGIN PGP SIGNATURE-
Version: GnuPG v1.4.11 (GNU/Linux)
Comment: Using GnuPG with Fedora - http://enigmail.mozdev.org/

iQEcBAEBAgAGBQJNk8UuAAoJEBRtltQi2kC77goIAISRCD1qP9OA8KS1WPYtocFo
ZdxfOX0RA1tHZclEVPQbqkWaGxXW7jJbf1Wz8rFsMgQpeXv328NNDgfgmLTMhw6o
jxhuOjvNX7rSDYeKGjPBuU+K37co+oVaHi9x8X0ZhOkYUHG9u+hKLlLXUV9h5AGL
0RNY71nfZD4y7y4F6xAqFPGlnh8HCzXTaUbzTXXyEcTo3WH1rWO2HcZhL/5MJ2SR
aaoF0lbIC4GAm2kUXUYeSV1JYlzc7ZUZrnjZwKwFiK4UVcF90nJr6Ji7bGEbuRAN
3D0U7Lmda8tWZ0gzkhWlcwx4Ys0gBBoHJowDFXY9hvkff7FABa+raUsEkyO2rTc=
=dWZc
-END PGP SIGNATURE-
* reload1.c (reload): Fix botch in last change.
* reload.h (struct reload): Fix typo introduced in last change.

Index: reload1.c
===
*** reload1.c   (revision 171731)
--- reload1.c   (working copy)
*** reload (rtx first, int global)
*** 895,901 
 && (REGNO (XEXP (XEXP (x, 0), 0))
 < FIRST_PSEUDO_REGISTER)
 && CONSTANT_P (XEXP (XEXP (x, 0), 1
! reg_equiv_mem (i) = XEXP (x, 0), reg_equiv_mem (i) = 0;
else
  {
/* Make a new stack slot.  Then indicate that something
--- 895,901 
 && (REGNO (XEXP (XEXP (x, 0), 0))
 < FIRST_PSEUDO_REGISTER)
 && CONSTANT_P (XEXP (XEXP (x, 0), 1
! reg_equiv_address (i) = XEXP (x, 0), reg_equiv_mem (i) = 0;
else
  {
/* Make a new stack slot.  Then indicate that something
Index: reload.h
===
*** reload.h(revision 171731)
--- reload.h(working copy)
*** struct reload
*** 100,106 
int inc;
/* A reg for which reload_in is the equivalent.
   If reload_in is a symbol_ref which came from
!  reg_equiv_consant, then this is the pseudo
   which has that symbol_ref as equivalent.  */
rtx in_reg;
rtx out_reg;
--- 100,106 
int inc;
/* A reg for which reload_in is the equivalent.
   If reload_in is a symbol_ref which came from
!  reg_equiv_constant, then this is the pseudo
   which has that symbol_ref as equivalent.  */
rtx in_reg;
rtx out_reg;


[pph] Fix some names (issue4277091)

2011-03-30 Thread Diego Novillo

I had misnamed the functions to convert output_block to pph_stream.

cp/ChangeLog.pph
2011-03-30  Diego Novillo  

* pph-streamer.h (pph_get_pph_stream): Rename from
pph_get_ob_stream.  Update all users.
(pph_set_pph_stream): Rename from pph_set_pph_stream.  Update
all users.
(pph_output_tree): Remove ATTRIBUTE_UNUSED.
(pph_input_tree): Remove ATTRIBUTE_UNUSED.

diff --git a/gcc/cp/pph-streamer.c b/gcc/cp/pph-streamer.c
index 125c4e7..e7f0423 100644
--- a/gcc/cp/pph-streamer.c
+++ b/gcc/cp/pph-streamer.c
@@ -135,7 +135,7 @@ pph_stream_init_write (pph_stream *stream)
   lto_push_out_decl_state (stream->out_state);
   stream->decl_state_stream = XCNEW (struct lto_output_stream);
   stream->ob = create_output_block (LTO_section_decls);
-  pph_set_ob_stream (stream->ob, stream);
+  pph_set_pph_stream (stream->ob, stream);
 }
 
 
@@ -148,7 +148,7 @@ pph_stream_write_ld_base (struct output_block *ob, struct 
lang_decl_base *ldb)
 
   if (ldb == NULL)
 {
-  pph_output_uint (pph_get_ob_stream (ob), 0);
+  pph_output_uint (pph_get_pph_stream (ob), 0);
   return;
 }
 
@@ -178,7 +178,7 @@ pph_stream_write_ld_min (struct output_block *ob, struct 
lang_decl_min *ldm,
 {
   if (ldm == 0)
 {
-  pph_output_uint (pph_get_ob_stream (ob), 0);
+  pph_output_uint (pph_get_pph_stream (ob), 0);
   return;
 }
 
@@ -224,7 +224,7 @@ pph_stream_write_cxx_binding_1 (struct output_block *ob, 
cxx_binding *cb,
 
   if (cb == NULL)
 {
-  pph_output_uint (pph_get_ob_stream (ob), 0);
+  pph_output_uint (pph_get_pph_stream (ob), 0);
   return;
 }
 
@@ -250,7 +250,7 @@ pph_stream_write_cxx_binding (struct output_block *ob, 
cxx_binding *cb,
 
   if (cb == NULL)
 {
-  pph_output_uint (pph_get_ob_stream (ob), 0);
+  pph_output_uint (pph_get_pph_stream (ob), 0);
   return;
 }
 
@@ -276,7 +276,7 @@ pph_stream_write_class_binding (struct output_block *ob,
 {
   if (cb == NULL)
 {
-  pph_output_uint (pph_get_ob_stream (ob), 0);
+  pph_output_uint (pph_get_pph_stream (ob), 0);
   return;
 }
 
@@ -294,7 +294,7 @@ pph_stream_write_label_binding (struct output_block *ob,
 {
   if (lb == NULL)
 {
-  pph_output_uint (pph_get_ob_stream (ob), 0);
+  pph_output_uint (pph_get_pph_stream (ob), 0);
   return;
 }
 
@@ -317,7 +317,7 @@ pph_stream_write_binding_level (struct output_block *ob,
 
   if (bl == NULL)
 {
-  pph_output_uint (pph_get_ob_stream (ob), 0);
+  pph_output_uint (pph_get_pph_stream (ob), 0);
   return;
 }
 
@@ -368,7 +368,7 @@ pph_stream_write_c_language_function (struct output_block 
*ob,
 {
   if (clf == NULL)
 {
-  pph_output_uint (pph_get_ob_stream (ob), 0);
+  pph_output_uint (pph_get_pph_stream (ob), 0);
   return;
 }
 
@@ -390,7 +390,7 @@ pph_stream_write_language_function (struct output_block *ob,
 
   if (lf == NULL)
 {
-  pph_output_uint (pph_get_ob_stream (ob), 0);
+  pph_output_uint (pph_get_pph_stream (ob), 0);
   return;
 }
 
@@ -431,7 +431,7 @@ pph_stream_write_ld_fn (struct output_block *ob, struct 
lang_decl_fn *ldf,
 
   if (ldf == NULL)
 {
-  pph_output_uint (pph_get_ob_stream (ob), 0);
+  pph_output_uint (pph_get_pph_stream (ob), 0);
   return;
 }
 
@@ -465,7 +465,7 @@ pph_stream_write_ld_fn (struct output_block *ob, struct 
lang_decl_fn *ldf,
 gcc_unreachable ();
 
   if (ldf->pending_inline_p == 1)
-pth_save_token_cache (ldf->u.pending_inline_info, pph_get_ob_stream (ob));
+pth_save_token_cache (ldf->u.pending_inline_info, pph_get_pph_stream (ob));
   else if (ldf->pending_inline_p == 0)
 pph_stream_write_language_function (ob, ldf->u.saved_language_function,
ref_p);
@@ -483,7 +483,7 @@ pph_stream_write_ld_ns (struct output_block *ob, struct 
lang_decl_ns *ldns,
 
   if (ldns == NULL)
 {
-  pph_output_uint (pph_get_ob_stream (ob), 0);
+  pph_output_uint (pph_get_pph_stream (ob), 0);
   return;
 }
 
@@ -500,7 +500,7 @@ pph_stream_write_ld_parm (struct output_block *ob, struct 
lang_decl_parm *ldp)
 {
   if (ldp == NULL)
 {
-  pph_output_uint (pph_get_ob_stream (ob), 0);
+  pph_output_uint (pph_get_pph_stream (ob), 0);
   return;
 }
 
diff --git a/gcc/cp/pph-streamer.h b/gcc/cp/pph-streamer.h
index 5d888b3..9fb7ac3 100644
--- a/gcc/cp/pph-streamer.h
+++ b/gcc/cp/pph-streamer.h
@@ -100,7 +100,7 @@ extern void pth_save_token_cache (struct cp_token_cache *, 
pph_stream *);
 
 /* Output AST T to STREAM.  */
 static inline void
-pph_output_tree (pph_stream *stream ATTRIBUTE_UNUSED, tree t ATTRIBUTE_UNUSED)
+pph_output_tree (pph_stream *stream, tree t)
 {
   if (flag_pph_tracer)
 pph_stream_trace_tree (stream, t);
@@ -190,7 +190,7 @@ pph_input_string (pph_stream *stream)
 /* Load an AST from STREAM.  Return the corresponding tree.  */
 
 static inline tree
-pph_input_tree (pph_stream *

[x86] Fix PR target/48142

2011-03-30 Thread Eric Botcazou
Hi,

this is a regression present for x86-64 on mainline and 4.6 branch with the 
options -Os -mpreferred-stack-boundary=5 -fstack-check -fno-omit-frame-pointer.
This improbable combination of options is necessary because you need to have 
stack checking + stack realignment + !ACCUMULATE_OUTGOING_ARGS.  In this case, 
the DW_CFA_GNU_args_size CFIs must be correct in spite of the frame pointer.

Tested on {i586,x86_64}-suse-linux, OK for mainline and 4.6 branch?


2011-03-30  Eric Botcazou  

PR target/48142
* config/i386/i386.c (ix86_adjust_stack_and_probe): Differentiate
frame-related from frame-unrelated adjustments to the stack pointer.


2011-03-30  Eric Botcazou  

* g++.dg/other/pr48142.C: New test.


-- 
Eric Botcazou
// PR target/48142
// Testcase by Zdenek Sojka 

// { dg-do run { target i?86-*-* x86_64-*-* } }
// { dg-options "-Os -mpreferred-stack-boundary=5 -fstack-check -fno-omit-frame-pointer" }

int main()
{
  try { throw 0; }
  catch (...) {}
  return 0;
}
Index: config/i386/i386.c
===
--- config/i386/i386.c	(revision 171716)
+++ config/i386/i386.c	(working copy)
@@ -10006,7 +10006,7 @@ ix86_adjust_stack_and_probe (const HOST_
  probe that many bytes past the specified size to maintain a protection
  area at the botton of the stack.  */
   const int dope = 4 * UNITS_PER_WORD;
-  rtx size_rtx = GEN_INT (size);
+  rtx size_rtx = GEN_INT (size), last;
 
   /* See if we have a constant small number of probes to generate.  If so,
  that's the easy case.  The run-time loop is made up of 11 insns in the
@@ -10046,9 +10046,9 @@ ix86_adjust_stack_and_probe (const HOST_
   emit_stack_probe (stack_pointer_rtx);
 
   /* Adjust back to account for the additional first interval.  */
-  emit_insn (gen_rtx_SET (VOIDmode, stack_pointer_rtx,
-			  plus_constant (stack_pointer_rtx,
-	 PROBE_INTERVAL + dope)));
+  last = emit_insn (gen_rtx_SET (VOIDmode, stack_pointer_rtx,
+ plus_constant (stack_pointer_rtx,
+		PROBE_INTERVAL + dope)));
 }
 
   /* Otherwise, do the same as above, but in a loop.  Note that we must be
@@ -10109,15 +10109,33 @@ ix86_adjust_stack_and_probe (const HOST_
 	}
 
   /* Adjust back to account for the additional first interval.  */
-  emit_insn (gen_rtx_SET (VOIDmode, stack_pointer_rtx,
-			  plus_constant (stack_pointer_rtx,
-	 PROBE_INTERVAL + dope)));
+  last = emit_insn (gen_rtx_SET (VOIDmode, stack_pointer_rtx,
+ plus_constant (stack_pointer_rtx,
+		PROBE_INTERVAL + dope)));
 
   release_scratch_register_on_entry (&sr);
 }
 
   gcc_assert (cfun->machine->fs.cfa_reg != stack_pointer_rtx);
-  cfun->machine->fs.sp_offset += size;
+
+  /* Even if the stack pointer isn't the CFA register, we need to correctly
+ describe the adjustments made to it, in particular differentiate the
+ frame-related ones from the frame-unrelated ones.  */
+  if (size > 0)
+{
+  rtx expr = gen_rtx_SEQUENCE (VOIDmode, rtvec_alloc (2));
+  XVECEXP (expr, 0, 0)
+	= gen_rtx_SET (VOIDmode, stack_pointer_rtx,
+		   plus_constant (stack_pointer_rtx, -size));
+  XVECEXP (expr, 0, 1)
+	= gen_rtx_SET (VOIDmode, stack_pointer_rtx,
+		   plus_constant (stack_pointer_rtx,
+  PROBE_INTERVAL + dope + size));
+  add_reg_note (last, REG_FRAME_RELATED_EXPR, expr);
+  RTX_FRAME_RELATED_P (last) = 1;
+
+  cfun->machine->fs.sp_offset += size;
+}
 
   /* Make sure nothing is scheduled before we are done.  */
   emit_insn (gen_blockage ());


Re: [libgo] Replace wait4 by waitpid (PR go/47515)

2011-03-30 Thread Ian Lance Taylor
Rainer Orth  writes:

> Currently, libgo uses wait4 unconditionally, which is missing on IRIX
> 6.5.  Fortunately, the rusage * arg is used nowhere, so I've decide to
> replace wait4 by waitpid, which seems to be considerably more portable.

Thanks for the patch, but I wasn't entirely happy with the approach
because it removes the os.Waitmsg.Rusage field.  That field is
inherently system dependent but I would rather not get rid of it.  I
wrote this patch instead, which I hope will also solve the problem.
Bootstrapped and ran Go testsuite on x86_64-unknown-linux-gnu.
Committed to mainline.

Ian

diff -r a602eae61cf4 libgo/Makefile.am
--- a/libgo/Makefile.am	Wed Mar 30 15:33:32 2011 -0700
+++ b/libgo/Makefile.am	Wed Mar 30 16:01:55 2011 -0700
@@ -1246,13 +1246,20 @@
 endif # !LIBGO_IS_LINUX
 
 
-# Define ForkExec, PtraceForkExec, Exec, and Wait4.
+# Define ForkExec, PtraceForkExec, and Exec.
 if LIBGO_IS_RTEMS
 syscall_exec_os_file = syscalls/exec_stubs.go
 else
 syscall_exec_os_file = syscalls/exec.go
 endif
 
+# Define Wait4.
+if HAVE_WAIT4
+syscall_wait_file = syscalls/wait4.go
+else
+syscall_wait_file = syscalls/waitpid.go
+endif
+
 # Define Sleep.
 if LIBGO_IS_RTEMS
 syscall_sleep_file = syscalls/sleep_rtems.go
@@ -1329,6 +1336,7 @@
 	$(syscall_errstr_decl_file) \
 	syscalls/exec_helpers.go \
 	$(syscall_exec_os_file) \
+	$(syscall_wait_file) \
 	$(syscall_filesize_file) \
 	$(syscall_stat_file) \
 	$(syscall_sleep_file) \
diff -r a602eae61cf4 libgo/configure.ac
--- a/libgo/configure.ac	Wed Mar 30 15:33:32 2011 -0700
+++ b/libgo/configure.ac	Wed Mar 30 16:01:55 2011 -0700
@@ -381,8 +381,9 @@
 AC_CHECK_HEADERS(sys/mman.h syscall.h sys/epoll.h sys/ptrace.h sys/syscall.h sys/user.h sys/utsname.h)
 AM_CONDITIONAL(HAVE_SYS_MMAN_H, test "$ac_cv_header_sys_mman_h" = yes)
 
-AC_CHECK_FUNCS(srandom random strerror_r strsignal)
+AC_CHECK_FUNCS(srandom random strerror_r strsignal wait4)
 AM_CONDITIONAL(HAVE_STRERROR_R, test "$ac_cv_func_strerror_r" = yes)
+AM_CONDITIONAL(HAVE_WAIT4, test "$ac_cv_func_wait4" = yes)
 
 AC_CACHE_CHECK([for __sync_bool_compare_and_swap_4],
 [libgo_cv_func___sync_bool_compare_and_swap_4],
diff -r a602eae61cf4 libgo/mksysinfo.sh
--- a/libgo/mksysinfo.sh	Wed Mar 30 15:33:32 2011 -0700
+++ b/libgo/mksysinfo.sh	Wed Mar 30 16:01:55 2011 -0700
@@ -377,6 +377,8 @@
 nrusage="$nrusage $field;"
   done
   echo "type Rusage struct {$nrusage }" >> ${OUT}
+else
+  echo "type Rusage struct {}" >> ${OUT}
 fi
 
 # The utsname struct.
diff -r a602eae61cf4 libgo/syscalls/exec.go
--- a/libgo/syscalls/exec.go	Wed Mar 30 15:33:32 2011 -0700
+++ b/libgo/syscalls/exec.go	Wed Mar 30 16:01:55 2011 -0700
@@ -17,7 +17,6 @@
 func libc_dup2(int, int) int __asm__ ("dup2")
 func libc_execve(*byte, **byte, **byte) int __asm__ ("execve")
 func libc_sysexit(int) __asm__ ("_exit")
-func libc_wait4(Pid_t, *int, int, *Rusage) Pid_t __asm__ ("wait4")
 
 // Fork, dup fd onto 0..len(fd), and exec(argv0, argvv, envv) in child.
 // If a dup or exec fails, write the errno int to pipe.
@@ -263,16 +262,3 @@
 	libc_execve(StringBytePtr(argv0), &argv_arg[0], &envv_arg[0])
 	return GetErrno()
 }
-
-func Wait4(pid int, wstatus *WaitStatus, options int, rusage *Rusage) (wpid int, errno int) {
-	var status int
-	r := libc_wait4(Pid_t(pid), &status, options, rusage)
-	wpid = int(r)
-	if r < 0 {
-		errno = GetErrno()
-	}
-	if wstatus != nil {
-		*wstatus = WaitStatus(status)
-	}
-	return
-}
diff -r a602eae61cf4 libgo/syscalls/wait4.go
--- /dev/null	Thu Jan 01 00:00:00 1970 +
+++ b/libgo/syscalls/wait4.go	Wed Mar 30 16:01:55 2011 -0700
@@ -0,0 +1,22 @@
+// wait4.go -- Wait4 for systems with wait4.
+
+// Copyright 2011 The Go Authors. All rights reserved.
+// Use of this source code is governed by a BSD-style
+// license that can be found in the LICENSE file.
+
+package syscall
+
+func libc_wait4(Pid_t, *int, int, *Rusage) Pid_t __asm__ ("wait4")
+
+func Wait4(pid int, wstatus *WaitStatus, options int, rusage *Rusage) (wpid int, errno int) {
+	var status int
+	r := libc_wait4(Pid_t(pid), &status, options, rusage)
+	wpid = int(r)
+	if r < 0 {
+		errno = GetErrno()
+	}
+	if wstatus != nil {
+		*wstatus = WaitStatus(status)
+	}
+	return
+}
diff -r a602eae61cf4 libgo/syscalls/waitpid.go
--- /dev/null	Thu Jan 01 00:00:00 1970 +
+++ b/libgo/syscalls/waitpid.go	Wed Mar 30 16:01:55 2011 -0700
@@ -0,0 +1,22 @@
+// waitpid.go -- Wait4 for systems without wait4, but with waitpid.
+
+// Copyright 2011 The Go Authors. All rights reserved.
+// Use of this source code is governed by a BSD-style
+// license that can be found in the LICENSE file.
+
+package syscall
+
+func libc_waitpid(Pid_t, *int, int) Pid_t __asm__ ("waitpid")
+
+func Wait4(pid int, wstatus *WaitStatus, options int, rusage *Rusage) (wpid int, errno int) {
+	var status int
+	r := libc_waitpid(Pid_t(pid), &status, options)
+	wpid = int(r)
+	if r < 0 {
+		errno = GetErrno()
+	}
+	if wstatus != nil {
+		*wstatus = WaitStatus(status)
+	}
+	return
+}


[patch][cprop.c] Clean up hash table building

2011-03-30 Thread Steven Bosscher
Hi,

This is the first cleanup for cprop.c. These cleanups are only
possible now, thanks to splitting the CPROP code out from gcse.c

The first change is to not treat unfolded conditions as constant in
gcse_constant_p(). This never happens because:
- during hash table building any such condition should have been
simplified by any prior pass
- during cprop the expression table is not updated because the
patterns are unshared

This causes no changes in code generation for all cc1-i files on
x86_64-unknown-linux-gnu. I could add an assert to make sure this kind
of condition really never happens, but IMHO it's not worth it and the
bug would be elsewhere anyway.

The second change is to remove oprs_available_p(). After the
gcse_constant_p() cleanup, we do not have to traverse the whole
pattern of a SET candidate for CPROP, because we know that SET_DEST is
a REG and SET_SRC is a REG or a shareable constant. Therefore a simple
check to see if the registers involved were set in the block is
sufficient, and a regset can be used instead of the last_set table.
See reg_available_p().

The third change is to compute_hash_table_work(), which now traverses
the insns in each basic block just once (instead of twice), in reverse
order to record all registers set between BB_END and the current insn.
This change allows further simplify hash_scan_set() which now doesn't
have to look at INSN+1 anymore (saving another half insns stream
traversal by avoiding next_nonnote_nondebug_insn()).

Bootstrapped and tested on x86_64-unknown-linux-gnu. OK?

Ciao!
Steven
* cprop.c: Clean up hash table building.
(reg_avail_info): Remove.
(oprs_available_p): Remove.
(record_last_reg_set_info): Remove.
(record_last_set_info): Remove.
(reg_available_p): New function.
(gcse_constant_p): Do not treat unfolded conditions as constants.
(make_set_regs_unavailable): New function.
(hash_scan_set): Simplify with new reg_available_p.
(compute_hash_table_work): Traverse insns stream only once.
Do not compute reg_avail_info. Traverse insns in reverse order.
Record implicit sets after recording explicit sets from the block.

Index: cprop.c
===
*** cprop.c (revision 171627)
--- cprop.c (working copy)
*** static rtx *implicit_sets;
*** 118,124 
  
  /* Bitmap containing one bit for each register in the program.
 Used when performing GCSE to track which registers have been set since
!the start of the basic block.  */
  static regset reg_set_bitmap;
  
  /* Various variables for statistics gathering.  */
--- 118,124 
  
  /* Bitmap containing one bit for each register in the program.
 Used when performing GCSE to track which registers have been set since
!the start or end of the basic block while traversing that block.  */
  static regset reg_set_bitmap;
  
  /* Various variables for statistics gathering.  */
*** free_gcse_mem (void)
*** 183,261 
FREE_REG_SET (reg_set_bitmap);
  }
  
! struct reg_avail_info
! {
!   basic_block last_bb;
!   int last_set;
! };
! 
! static struct reg_avail_info *reg_avail_info;
! static basic_block current_bb;
! 
! /* Return nonzero if the operands of expression X are unchanged from
!INSN to the end of INSN's basic block.  */
  
  static int
! oprs_available_p (const_rtx x, const_rtx insn)
  {
!   int i, j;
!   enum rtx_code code;
!   const char *fmt;
! 
!   if (x == 0)
! return 1;
! 
!   code = GET_CODE (x);
!   switch (code)
! {
! case REG:
!   {
!   struct reg_avail_info *info = ®_avail_info[REGNO (x)];
! 
!   if (info->last_bb != current_bb)
! return 1;
!   return info->last_set < DF_INSN_LUID (insn);
!   }
! 
! case PRE_DEC:
! case PRE_INC:
! case POST_DEC:
! case POST_INC:
! case PRE_MODIFY:
! case POST_MODIFY:
!   return 0;
! 
! case PC:
! case CC0: /*FIXME*/
! case CONST:
! case CONST_INT:
! case CONST_DOUBLE:
! case CONST_FIXED:
! case CONST_VECTOR:
! case SYMBOL_REF:
! case LABEL_REF:
! case ADDR_VEC:
! case ADDR_DIFF_VEC:
!   return 1;
! 
! default:
!   break;
! }
! 
!   for (i = GET_RTX_LENGTH (code) - 1, fmt = GET_RTX_FORMAT (code); i >= 0; 
i--)
! {
!   if (fmt[i] == 'e')
!   {
! if (! oprs_available_p (XEXP (x, i), insn))
!   return 0;
!   }
!   else if (fmt[i] == 'E')
!   for (j = 0; j < XVECLEN (x, i); j++)
! if (! oprs_available_p (XVECEXP (x, i, j), insn))
!   return 0;
! }
! 
!   return 1;
  }
  
  /* Hash a set of register REGNO.
--- 183,195 
FREE_REG_SET (reg_set_bitmap);
  }
  
! /* Return nonzero if register X is unchanged from INSN to the end
!of INSN's basic block.  */
  
  static int
! reg_available_p (const_rtx x, const_rtx insn ATTRIBUTE_UNUSED)
  {
!   return ! REGNO_REG_SET_P (r

Re: [libgo] Provide strerror_r replacement (PR go/47515)

2011-03-30 Thread Ian Lance Taylor
Rainer Orth  writes:

> Apart from the lack of wait4, libgo on IRIX 6.5 contained an undefined
> reference to strerror_r.  This patch provides a replacement, based on
> gnulib/lib/strerror_r.c, but massively simplified.

I addressed this in a different way, with a new variant of
syscall.Errstr written in Go.  Bootstrapped and ran Go testsuite on
x86_64-unknown-linux-gnu.  Committed to mainline.

Ian

diff -r ad3d14acbce1 libgo/Makefile.am
--- a/libgo/Makefile.am	Wed Mar 30 14:45:29 2011 -0700
+++ b/libgo/Makefile.am	Wed Mar 30 15:31:01 2011 -0700
@@ -1264,7 +1264,11 @@
 if LIBGO_IS_RTEMS
 syscall_errstr_file = syscalls/errstr_rtems.go
 else
+if HAVE_STRERROR_R
 syscall_errstr_file = syscalls/errstr.go
+else
+syscall_errstr_file = syscalls/errstr_nor.go
+endif
 endif
 
 # Declare libc_strerror_r which is the Go name for strerror_r.
@@ -1273,7 +1277,7 @@
 syscall_errstr_decl_file = syscalls/errstr_decl_rtems.go
 else
 if LIBGO_IS_LINUX
-# In Linux the POSIX strerror_r is called __xpg_strerror_r.
+# On GNU/Linux the POSIX strerror_r is called __xpg_strerror_r.
 syscall_errstr_decl_file = syscalls/errstr_decl_linux.go
 else
 # On other systems we hope strerror_r is just strerror_r.
diff -r ad3d14acbce1 libgo/configure.ac
--- a/libgo/configure.ac	Wed Mar 30 14:45:29 2011 -0700
+++ b/libgo/configure.ac	Wed Mar 30 15:31:01 2011 -0700
@@ -380,7 +380,9 @@
 
 AC_CHECK_HEADERS(sys/mman.h syscall.h sys/epoll.h sys/ptrace.h sys/syscall.h sys/user.h sys/utsname.h)
 AM_CONDITIONAL(HAVE_SYS_MMAN_H, test "$ac_cv_header_sys_mman_h" = yes)
-AC_CHECK_FUNCS(srandom random strsignal)
+
+AC_CHECK_FUNCS(srandom random strerror_r strsignal)
+AM_CONDITIONAL(HAVE_STRERROR_R, test "$ac_cv_func_strerror_r" = yes)
 
 AC_CACHE_CHECK([for __sync_bool_compare_and_swap_4],
 [libgo_cv_func___sync_bool_compare_and_swap_4],
diff -r ad3d14acbce1 libgo/syscalls/errstr_nor.go
--- /dev/null	Thu Jan 01 00:00:00 1970 +
+++ b/libgo/syscalls/errstr_nor.go	Wed Mar 30 15:31:01 2011 -0700
@@ -0,0 +1,32 @@
+// errstr.go -- Error strings when there is no strerror_r.
+
+// Copyright 2011 The Go Authors. All rights reserved.
+// Use of this source code is governed by a BSD-style
+// license that can be found in the LICENSE file.
+
+package syscall
+
+import (
+	"sync"
+	"unsafe"
+)
+
+func libc_strerror(int) *byte __asm__ ("strerror")
+
+var errstr_lock sync.Mutex
+
+func Errstr(errno int) string {
+	errstr_lock.Lock()
+
+	bp := libc_strerror(errno)
+	b := (*[1000]byte)(unsafe.Pointer(bp))
+	i := 0
+	for b[i] != 0 {
+		i++
+	}
+	s := string(b[:i])
+
+	errstr_lock.Unlock()
+
+	return s
+}


Re: [libgo] Improve Solaris 2/SPARC support

2011-03-30 Thread Ian Lance Taylor
Rainer Orth  writes:

> * go-test.exp wasn't updated for the change from sparcv9 to sparc64.
>   While I still don't agree with the new name, at least the two should
>   be consistent.

I committed this earlier today.

> * env.go needs to accept sparc and sparc64.

I fixed this in a different way.

> * Just like 32-bit Solaris 2/x86, 32-bit Solaris 2/SPARC needs to use
>   the largefile variants of several functions.  I've not introduced a
>   new LIBGO_IS_SOLARIS32 conditional for that, but perhaps one should?

I just committed this one, as appended.

Thanks for sending them.

Ian

diff -r 50a941f17e57 libgo/Makefile.am
--- a/libgo/Makefile.am	Wed Mar 30 10:36:43 2011 -0700
+++ b/libgo/Makefile.am	Wed Mar 30 14:42:46 2011 -0700
@@ -689,8 +689,12 @@
 if LIBGO_IS_386
 go_os_dir_file = go/os/dir_largefile.go
 else
+if LIBGO_IS_SPARC
+go_os_dir_file = go/os/dir_largefile.go
+else
 go_os_dir_file = go/os/dir_regfile.go
 endif
+endif
 else
 if LIBGO_IS_LINUX
 go_os_dir_file = go/os/dir_largefile.go
@@ -1219,16 +1223,21 @@
 syscall_stat_file = syscalls/sysfile_stat_largefile.go
 else # !LIBGO_IS_LINUX
 if LIBGO_IS_SOLARIS
-# FIXME: Same for sparc vs. sparc64.  Introduce new/additional conditional?
 if LIBGO_IS_386
-# Use lseek64 on 386 Solaris.
+# Use lseek64 on 32-bit Solaris/x86.
 syscall_filesize_file = syscalls/sysfile_largefile.go
 syscall_stat_file = syscalls/sysfile_stat_largefile.go
-else # !LIBGO_IS_LINUX && LIBGO_IS_SOLARIS && !LIBGO_IS_386
-# Use lseek on amd64 Solaris.
+else # !LIBGO_IS_386
+if LIBGO_IS_SPARC
+# Use lseek64 on 32-bit Solaris/SPARC.
+syscall_filesize_file = syscalls/sysfile_largefile.go
+syscall_stat_file = syscalls/sysfile_stat_largefile.go
+else # !LIBGO_IS_386 && !LIBGO_IS_SPARC
+# Use lseek on 64-bit Solaris.
 syscall_filesize_file = syscalls/sysfile_regfile.go
 syscall_stat_file = syscalls/sysfile_stat_regfile.go
-endif # !LIBGO_IS_386
+endif # !LIBGO_IS_386 && !LIBGO_IS_SPARC
+endif # !LIBGO_IS_SOLARIS
 else # !LIBGO_IS_LINUX && !LIBGO_IS_SOLARIS
 # Use lseek by default.
 syscall_filesize_file = syscalls/sysfile_regfile.go


[google] Disable guality tests (issue4328047)

2011-03-30 Thread Diego Novillo

The guality tests are failing/passing intermittently.  This confuses
our builders which expect 0 failures and 0 xpasses.  When they "work",
it's not that they are working but dejagnu failed to launch gdb:

-
spawn [open ...]
gdb: took too long to attach
testcase [...]/gcc/testsuite/gcc.dg/guality/guality.exp
completed in 17 seconds
-

And when they do execute, they show a bunch of XPASS results:

-
XPASS: gcc.dg/guality/example.c  -O0  execution test
XPASS: gcc.dg/guality/example.c  -O1  execution test
XPASS: gcc.dg/guality/example.c  -O2  execution test
XPASS: gcc.dg/guality/example.c  -Os  execution test
XPASS: gcc.dg/guality/example.c  -O2 -flto -flto-partition=none execution test
XPASS: gcc.dg/guality/example.c  -O2 -flto  execution test
XPASS: gcc.dg/guality/guality.c  -O0  execution test
XPASS: gcc.dg/guality/guality.c  -O1  execution test
XPASS: gcc.dg/guality/guality.c  -O2  execution test
XPASS: gcc.dg/guality/guality.c  -O3 -fomit-frame-pointer  execution test
XPASS: gcc.dg/guality/guality.c  -O3 -g  execution test
XPASS: gcc.dg/guality/guality.c  -Os  execution test
XPASS: gcc.dg/guality/guality.c  -O2 -flto -flto-partition=none execution test
XPASS: gcc.dg/guality/guality.c  -O2 -flto  execution test
XPASS: gcc.dg/guality/inline-params.c  -O2  execution test
XPASS: gcc.dg/guality/inline-params.c  -O3 -fomit-frame-pointer execution test
XPASS: gcc.dg/guality/inline-params.c  -O3 -g  execution test
XPASS: gcc.dg/guality/inline-params.c  -Os  execution test
XPASS: gcc.dg/guality/pr41353-1.c  -O0  line 28 j == 28 + 37
XPASS: gcc.dg/guality/pr41353-1.c  -O1  line 28 j == 28 + 37
XPASS: gcc.dg/guality/pr41353-1.c  -O2  line 28 j == 28 + 37
XPASS: gcc.dg/guality/pr41353-1.c  -O3 -fomit-frame-pointer  line 28 j == 28 + 
37
XPASS: gcc.dg/guality/pr41353-1.c  -O3 -g  line 28 j == 28 + 37
XPASS: gcc.dg/guality/pr41353-1.c  -Os  line 28 j == 28 + 37
XPASS: gcc.dg/guality/pr41353-1.c  -O2 -flto -flto-partition=none line 28 j == 
28 + 37
XPASS: gcc.dg/guality/pr41353-1.c  -O2 -flto  line 28 j == 28 + 37
XPASS: gcc.dg/guality/pr41447-1.c  -O0  execution test
XPASS: gcc.dg/guality/pr41447-1.c  -O1  execution test
XPASS: gcc.dg/guality/pr41447-1.c  -O2  execution test
XPASS: gcc.dg/guality/pr41447-1.c  -O3 -fomit-frame-pointer  execution test
XPASS: gcc.dg/guality/pr41447-1.c  -O3 -g  execution test
XPASS: gcc.dg/guality/pr41447-1.c  -Os  execution test
XPASS: gcc.dg/guality/pr41616-1.c  -O0  execution test
XPASS: gcc.dg/guality/pr41616-1.c  -O1  execution test
XPASS: gcc.dg/guality/pr41616-1.c  -O2  execution test
XPASS: gcc.dg/guality/pr41616-1.c  -O3 -fomit-frame-pointer  execution test
XPASS: gcc.dg/guality/pr41616-1.c  -O3 -g  execution test
XPASS: gcc.dg/guality/pr41616-1.c  -Os  execution test
XPASS: gcc.dg/guality/sra-1.c  -O0  line 43 a.j == 14
XPASS: gcc.dg/guality/sra-1.c  -O1  line 43 a.j == 14
XPASS: gcc.dg/guality/sra-1.c  -O2  line 43 a.j == 14
XPASS: gcc.dg/guality/sra-1.c  -O3 -fomit-frame-pointer  line 43 a.j == 14
XPASS: gcc.dg/guality/sra-1.c  -O3 -g  line 43 a.j == 14
XPASS: gcc.dg/guality/sra-1.c  -Os  line 43 a.j == 14
XPASS: gcc.dg/guality/sra-1.c  -O2 -flto -flto-partition=none  line 43 a.j == 14
XPASS: gcc.dg/guality/sra-1.c  -O2 -flto  line 43 a.j == 14
-

Perhaps it's just a matter of removing the XFAILs, but I have seen
spurious FAILs in the past and I'm not sure why they are passing if
they were meant to fail.

Alex, are the above tests supposed to fail or not?  This is on x86_64
with dejagnu combinations unix/-m32 and unix/-m64.

For now, I am disabling these tests in the google branches.

Tested on x86_64.  Committed to google/integration.  Will merge into
the other google branches as well.


2011-03-30  Diego Novillo  

* g++.dg/guality/guality.exp: Disable.
* gcc.dg/guality/guality.exp: Disable.
* gfortran.dg/guality/guality.exp: Disable.

diff --git a/gcc/testsuite/g++.dg/guality/guality.exp 
b/gcc/testsuite/g++.dg/guality/guality.exp
index 9a17850..a07a628 100644
--- a/gcc/testsuite/g++.dg/guality/guality.exp
+++ b/gcc/testsuite/g++.dg/guality/guality.exp
@@ -1,5 +1,8 @@
 # This harness is for tests that should be run at all optimisation levels.
 
+# Disable everywhere.  These tests are very flaky.
+return
+
 load_lib g++-dg.exp
 load_lib gcc-gdb-test.exp
 
diff --git a/gcc/testsuite/gcc.dg/guality/guality.exp 
b/gcc/testsuite/gcc.dg/guality/guality.exp
index 49e2ac5..39bf4e3 100644
--- a/gcc/testsuite/gcc.dg/guality/guality.exp
+++ b/gcc/testsuite/gcc.dg/guality/guality.exp
@@ -1,5 +1,8 @@
 # This harness is for tests that should be run at all optimisation levels.
 
+# Disable everywhere.  These tests are very flaky.
+return
+
 load_lib gcc-dg.exp
 loa

[pph] Write language specific data (issue4331046)

2011-03-30 Thread Diego Novillo

This patch implements the writing of DECL_LANG_SPECIFIC fields.  It's
needed to write global_namespace.  The reading part is still
incomplete, but I wanted to flush this out before it got too big.

The main changes:

- We can now write out references to TEMPLATE_DECLs.  They are stored
  in the same index table as all other DECLs.  To implement this, I
  added a streamer hook that's called from lto_output_tree_ref.  When
  it does not recognize the decl, it calls the hook which tells it
  whether the decl can be added to the index table.

- We stream out everything in DECL_LANG_SPECIFIC, which in C++ is
  quite a bit.  This will cover global_namespace since it goes 
  through all the data stored for NAMESPACE_DECLs.

- Since DECL_LANG_SPECIFIC can contains tokens, we need to save token
  caches.  Since we are inside a streamer callback, we do not have a
  pointer to a PPH stream, so I've added a pretty hacky way of getting
  back to it.  When we initially open the PPH stream and associate
  with it an output_block object, I take an unused field from
  output_block and use it as my pointer back to the PPH stream.  This
  is awful, but it works for now.  I'll clean this up in a future
  patch.

The next patch will add the reader side, which will allow me to
reconstruct global_namespace.  This will uncover what other tree types
we need to handle.

With this I'm down to two failures in pph.exp.  No changes for
pth.exp.


ChangeLog.pph
2011-03-30  Diego Novillo  

* lto-streamer-in.c (lto_input_chain): Make extern.
* lto-streamer-out.c (lto_output_tree_ref): Call
streamer_hook->indexable_with_decls_p, if it exists.  If it
returns true, emit a reference to EXPR in the VAR_DECL index.
(lto_output_chain): Make extern.
(lto_output_tree_pointers): Move checks for invalid gimple trees ...
* lto-streamer.c (lto_is_streamable): ... here.
* lto-streamer.h (lto_streamer_hooks): Add
indexable_with_decls_p.
(lto_output_chain): Declare.
(lto_input_chain): Declare.


c-family/ChangeLog.pph
2011-03-30  Diego Novillo  

* c-family/c.opt (fpph-fmt): Remove.  Update all users.


cp/ChangeLog.pph
2011-03-30  Diego Novillo  

* cp-tree.h (struct language_function): Add prefix 'x_' to
fields returns_value, returns_null, returns_abnormally,
in_function_try_handler, in_base_initializer.  Update all
users.
* pph-streamer.c (pph_stream_write_ld_base): New.
(pph_stream_write_ld_min): New.
(pph_stream_write_tree_vec): New.
(pph_stream_write_cxx_binding_1): New.
(pph_stream_write_cxx_binding): New.
(pph_stream_write_class_binding): New.
(pph_stream_write_label_binding): New.
(pph_stream_write_binding_level): New.
(pph_stream_write_c_language_function): New.
(pph_stream_write_language_function): New.
(pph_stream_write_ld_fn): New.
(pph_stream_write_ld_ns): New.
(pph_stream_write_ld_parm): New.
(pph_stream_write_lang_specific_data): New.
(pph_indexable_with_decls_p): New.
(pph_stream_hooks_init): Initialize h->indexable_with_decls_p
with pph_indexable_with_decls_p.
(pph_stream_begin_section): Do not free BLOCK.
* pph-streamer.h (pth_save_token_cache): Declare.
(pph_get_ob_stream): New.
(pph_set_ob_stream): New.
* pph.c (pth_save_token_cache): New.
(pph_print_macro_defs_before): Remove.
(pph_print_macro_defs_after): Remove.
(pph_write_namespace): Remove.
(pph_write_format): Remove.
(pph_write_print): Remove.
(pph_write_dump): Remove.
(pph_write_symbol): Remove.
(declvisitor): Remove.
(pph_write_namespace_1): Remove.
(pph_write_namespace): Remove.
(pph_write_file_contents): Rename from pph_write_file_object.
Output global_namespace.
(pph_write_file): Call it.
(pph_write_file_summary): Remove.
(pph_read_file_contents): Rename from pph_file_read_object.
(pph_read_file): Rename from pph_file_read.

diff --git a/gcc/c-family/c.opt b/gcc/c-family/c.opt
index 54e7461..12d4f06 100644
--- a/gcc/c-family/c.opt
+++ b/gcc/c-family/c.opt
@@ -933,10 +933,6 @@ fpph-decls=
 C++ Joined RejectNegative UInteger Var(flag_pph_decls)
 -fpph-decls=N   Enable declaration identifier output at level N from PPH 
support
 
-fpph-fmt=
-C++ Joined RejectNegative UInteger Var(flag_pph_fmt)
--fpph-fmt=N   Output format is (0) normal (1) pretty summary (2) dump
-
 fpph-hdr=
 C++ ObjC++ Joined MissingArgError(missing filename after %qs)
 -fpph-hdr=   A mapping from .h to .pph
diff --git a/gcc/cp/cp-tree.h b/gcc/cp/cp-tree.h
index 6ef6e6e..aedcab2 100644
--- a/gcc/cp/cp-tree.h
+++ b/gcc/cp/cp-tree.h
@@ -1012,11 +1012,11 @@ struct GTY(()) language_function {
   tree x_vtt_parm;
   tree x_return_value;
 
-  BOOL_BITFIELD returns_value : 1;
-  BOOL_BITFI

Go testsuite patch committed: Fix env.go for other architectures

2011-03-30 Thread Ian Lance Taylor
I brought over this patch to the Go testsuite from the master
repository.  This avoids an explicit list of supported architectures, so
that we don't have to add each gcc architecture to the list.  Ran Go
testsuite on x86_64-unknown-linux-gnu.  Committed to mainline.

Ian

Index: gcc/testsuite/go.test/test/env.go
===
--- gcc/testsuite/go.test/test/env.go	(revision 171697)
+++ gcc/testsuite/go.test/test/env.go	(working copy)
@@ -6,7 +6,10 @@
 
 package main
 
-import os "os"
+import (
+	"os"
+	"runtime"
+)
 
 func main() {
 	ga, e0 := os.Getenverror("GOARCH")
@@ -14,8 +17,8 @@ func main() {
 		print("$GOARCH: ", e0.String(), "\n")
 		os.Exit(1)
 	}
-	if ga != "amd64" && ga != "386" && ga != "arm" {
-		print("$GOARCH=", ga, "\n")
+	if ga != runtime.GOARCH {
+		print("$GOARCH=", ga, "!= runtime.GOARCH=", runtime.GOARCH, "\n")
 		os.Exit(1)
 	}
 	xxx, e1 := os.Getenverror("DOES_NOT_EXIST")


Fix pass49-frag (mudflap support for varargs)

2011-03-30 Thread Michael Matz
Hi,

for some reasons the FAIL of pass49-frag now annoyed me enough to look 
into it, where it didn't do so for the last 50 years it's failing (give or 
take a few).  mudflap has a mean to mark some expressions as not 
interesting to generate checks for, which is used to disable this for the 
indirect refs that are generated for the varargs accesses (naturally these 
can't be checked because nobody registers these stack slots).  That is 
done via build_va_arg_indirect_ref calling mf_mark on the generated 
indirect_ref.

Now, our gimplifier changes this indirect_ref into a mem_ref, making the 
marking (via a hash-table marked_trees in tree-mudflap.c) obsolete.  
Instead of amending the gimplifier to copy the mark whenever changing a 
marked tree, I chose to instead simply generate a memref right from the 
start, which isn't gimplified further, hence the mark stays okay.

Yes, that's somewhat fragile, but so is the whole mudflap code generation.  
And in the abstract it seems better to generate more natural gimple code 
to begin with (of course the operand still is gimplified when necessary, 
but that doesn't destroy the mark as it's done in place).

Why pass49-frag also failed before mem-ref I don't know.  I don't want to 
know.  I'd speculate that it's similar reasons.  Or maybe it wasn't 
failing and I confuse it with some other pass4x-frag that was failing for 
years where nobody cared.

In any case, patch fixes this mudflap testcase and for x86_64-linux that 
was the last failing one.  Regstrapping on x86_64-linux in progress.  Okay 
for trunk?


Ciao,
Michael.
PS: Btw, that varargs at least for x86_64 didn't work with mudflap makes 
me somewhat suspicious about the existence of people using -fmudflap.  
Perhaps we should just remove the whole mudflap code for 4.7.
-- 
* builtins.c (build_va_arg_indirect_ref): Use
build_simple_mem_ref_loc.

Index: builtins.c
===
--- builtins.c  (revision 171675)
+++ builtins.c  (working copy)
@@ -4748,7 +4748,7 @@ std_gimplify_va_arg_expr (tree valist, t
 tree
 build_va_arg_indirect_ref (tree addr)
 {
-  addr = build_fold_indirect_ref_loc (EXPR_LOCATION (addr), addr);
+  addr = build_simple_mem_ref_loc (EXPR_LOCATION (addr), addr);
 
   if (flag_mudflap) /* Don't instrument va_arg INDIRECT_REF.  */
 mf_mark (addr);


Re: [libgo] Improve Solaris 2/SPARC support

2011-03-30 Thread Ian Lance Taylor
Rainer Orth  writes:

> 2011-03-24  Rainer Orth  
>
>   go:
>   * go.test/go-test.exp (go-set-goarch): Use sparc64 for 64-bit SPARC.

> diff -r de1b3baf021b gcc/testsuite/go.test/go-test.exp
> --- a/gcc/testsuite/go.test/go-test.exp   Thu Mar 24 13:19:30 2011 +0100
> +++ b/gcc/testsuite/go.test/go-test.exp   Thu Mar 24 13:22:43 2011 +0100
> @@ -129,7 +129,7 @@
>   if [check_effective_target_ilp32] {
>   set goarch "sparc"
>   } else {
> - set goarch "sparcv9"
> + set goarch "sparc64"
>   }
>   }
>   default {

Thanks, I just committed this patch to mainline.

I'm working on the other ones.

Ian


Re: [4.7] Make ARM -mhard-float and -msoft-float into proper -mfloat-abi= aliases

2011-03-30 Thread Richard Earnshaw

On 30 Mar 2011, at 21:11, "Joseph S. Myers"  wrote:

> On Wed, 2 Mar 2011, Richard Earnshaw wrote:
> 
>> Could you remove the documentation entries for the hard/soft-float
>> aliases please?  They're really only there for legacy reasons.
> 
> Is this the change you want here?
> 
> 2011-03-30  Joseph Myers  
> 
>* config/arm/arm.opt (mhard-float, msoft-float): Mark
>Undocumented.  Remove help text.
>* doc/invoke.texi (ARM Options): Don't document -msoft-float and
>-mhard-float.
> 

Yes, that's great.  Thanks.

R.




C++ PATCH for c++/48212 (ICE after error on invalid constant initializer)

2011-03-30 Thread Jason Merrill
non_const_var_error wasn't considering the possibility that the variable 
is unsuitable for a constant expression because its initializer is 
erroneous.


Tested x86_64-pc-linux-gnu, applied to trunk and 4.6.
commit 95a1f4128e0d9fb5480f0ae83240a6d339cf012e
Author: Jason Merrill 
Date:   Wed Mar 30 16:05:52 2011 -0400

PR c++/48212
* semantics.c (non_const_var_error): Just return if DECL_INITIAL
is error_mark_node.

diff --git a/gcc/cp/semantics.c b/gcc/cp/semantics.c
index 3300c3f..e444d91 100644
--- a/gcc/cp/semantics.c
+++ b/gcc/cp/semantics.c
@@ -6753,6 +6753,9 @@ non_const_var_error (tree r)
   tree type = TREE_TYPE (r);
   error ("the value of %qD is not usable in a constant "
 "expression", r);
+  /* Avoid error cascade.  */
+  if (DECL_INITIAL (r) == error_mark_node)
+return;
   if (DECL_DECLARED_CONSTEXPR_P (r))
 inform (DECL_SOURCE_LOCATION (r),
"%qD used in its own initializer", r);
diff --git a/gcc/testsuite/g++.dg/cpp0x/regress/error-recovery1.C 
b/gcc/testsuite/g++.dg/cpp0x/regress/error-recovery1.C
new file mode 100644
index 000..2094d3e
--- /dev/null
+++ b/gcc/testsuite/g++.dg/cpp0x/regress/error-recovery1.C
@@ -0,0 +1,9 @@
+// PR c++/48212
+// { dg-options -std=c++0x }
+
+template < bool > void
+foo ()
+{
+  const bool b =;  // { dg-error "" }
+  foo < b > ();// { dg-error "constant expression" }
+};


Re: [RFC PATCH, go]: Port to ALPHA arch - sysinfo.go fixup

2011-03-30 Thread Uros Bizjak
On Wed, Mar 30, 2011 at 9:58 PM, Uros Bizjak  wrote:
> Hello!
>
> Attached ports go to ALPHA architecture.
>
> There are however several problems with the build:
>
> a) Bootstrap compare failure in the gcc/go directory due to binutils
> bug [1], fixed in latest binutils SVN, use --disable-bootstrap
>
> b) alpha doesn't define "struct user_regs_struct" from which "type
> PtraceRegs" is derived. I have manually created PtraceRegs from
> pt_regs structure and patched generated libgo/sysinfo.go in build
> directory after the build broke. However - the comment from sys/user.h
> says that this file is for GDB and GDB only...
>
> c) some weird issue with double definition of "const _SOCK_NONBLOCK"
> in gen-sysinfo.go and consequently sysinfo.go.

d) The definition of "type Stat_t struct" also includes additional
struct and this confuses compilation.

Attached is a fixup of sysinfo.go that is neded for successful build.
Apply this fixup after build breaks and restart build.

Uros.


sysinfo.go.diff
Description: Binary data


Re: [4.7] Make ARM -mhard-float and -msoft-float into proper -mfloat-abi= aliases

2011-03-30 Thread Joseph S. Myers
On Wed, 2 Mar 2011, Richard Earnshaw wrote:

> Could you remove the documentation entries for the hard/soft-float
> aliases please?  They're really only there for legacy reasons.

Is this the change you want here?

2011-03-30  Joseph Myers  

* config/arm/arm.opt (mhard-float, msoft-float): Mark
Undocumented.  Remove help text.
* doc/invoke.texi (ARM Options): Don't document -msoft-float and
-mhard-float.

Index: doc/invoke.texi
===
--- doc/invoke.texi (revision 171745)
+++ doc/invoke.texi (working copy)
@@ -454,7 +454,7 @@
 -mapcs-reentrant  -mno-apcs-reentrant @gol
 -msched-prolog  -mno-sched-prolog @gol
 -mlittle-endian  -mbig-endian  -mwords-little-endian @gol
--mfloat-abi=@var{name}  -msoft-float  -mhard-float  -mfpe @gol
+-mfloat-abi=@var{name}  -mfpe @gol
 -mfp16-format=@var{name}
 -mthumb-interwork  -mno-thumb-interwork @gol
 -mcpu=@var{name}  -march=@var{name}  -mfpu=@var{name}  @gol
@@ -10079,14 +10079,6 @@
 compile your entire program with the same ABI, and link with a
 compatible set of libraries.
 
-@item -mhard-float
-@opindex mhard-float
-Equivalent to @option{-mfloat-abi=hard}.
-
-@item -msoft-float
-@opindex msoft-float
-Equivalent to @option{-mfloat-abi=soft}.
-
 @item -mlittle-endian
 @opindex mlittle-endian
 Generate code for a processor running in little-endian mode.  This is
Index: config/arm/arm.opt
===
--- config/arm/arm.opt  (revision 171745)
+++ config/arm/arm.opt  (working copy)
@@ -94,8 +94,7 @@
 Specify the name of the target floating point hardware/format
 
 mhard-float
-Target RejectNegative Alias(mfloat-abi=, hard)
-Alias for -mfloat-abi=hard
+Target RejectNegative Alias(mfloat-abi=, hard) Undocumented
 
 mlittle-endian
 Target Report RejectNegative InverseMask(BIG_END)
@@ -122,8 +121,7 @@
 Do not load the PIC register in function prologues
 
 msoft-float
-Target RejectNegative Alias(mfloat-abi=, soft)
-Alias for -mfloat-abi=soft
+Target RejectNegative Alias(mfloat-abi=, soft) Undocumented
 
 mstructure-size-boundary=
 Target RejectNegative Joined Var(structure_size_string)

-- 
Joseph S. Myers
jos...@codesourcery.com


[RFC PATCH, go]: Port to ALPHA arch

2011-03-30 Thread Uros Bizjak
Hello!

Attached ports go to ALPHA architecture.

There are however several problems with the build:

a) Bootstrap compare failure in the gcc/go directory due to binutils
bug [1], fixed in latest binutils SVN, use --disable-bootstrap

b) alpha doesn't define "struct user_regs_struct" from which "type
PtraceRegs" is derived. I have manually created PtraceRegs from
pt_regs structure and patched generated libgo/sysinfo.go in build
directory after the build broke. However - the comment from sys/user.h
says that this file is for GDB and GDB only...

c) some weird issue with double definition of "const _SOCK_NONBLOCK"
in gen-sysinfo.go and consequently sysinfo.go.

The test results from "make -k check" in libgo directory are quite encouraging:

PASS: asn1
PASS: big
PASS: bufio
PASS: bytes
FAIL: cmath
panic: runtime error: integer divide by zero or floating point error
../../../gcc-svn/trunk/libgo/testsuite/gotest: line 356:  1637 Aborted
   ./a.out "$@"
PASS: ebnf
PASS: exec
PASS: expvar
PASS: flag
FAIL: fmt
mallocs per Sprintf(""): 1
mallocs per Sprintf("xxx"): 1
mallocs per Sprintf("%x"): 3
mallocs per Sprintf("%x %x"): 4
panic: runtime error: integer divide by zero or floating point error [recovered]
panic: runtime error: integer divide by zero or floating point error
../../../gcc-svn/trunk/libgo/testsuite/gotest: line 356:  2309 Aborted
   ./a.out "$@"
PASS: gob
PASS: html
FAIL: http
../../../gcc-svn/trunk/libgo/testsuite/gotest: line 356:  3133 Segmentation faul
t  ./a.out "$@"
PASS: io
PASS: json
PASS: log
FAIL: math
panic: runtime error: integer divide by zero or floating point error
../../../gcc-svn/trunk/libgo/testsuite/gotest: line 356:  4321 Aborted
   ./a.out "$@"
PASS: mime
PASS: netchan
PASS: os
PASS: patch
PASS: path
PASS: rand
PASS: reflect
PASS: regexp
FAIL: rpc
2011/03/30 21:31:20 Test RPC server listening on 127.0.0.1:47781
2011/03/30 21:31:20 Test HTTP RPC server listening on 127.0.0.1:59606
2011/03/30 21:31:20 rpc.Serve: accept:accept tcp 127.0.0.1:47781: Resource tempo
rarily unavailable
FAIL: runtime
panic: runtime error: integer divide by zero or floating point error
../../../gcc-svn/trunk/libgo/testsuite/gotest: line 356:  5980 Aborted
   ./a.out "$@"
PASS: scanner
PASS: smtp
PASS: sort
FAIL: strconv
panic: runtime error: integer divide by zero or floating point error
../../../gcc-svn/trunk/libgo/testsuite/gotest: line 356:  6561 Aborted
   ./a.out "$@"
PASS: strings
PASS: sync
PASS: tabwriter
PASS: template
PASS: time
PASS: try
PASS: unicode
PASS: utf16
PASS: utf8
FAIL: websocket
2011/03/30 21:32:19 Test WebSocket server listening on 127.0.0.1:33252
--- FAIL: websocket.TestEcho (0.0 seconds)
dialing dial tcp 127.0.0.1:33252: Connection refused
--- FAIL: websocket.TestEchoDraft75 (0.0 seconds)
dialing dial tcp 127.0.0.1:33252: Connection refused
--- FAIL: websocket.TestWithQuery (0.0 seconds)
dialing dial tcp 127.0.0.1:33252: Connection refused
--- FAIL: websocket.TestWithProtocol (0.0 seconds)
dialing dial tcp 127.0.0.1:33252: Connection refused
--- FAIL: websocket.TestHTTP (0.0 seconds)
Get: error &http.URLError{Op:"Get", URL:"http://127.0.0.1:33252/echo";, E
rror:(*net.OpError)(0xf840026d00)}
--- FAIL: websocket.TestHTTPDraft75 (0.0 seconds)
Get: error &http.URLError{Op:"Get", URL:"http://127.0.0.1:33252/echoDraf
t75", Error:(*net.OpError)(0xf840026c40)}
panic: Dial failed: websocket.Dial ws://127.0.0.1:33252/echo: dial tcp 127.0.0.1
:33252: Connection refused
../../../gcc-svn/trunk/libgo/testsuite/gotest: line 356:  8493 Aborted
   ./a.out "$@"
PASS: xml
PASS: archive/tar
PASS: archive/zip
PASS: compress/bzip2
FAIL: compress/flate
../../../gcc-svn/trunk/libgo/testsuite/gotest: line 356:  9194 Segmentation faul
t  ./a.out "$@"
PASS: compress/gzip
PASS: compress/lzw
PASS: compress/zlib
PASS: container/heap
PASS: container/list
PASS: container/ring
PASS: container/vector
PASS: crypto/aes
PASS: crypto/block
PASS: crypto/blowfish
PASS: crypto/cipher
PASS: crypto/dsa
PASS: crypto/ecdsa
PASS: crypto/elliptic
PASS: crypto/hmac
PASS: crypto/md4
PASS: crypto/md5
PASS: crypto/ocsp
PASS: crypto/openpgp
PASS: crypto/rand
PASS: crypto/rc4
PASS: crypto/ripemd160
PASS: crypto/rsa
PASS: crypto/sha1
PASS: crypto/sha256
PASS: crypto/sha512
PASS: crypto/subtle
PASS: crypto/tls
PASS: crypto/twofish
PASS: crypto/x509
PASS: crypto/xtea
PASS: crypto/openpgp/armor
PASS: crypto/openpgp/packet
PASS: crypto/openpgp/s2k
PASS: debug/dwarf
PASS: debug/elf
PASS: debug/macho
PASS: debug/pe
PASS: encoding/ascii85
PASS: encoding/base32
PASS: encoding/base64
PASS: encoding/binary
PASS: encoding/git85
PASS: encoding/hex
PASS: encoding/line
PASS: encoding/pem
PASS: exp/datafmt
PASS: exp/draw
PASS: exp/eval
PASS: go/parser
PASS: go/printer
PASS: go/scanner
PASS: go/token
PASS: go/typechecker
PASS: hash/adler32
PASS: hash/crc32
PASS: hash/crc64
PASS: hash/fnv
PASS: http/cgi
PASS: image/png
PASS: index/suffixarray
PASS: io/ioutil
PASS: m

[PATCH] Fix VTA updating in the combiner (PR debug/48343)

2011-03-30 Thread Jakub Jelinek
Hi!

We ICE on the attached testcase, because combiner changes mode of
a pseudo (which was set just once and used once plus in debug insns)
from SImode to QImode, but the uses in debug insns aren't adjusted.

Combiner has code to adjust this too (propagate_for_debug), but only updates
debug insns between i2 and i3 (resp. i3 and undobuf.other_insn).
The problem on the testcase is that this is a retry, so first
try_combine with a later i3 calls propagate_for_debug and changes debug
insns before that later i3, then returns an earlier insn that should be
retried and we stop adjusting debug insns at that earlier i3.  Unfortunately
as later debug insns have been already updated earlier, they need to be
adjusted too.

The following patch fixes that by always stopping on the latest i3 that has
been successfully combined into in the current bb, bootstrapped/regtested
on x86_64-linux and i686-linux, ok for trunk/4.6?

2011-03-30  Jakub Jelinek  

PR debug/48343
* combine.c (last_combined_insn): New variable.
(combine_instructions): Clear it before processing each bb
and after last bb.
(try_combine): Set it to i3 if i3 is after it.
Use it as the last argument to propagate_for_debug instead
of i3.
(distribute_notes): Assert SET_INSN_DELETE is not called
on last_combined_insn.

* gcc.dg/torture/pr48343.c: New test.

--- gcc/combine.c.jj2011-03-23 09:34:40.0 +0100
+++ gcc/combine.c   2011-03-30 17:39:09.0 +0200
@@ -337,6 +337,9 @@ static enum machine_mode nonzero_bits_mo
 
 static int nonzero_sign_valid;
 
+/* Last insn on which try_combine has been called in the current BB.  */
+
+static rtx last_combined_insn;
 
 /* Record one modification to rtl structure
to be undone by storing old_contents into *where.  */
@@ -1168,6 +1171,7 @@ combine_instructions (rtx f, unsigned in
  || single_pred (this_basic_block) != last_bb)
label_tick_ebb_start = label_tick;
   last_bb = this_basic_block;
+  last_combined_insn = NULL_RTX;
 
   rtl_profile_for_bb (this_basic_block);
   for (insn = BB_HEAD (this_basic_block);
@@ -1379,6 +1383,7 @@ combine_instructions (rtx f, unsigned in
}
 }
 
+  last_combined_insn = NULL_RTX;
   default_rtl_profile ();
   clear_log_links ();
   clear_bb_flags ();
@@ -3830,6 +3835,10 @@ try_combine (rtx i3, rtx i2, rtx i1, rtx
   return 0;
 }
 
+  if (last_combined_insn == NULL_RTX
+  || DF_INSN_LUID (last_combined_insn) < DF_INSN_LUID (i3))
+last_combined_insn = i3;
+
   if (MAY_HAVE_DEBUG_INSNS)
 {
   struct undo *undo;
@@ -3851,7 +3860,7 @@ try_combine (rtx i3, rtx i2, rtx i1, rtx
   i2src while its original mode is temporarily
   restored, and then clear i2scratch so that we don't
   do it again later.  */
-   propagate_for_debug (i2, i3, reg, i2src);
+   propagate_for_debug (i2, last_combined_insn, reg, i2src);
i2scratch = false;
/* Put back the new mode.  */
adjust_reg_mode (reg, new_mode);
@@ -3859,18 +3868,17 @@ try_combine (rtx i3, rtx i2, rtx i1, rtx
else
  {
rtx tempreg = gen_raw_REG (old_mode, REGNO (reg));
-   rtx first, last;
+   rtx first;
 
if (reg == i2dest)
- {
-   first = i2;
-   last = i3;
- }
+ first = i2;
else
  {
first = i3;
-   last = undobuf.other_insn;
-   gcc_assert (last);
+   gcc_assert (undobuf.other_insn);
+   if (DF_INSN_LUID (undobuf.other_insn)
+   > DF_INSN_LUID (last_combined_insn))
+ last_combined_insn = undobuf.other_insn;
  }
 
/* We're dealing with a reg that changed mode but not
@@ -3882,9 +3890,9 @@ try_combine (rtx i3, rtx i2, rtx i1, rtx
   with this copy we have created; then, replace the
   copy with the SUBREG of the original shared reg,
   once again changed to the new mode.  */
-   propagate_for_debug (first, last, reg, tempreg);
+   propagate_for_debug (first, last_combined_insn, reg, tempreg);
adjust_reg_mode (reg, new_mode);
-   propagate_for_debug (first, last, tempreg,
+   propagate_for_debug (first, last_combined_insn, tempreg,
 lowpart_subreg (old_mode, reg, new_mode));
  }
  }
@@ -4099,14 +4107,14 @@ try_combine (rtx i3, rtx i2, rtx i1, rtx
 if (newi2pat)
   {
if (MAY_HAVE_DEBUG_INSNS && i2scratch)
- propagate_for_debug (i2, i3, i2dest, i2src);
+ propagate_for_debug (i2, last_combined_insn, i2dest, i2src);
INSN_CODE

C++ PATCH for c++/48369 (ICE with isnan)

2011-03-30 Thread Jason Merrill

A couple of missed cases.

Tested x86_64-pc-linux-gnu, applying to trunk and 4.6.
commit cea17025eb232f3931ce34e16dedff7fd42d2478
Author: Jason Merrill 
Date:   Wed Mar 30 15:17:27 2011 -0400

PR c++/48369
* semantics.c (potential_constant_expression_1): Handle
UNORDERED_EXPR and ORDERED_EXPR.

diff --git a/gcc/cp/semantics.c b/gcc/cp/semantics.c
index b88e190..48dd4ee 100644
--- a/gcc/cp/semantics.c
+++ b/gcc/cp/semantics.c
@@ -7741,6 +7741,8 @@ potential_constant_expression_1 (tree t, bool want_rval, 
tsubst_flags_t flags)
 case BIT_XOR_EXPR:
 case BIT_AND_EXPR:
 case TRUTH_XOR_EXPR:
+case UNORDERED_EXPR:
+case ORDERED_EXPR:
 case UNLT_EXPR:
 case UNLE_EXPR:
 case UNGT_EXPR:
diff --git a/gcc/testsuite/g++.dg/cpp0x/regress/isnan.C 
b/gcc/testsuite/g++.dg/cpp0x/regress/isnan.C
new file mode 100644
index 000..40d07e5
--- /dev/null
+++ b/gcc/testsuite/g++.dg/cpp0x/regress/isnan.C
@@ -0,0 +1,9 @@
+// PR c++/48369
+// { dg-options -std=gnu++0x }
+
+extern "C" int isnan (double);
+
+void f(double d)
+{
+bool b = isnan(d);
+}


Re: Move reg_equiv* arrays into a single VEC structure

2011-03-30 Thread H.J. Lu
On Wed, Mar 30, 2011 at 7:23 AM, Jeff Law  wrote:
> -BEGIN PGP SIGNED MESSAGE-
> Hash: SHA1
>
> As originally discussed here:
>
> http://gcc.gnu.org/ml/gcc-patches/2010-05/msg00987.html
>
> Changes since the original submission:
>
> Per Richi's suggestion I changed the patch to use a GC'd VEC.  I defined
> accessor macros so that a ton of reformatting isn't needed and finally I
> made the appropriate tweaks to the few backends that peeked at the
> reg_equiv arrays.
>
> Given the changes and the length of time since the original submission,
> I think it's probably best to get a fresh approval rather than rely on
> the prior approval.
>
> Bootstrapped and regression tested on x86_64-unknown-linux-gnu.
>

It breaks GCC bootstrap on Linux/x86:

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=48371

-- 
H.J.


Merge mainline to gccgo branch again

2011-03-30 Thread Ian Lance Taylor
I merged mainline revision 171737 onto the gccgo branch.

Ian


[PATCH] PR debug/47471 (set prologue_end in .debug_line)

2011-03-30 Thread Dodji Seketeli
Hello,

This is about the line program emitted by the compiler into the
.debug_line section, without optimization.  In the example
accompanying the patch below, at the beginning of the function f, the
compiler emits two .loc asm directives that are identical.  The first
one is right before the .cfi_startproc that starts the prologue.  The
second one is before the instructions that copy the variable length
parameters into f's frame.  Both directives do locate instructions
that are in the prologue.

Unfortunately, GDB uses an heuristic that basically considers that the
first opcode of the line program that increments the line register
(even with an increment of zero) marks the end of the prologue.

Effectively, setting a breakpoint to the beginning of f (e.g, "break
f") won't work anymore when we emit this because GDB would then try to
set the breakpoint inside the prologue.

This patch does two things.

First, it avoids emitting two consecutive .loc that are identical.
Strictly speaking that should fix this issue in this particular case.

Second, it emits a '.loc   0 prologue_end' directive
on the first instruction that doesn't belong to the prologue (i.e,
when the end_prologue debug hook is called).  This sets the
prologue_end register in the .debug_line state machine to true.  After
discussing this with Jan (in CC), it appears that setting the
prologue_end register to true will help GDB to drop the somewhat
non-reliable heuristic it uses to detect the end of the prologue.

I have noticed that the end_prologue debug hook was being called, but
the dwarf back-end hasn't implemented it (on non-vms platforms).  Is
there a particular reason for not implementing this?  Also, I have
noticed that the patch causes the prologue_end directive to be emitted
even when we compile with optimizations.  I am not sure how right that
would be.

Tested on x86_64-unknown-linux-gnu against trunk.

-- 
Dodji

>From dc3dea429f1471540867a0b7e694a60494062ac0 Mon Sep 17 00:00:00 2001
From: Dodji Seketeli 
Date: Tue, 29 Mar 2011 16:56:20 +0200
Subject: [PATCH] PR debug/47471 (set prologue_end in .debug_line)

gcc/
* dwarf2out.c (output_source_line_asm_info):  Split out of
dwarf2out_source_line.  Add a new is_prologue_end parameter.
Avoid emitting redundant consecutive .loc asm directives.
(dwarf2out_source_line):  Use output_source_line_asm.
(dwarf2out_end_prologue): New function.
(dwarf2_debug_hooks->end_prologue): Set to dwarf2out_end_prologue.

gcc/testsuite/
* gcc.dg/debug/dwarf2/line-prog-1.c: New test.
---
 gcc/dwarf2out.c |   95 +--
 1 files changed, 78 insertions(+), 17 deletions(-)

diff --git a/gcc/dwarf2out.c b/gcc/dwarf2out.c
index efd30ea..6f8285c 100644
--- a/gcc/dwarf2out.c
+++ b/gcc/dwarf2out.c
@@ -94,6 +94,8 @@ along with GCC; see the file COPYING3.  If not see
 #include "tree-flow.h"
 #include "cfglayout.h"
 
+static void output_source_line_asm_info (unsigned int, const char *,
+int, bool, bool);
 static void dwarf2out_source_line (unsigned int, const char *, int, bool);
 static rtx last_var_location_insn;
 
@@ -465,6 +467,7 @@ static void output_call_frame_info (int);
 static void dwarf2out_note_section_used (void);
 static bool clobbers_queued_reg_save (const_rtx);
 static void dwarf2out_frame_debug_expr (rtx, const char *);
+static void dwarf2out_end_prologue (unsigned int, const char*);
 
 /* Support for complex CFA locations.  */
 static void output_cfa_loc (dw_cfi_ref, int);
@@ -4125,6 +4128,21 @@ dwarf2out_begin_prologue (unsigned int line 
ATTRIBUTE_UNUSED,
 }
 }
 
+/* Implementation of the gcc_debug_hooks::end_prologue hook.
+
+   If the underlying assembler supports it, emit a .loc asm directive
+   with a 'end_prologue' argument, effectively marking the point where
+   debugger should set a breakpoint when requested to set one on a
+   function.  */
+
+static void
+dwarf2out_end_prologue (unsigned int line, const char *filename)
+{
+  output_source_line_asm_info (line, filename, 0,
+  /*is_stmt=*/true,
+  /*is_prologue_end*/true);
+}
+
 /* Output a marker (i.e. a label) for the end of the generated code
for a function prologue.  This gets called *after* the prologue code has
been generated.  */
@@ -5767,7 +5785,7 @@ const struct gcc_debug_hooks dwarf2_debug_hooks =
   dwarf2out_vms_end_prologue,
   dwarf2out_vms_begin_epilogue,
 #else
-  debug_nothing_int_charstar,
+  dwarf2out_end_prologue,
   debug_nothing_int_charstar,
 #endif
   dwarf2out_end_epilogue,
@@ -22129,14 +22147,22 @@ dwarf2out_begin_function (tree fun)
 
 /* Output a label to mark the beginning of a source code line entry
and record information relating to this source line, in
-   'line_info_table' for later output of the .debug_line section.  */
+   'line_info_table' for later output of the .debug_line secti

RFC: PATCH: Remove 26 element limit in vector

2011-03-30 Thread H.J. Lu
On Wed, Mar 30, 2011 at 08:02:38AM -0700, H.J. Lu wrote:
> Hi,
> 
> Currently, we limit XVECEXP to 26 elements in machine description
> since we use letters 'a' to 'z' to encode them.  I don't see any
> reason why we can't go beyond 'z'.  This patch removes this restriction.
> Any comments?
> 

That was wrong.  The problem is in vector elements.  This patch passes
bootstrap.  Any comments?

Thanks.


H.J.
---
2011-03-30  H.J. Lu  

* genrecog.c (VECTOR_ELEMENT_BASE): New.
(add_to_sequence): Add assert.  Use VECTOR_ELEMENT_BASE to
encode vector elements.
(change_state): Check and support VECTOR_ELEMENT_BASE.
(make_insn_sequence): Add assert.

diff --git a/gcc/genrecog.c b/gcc/genrecog.c
index 74dd0a7..40e9c4d 100644
--- a/gcc/genrecog.c
+++ b/gcc/genrecog.c
@@ -465,6 +465,9 @@ extern void debug_decision
   (struct decision *);
 extern void debug_decision_list
   (struct decision *);
+
+/* The base of vector element.  */
+#define VECTOR_ELEMENT_BASE 0x80
 
 /* Create a new node in sequence after LAST.  */
 
@@ -912,6 +915,7 @@ add_to_sequence (rtx pattern, struct decision_head *last, 
const char *position,
  /* Which insn we're looking at is represented by A-Z. We don't
 ever use 'A', however; it is always implied.  */
 
+ gcc_assert (i < 26);
  subpos[depth] = (i > 0 ? 'A' + i : 0);
  sub = add_to_sequence (XVECEXP (pattern, 0, i),
 last, subpos, insn_type, 0);
@@ -1002,6 +1006,9 @@ add_to_sequence (rtx pattern, struct decision_head *last, 
const char *position,
char base = (was_code == MATCH_OPERATOR ? '0' : 'a');
for (i = 0; i < (size_t) XVECLEN (pattern, 2); i++)
  {
+   gcc_assert (was_code == MATCH_OPERATOR
+   ? ISDIGIT (i + base)
+   : ISLOWER (i + base));
subpos[depth] = i + base;
sub = add_to_sequence (XVECEXP (pattern, 2, i),
   &sub->success, subpos, insn_type, 0);
@@ -1102,7 +1109,9 @@ add_to_sequence (rtx pattern, struct decision_head *last, 
const char *position,
int j;
for (j = 0; j < XVECLEN (pattern, i); j++)
  {
-   subpos[depth] = 'a' + j;
+   int val = j + VECTOR_ELEMENT_BASE;
+   gcc_assert (val <= UCHAR_MAX);
+   subpos[depth] = val;
sub = add_to_sequence (XVECEXP (pattern, i, j),
   &sub->success, subpos, insn_type, 0);
  }
@@ -1779,6 +1788,10 @@ change_state (const char *oldpos, const char *newpos, 
const char *indent)
   else if (ISLOWER (newpos[depth]))
printf ("%sx%d = XVECEXP (x%d, 0, %d);\n",
indent, depth + 1, depth, newpos[depth] - 'a');
+  else if (((unsigned char) newpos[depth]) >= VECTOR_ELEMENT_BASE)
+   printf ("%sx%d = XVECEXP (x%d, 0, %d);\n",
+   indent, depth + 1, depth,
+   ((unsigned char) newpos[depth]) - VECTOR_ELEMENT_BASE);
   else
printf ("%sx%d = XEXP (x%d, %c);\n",
indent, depth + 1, depth, newpos[depth]);
@@ -2528,6 +2541,7 @@ make_insn_sequence (rtx insn, enum routine_type type)
}
   XVECLEN (x, 0) = j;
 
+  gcc_assert ((j - 1) < 26);
   c_test_pos[0] = 'A' + j - 1;
   c_test_pos[1] = '\0';
 }


C++ PATCH for c++/48281 (ICE with nested initializer_list)

2011-03-30 Thread Jason Merrill
The testcase in 48281 started breaking in 4.6.0 because of my change to 
finish_compound_literal to stop putting constant compound literals into 
static variables, because it was interfering with constexpr.


First I noticed that the crash was due to non-constant CONSTRUCTORs with 
TREE_CONSTANT set.  So I fixed that, in constructor-const.patch.


That change fixed the crash, but then we were just optimizing away the 
entire contents of the lists due to a bug I just filed as PR 48370.


Fixing the bug in general will mean fixing 48370, but we can fix the 
regression by just reverting the change above; I dealt with a similar 
issue elsewhere by setting DECL_DECLARED_CONSTEXPR_P on the temporary 
variable so that constexpr evaluation can look at its initializer, so we 
can use the same approach here.  I'm only doing this for arrays now so 
that we can continue to elide copies of class temporaries.


While I was looking at this stuff, I decided to go ahead and initialize 
the initializer_list objects directly rather than mess with calling the 
constructor.


Tested x86_64-pc-linux-gnu, applying to trunk (all three patches) and 
4.6 (only the first patch).
commit 15ce1df0c9f6259babc885e8947719a586a47103
Author: Jason Merrill 
Date:   Wed Mar 30 11:57:50 2011 -0400

PR c++/48281
* semantics.c (finish_compound_literal): Do put static/constant
arrays in static variables.

diff --git a/gcc/cp/semantics.c b/gcc/cp/semantics.c
index 9926d26..b88e190 100644
--- a/gcc/cp/semantics.c
+++ b/gcc/cp/semantics.c
@@ -2329,7 +2329,34 @@ finish_compound_literal (tree type, tree 
compound_literal)
   if (TREE_CODE (type) == ARRAY_TYPE)
 cp_complete_array_type (&type, compound_literal, false);
   compound_literal = digest_init (type, compound_literal);
-  return get_target_expr (compound_literal);
+  /* Put static/constant array temporaries in static variables, but always
+ represent class temporaries with TARGET_EXPR so we elide copies.  */
+  if ((!at_function_scope_p () || CP_TYPE_CONST_P (type))
+  && TREE_CODE (type) == ARRAY_TYPE
+  && initializer_constant_valid_p (compound_literal, type))
+{
+  tree decl = create_temporary_var (type);
+  DECL_INITIAL (decl) = compound_literal;
+  TREE_STATIC (decl) = 1;
+  if (literal_type_p (type) && CP_TYPE_CONST_NON_VOLATILE_P (type))
+   {
+ /* 5.19 says that a constant expression can include an
+lvalue-rvalue conversion applied to "a glvalue of literal type
+that refers to a non-volatile temporary object initialized
+with a constant expression".  Rather than try to communicate
+that this VAR_DECL is a temporary, just mark it constexpr.  */
+ DECL_DECLARED_CONSTEXPR_P (decl) = true;
+ DECL_INITIALIZED_BY_CONSTANT_EXPRESSION_P (decl) = true;
+ TREE_CONSTANT (decl) = true;
+   }
+  cp_apply_type_quals_to_decl (cp_type_quals (type), decl);
+  decl = pushdecl_top_level (decl);
+  DECL_NAME (decl) = make_anon_name ();
+  SET_DECL_ASSEMBLER_NAME (decl, DECL_NAME (decl));
+  return decl;
+}
+  else
+return get_target_expr (compound_literal);
 }
 
 /* Return the declaration for the function-name variable indicated by
diff --git a/gcc/testsuite/g++.dg/cpp0x/initlist46.C 
b/gcc/testsuite/g++.dg/cpp0x/initlist46.C
new file mode 100644
index 000..2b9f07d
--- /dev/null
+++ b/gcc/testsuite/g++.dg/cpp0x/initlist46.C
@@ -0,0 +1,14 @@
+// PR c++/48281
+// { dg-options "-std=c++0x -O2" }
+// { dg-do run }
+
+#include 
+
+typedef std::initializer_list  int1;
+typedef std::initializer_list int2;
+static int2 ib = {{42,2,3,4,5},{2,3,4,5,1},{3,4,5,2,1}};
+
+int main()
+{
+  return *(ib.begin()->begin()) != 42;
+}
commit 4069e2ded787088dcf1d4caf1aeccd26e00524c0
Author: Jason Merrill 
Date:   Wed Mar 30 10:34:09 2011 -0400

* call.c (convert_like_real) [ck_list]: Build up the
initializer_list object directly.
* decl.c (build_init_list_var_init): Adjust.

diff --git a/gcc/cp/call.c b/gcc/cp/call.c
index f7d108f..ad2de43 100644
--- a/gcc/cp/call.c
+++ b/gcc/cp/call.c
@@ -5467,8 +5467,8 @@ convert_like_real (conversion *convs, tree expr, tree fn, 
int argnum,
tree elttype = TREE_VEC_ELT (CLASSTYPE_TI_ARGS (totype), 0);
tree new_ctor = build_constructor (init_list_type_node, NULL);
unsigned len = CONSTRUCTOR_NELTS (expr);
-   tree array, val;
-   VEC(tree,gc) *parms;
+   tree array, val, field;
+   VEC(constructor_elt,gc) *vec = NULL;
unsigned ix;
 
/* Convert all the elements.  */
@@ -5490,16 +5490,14 @@ convert_like_real (conversion *convs, tree expr, tree 
fn, int argnum,
array = build_array_of_n_type (elttype, len);
array = finish_compound_literal (array, new_ctor);
 
-   parms = make_tree_vector ();
-   VEC_safe_push (tree, gc, parms, decay_conversion (array));
-   VEC_safe_push (tree, gc, parms, size_int (len));
-   /*

libgo patch committed: Add missing Makefile dependencies

2011-03-30 Thread Ian Lance Taylor
I committed this patch to libgo to add some missing Makefile
dependencies.  Bootstrapped on x86_64-unknown-linux-gnu.  Committed to
mainline.

Ian

diff -r 58ba48318471 libgo/Makefile.am
--- a/libgo/Makefile.am	Wed Mar 30 08:17:36 2011 -0700
+++ b/libgo/Makefile.am	Wed Mar 30 10:35:55 2011 -0700
@@ -1705,8 +1705,9 @@
 	$(CHECK)
 .PHONY: mime/check
 
-net/net.lo: $(go_net_files) bytes.gox fmt.gox io.gox os.gox reflect.gox \
-		strconv.gox strings.gox sync.gox syscall.gox
+net/net.lo: $(go_net_files) bytes.gox fmt.gox io.gox os.gox rand.gox \
+		reflect.gox strconv.gox strings.gox sync.gox syscall.gox \
+		time.gox
 	$(BUILDPACKAGE)
 net/check: $(CHECK_DEPS)
 	$(CHECK)


Re: Fix realloc_on_assign_2.f03, random segfaults/ICEs

2011-03-30 Thread Tobias Burnus

On 03/30/2011 07:33 PM, Michael Matz wrote:

Hi,

On Wed, 30 Mar 2011, Tobias Burnus wrote:


On 03/30/2011 06:21 PM, Michael Matz wrote:

Okay for trunk?  (regstrapping on x86_64-linux in progress)

OK. Thanks for tracing the bug. I think the issue could be the reason for the
elusive PR 47516 - thus, you might consider adding that PR to the ChangeLog
and close the PR afterwards.

(r171736) Hmm, it seems to also be a problem for 4.6.0.  Should I merge it
to the 4.6 branch too?


That's fine with me - the fix is really simple (after one found the 
cause). Fortunately, the bug only rarely surfaces.


Tobias


Re: Fix realloc_on_assign_2.f03, random segfaults/ICEs

2011-03-30 Thread Michael Matz
Hi,

On Wed, 30 Mar 2011, Tobias Burnus wrote:

> On 03/30/2011 06:21 PM, Michael Matz wrote:
> > Okay for trunk?  (regstrapping on x86_64-linux in progress)
> 
> OK. Thanks for tracing the bug. I think the issue could be the reason for the
> elusive PR 47516 - thus, you might consider adding that PR to the ChangeLog
> and close the PR afterwards.

(r171736) Hmm, it seems to also be a problem for 4.6.0.  Should I merge it 
to the 4.6 branch too?


Ciao,
Michael.


Re: [debug] Introduce -fno-debug-types-section flag

2011-03-30 Thread Richard Henderson
On 03/29/2011 03:20 AM, Mark Wielaard wrote:
> 2011-03-29  Mark Wielaard  
> 
> * common.opt (fdebug-types-section): New flag.
> * doc/invoke.texi: Document new -fno-debug-types-section flag.
> * dwarf2out.c (use_debug_types): New define.
> (struct die_struct): Mark die_id with GTY desc use_debug_types.
> (print_die): Guard output of type unit signatures using
> use_debug_types.
> (build_abbrev_table): Replace assert of dwarf_version >= 4
> with assert on use_debug_types.
> (size_of_die): Likewise.
> (unmark_dies): Likewise.
> (value_format): Decide AT_ref_external form on use_debug_types.
> (output_die): Replace dwarf_version version check guard with
> use_debug_types where appropriate.
> (modified_type_die): Likewise.
> (gen_reference_type_die): Likewise.
> (dwarf2out_start_source_file): Likewise.
> (dwarf2out_end_source_file): Likewise.
> (prune_unused_types_walk_attribs): Likewise.
> (dwarf2out_finish): Likewise.
> 

Ok.


r~


Re: [Patch] improve bfin conditional move support

2011-03-30 Thread Richard Henderson
On 03/29/2011 05:57 AM, Henderson, Stuart wrote:
> -  operands[1] = bfin_gen_compare (operands[1], SImode);
> +  operands[1] = bfin_gen_compare (operands[1], GET_MODE (operands[0]));

FWIW, you can use "mode" to get the proper value
without having to read it from the operand.


r~


Re: [Patch] Bfin: Ensure rotrsi and rotlsi don't accept non-const INTVALS

2011-03-30 Thread Richard Henderson
On 03/29/2011 08:49 AM, Henderson, Stuart wrote:
>(match_operand:SI 2 "immediate_operand" "")))]
>""
>  {
> -  if (INTVAL (operands[2]) != 16)
> +  if (GET_CODE (operands[2]) != CONST_INT || INTVAL (operands[2]) != 16)
>  FAIL;

Perhaps use const_int_operand instead of immediate_operand.


r~


Re: Fix realloc_on_assign_2.f03, random segfaults/ICEs

2011-03-30 Thread Tobias Burnus

On 03/30/2011 06:21 PM, Michael Matz wrote:

Okay for trunk?  (regstrapping on x86_64-linux in progress)


OK. Thanks for tracing the bug. I think the issue could be the reason 
for the elusive PR 47516 - thus, you might consider adding that PR to 
the ChangeLog and close the PR afterwards.


Tobias


Re: Move reg_equiv* arrays into a single VEC structure

2011-03-30 Thread Richard Sandiford
Nice cleanup thanks.  Just noticed a couple of things:

Jeff Law  writes:
> *** struct reload
> *** 100,106 
> int inc;
> /* A reg for which reload_in is the equivalent.
>If reload_in is a symbol_ref which came from
> !  reg_equiv_constant, then this is the pseudo
>which has that symbol_ref as equivalent.  */
> rtx in_reg;
> rtx out_reg;
> --- 100,106 
> int inc;
> /* A reg for which reload_in is the equivalent.
>If reload_in is a symbol_ref which came from
> !  reg_equiv_consant, then this is the pseudo
>which has that symbol_ref as equivalent.  */
> rtx in_reg;
> rtx out_reg;

Adds typo.

> *** elimination_effects (rtx x, enum machine
> *** 3002,3011 
> }
>   
>   }
> !   else if (reg_renumber[regno] < 0 && reg_equiv_constant
> !&& reg_equiv_constant[regno]
> !&& ! function_invariant_p (reg_equiv_constant[regno]))
> ! elimination_effects (reg_equiv_constant[regno], mem_mode);
> return;
>   
>   case PRE_INC:
> --- 2996,3006 
> }
>   
>   }
> !   else if (reg_renumber[regno] < 0
> !&& reg_equiv_constant (0)
> !&& reg_equiv_constant (regno)
> !&& ! function_invariant_p (reg_equiv_constant (regno)))
> ! elimination_effects (reg_equiv_constant (regno), mem_mode);
> return;
>   
>   case PRE_INC:

Looks like this should be s/reg_equiv_constant (0)/reg_equivs != 0/.

Richard


Re: RFA: patch to solve IRA PR48336, PR48342, PR48345

2011-03-30 Thread Vladimir Makarov

On 03/30/2011 06:19 AM, Richard Sandiford wrote:

Hi Vlad,

First, I want to echo H-P's thanks for tackling this area.  I just wondered:

Vladimir Makarov  writes:

The following patch is to solve PR48336, PR48342, PR48345.  The
profitable hard regs exclude hard regs which are prohibited for the
corresponding allocno mode. It is true for primary allocation and it is
important for better colorability criteria.  Function assign_hard_reg is
also based on this assumption.  Unfortunately, it is not true for
secondary allocation (after IRA IR flattening or during reload).  The
following patch solves this problem.

The patch should be very safe but I am still testing it on x86/x86-64
bootstrap.

[...]

Index: ira-color.c
===
--- ira-color.c (revision 171699)
+++ ira-color.c (working copy)
@@ -1447,7 +1447,9 @@ update_conflict_hard_regno_costs (int *c
  }

  /* Set up conflicting and profitable regs (through CONFLICT_REGS and
-   PROFITABLE_REGS) for each object of allocno A.  */
+   PROFITABLE_REGS) for each object of allocno A.  Remember that the
+   profitable regs exclude hard regs which can not hold value of mode
+   of allocno A.  */
  static inline void
  setup_conflict_profitable_regs (ira_allocno_t a, bool retry_p,
HARD_REG_SET *conflict_regs,
@@ -1463,8 +1465,13 @@ setup_conflict_profitable_regs (ira_allo
COPY_HARD_REG_SET (conflict_regs[i],
 OBJECT_TOTAL_CONFLICT_HARD_REGS (obj));
if (retry_p)
-   COPY_HARD_REG_SET (profitable_regs[i],
-  reg_class_contents[ALLOCNO_CLASS (a)]);
+   {
+ COPY_HARD_REG_SET (profitable_regs[i],
+reg_class_contents[ALLOCNO_CLASS (a)]);
+ AND_COMPL_HARD_REG_SET (profitable_regs[i],
+ ira_prohibited_class_mode_regs
+ [ALLOCNO_CLASS (a)][ALLOCNO_MODE (a)]);
+   }
else
COPY_HARD_REG_SET (profitable_regs[i],
   OBJECT_COLOR_DATA (obj)->profitable_hard_regs);

ira_prohibited_class_mode_regs is partly based on HARD_REGNO_MODE_OK,
which is really a property of the first register in a multi-register
group (rather than of every register in that group).  So is
ira_prohibited_class_mode_regs defined in the same way?
At the moment, check_hard_reg_p and setup_allocno_available_regs_num
test profitability for every register in the group:

/* Checking only profitable hard regs.  */
if (TEST_HARD_REG_BIT (conflict_regs[k], hard_regno + j)
|| ! TEST_HARD_REG_BIT (profitable_regs[k], hard_regno + j))
  break;
[...]
   for (j = 0; j<  nregs; j++)
{
[...]
  /* Checking only profitable hard regs.  */
  if (TEST_HARD_REG_BIT (OBJECT_TOTAL_CONFLICT_HARD_REGS (obj),
 hard_regno + j)
  || ! TEST_HARD_REG_BIT (obj_data->profitable_hard_regs,
  hard_regno + j))
break;

So if you have a target in which double-word values have to start in
even registers, I think every odd-numbered bit of profitable_hard_regs
will be clear, and no register will seem profitable.  (I'm seeing this
on ARM with some VFP tests.)

Restricting the test to the first register fixes things for me,
but do we need to check something else for the j != 0 case?


Richard, thanks very much for pointing this out.  I am agree with you.  
I think your patch is ok.  Although I need some time to check it.  Right 
now I am overwhelmed by # of other bug reports.


Another thing bothering me is a new colorability test for pseudos which 
should start on even or odd hard register.  I have suspicion that it is 
not ok for now.  I need some time to check and think about it.


I hope that I finish my urgent work on bug reports this week and start 
to work on your patch and the another issue on next week.



gcc/
* ira-color.c (check_hard_reg_p): Restrict the profitability check
to the first register.
(setup_allocno_available_regs_num): Likewise.

Index: gcc/ira-color.c
===
--- gcc/ira-color.c (revision 171653)
+++ gcc/ira-color.c (working copy)
@@ -1497,7 +1504,8 @@ check_hard_reg_p (ira_allocno_t a, int h
for (k = set_to_test_start; k<  set_to_test_end; k++)
/* Checking only profitable hard regs.  */
if (TEST_HARD_REG_BIT (conflict_regs[k], hard_regno + j)
-   || ! TEST_HARD_REG_BIT (profitable_regs[k], hard_regno + j))
+   || (j == 0
+   &&  ! TEST_HARD_REG_BIT (profitable_regs[k], hard_regno)))
  break;
if (k != set_to_test_end)
break;
@@ -2226,8 +2234,9 @@ setup_allocno_available_regs_num (ira_al
  /* Checking only profitable hard regs.  */
  if (TEST_HAR

Re: ira-improv patch has been committed

2011-03-30 Thread H.J. Lu
On Mon, Mar 28, 2011 at 6:11 PM, Vladimir Makarov  wrote:
> On 03/27/2011 07:25 PM, Vladimir Makarov wrote:
>>
>>  I submitted the following patch.  The patch contains original
>> patches from ira-improv branch, changes addressing all Keneth Zadeck's
>> comments in http://gcc.gnu.org/ml/gcc-patches/2010-12/msg00457.html,
>> and a minor change fixing few regression on gcc testsuite for x86/x86-64
>> and
>> ppc.
>>
>>  The patch was successfully tested and bootstrapped on x86/x86-64,
>> ppc64, and ia64.
>>
> Here is another try to commit the patch.  The previous bootstrap failure was
> because of wrong assertion which fails in extremely rare case (multi-object
> allocnos which got hard reg set nodes different from the allocno profitable
> hard regs and the corresponding node hard regs sets happened not be
> intersected).  The fix is very small and described by the first ChangeLog
> entry.
>
> The patch was bootstrapped on x86/x86-64, ppc64, ia64, and i686 with options
> of H.J.'s automatic tester.
>
> Committed as revision 171649.
>
> 2011-03-28  Vladimir Makarov 
>
>    * ira-color.c (update_left_conflict_sizes_p): Don't assume that
>    conflict object hard regset nodes have intersecting hard reg sets.
>
>    * regmove.c (regmove_optimize): Move ira_set_pseudo_classes call
>    after regstat_init_n_sets_and_refs.
>
>    * ira.c: Add more comments at the top.
>    (setup_stack_reg_pressure_class, setup_pressure_classes):
>    Add comments how we compute the register pressure classes.
>    (setup_allocno_and_important_classes): Add more comments.
>    (setup_class_translate_array, reorder_important_classes)
>    (setup_reg_class_relations): Add comments.
>
>    * ira-emit.c: Add 2011 to the Copyright line.  Add comments at the
>    start of the file.
>
>    * ira-color.c: Add 2011 to the Copyright line.
>    (assign_hard_reg):  Add more comments.
>    (improve_allocation): Ditto.
>
>    * ira-costs.c: Add 2011 to the Copyright line.
>    (setup_cost_classes, setup_regno_cost_classes_by_aclass): Add more
>    comments.
>    (setup_regno_cost_classes_by_mode): Ditto.
>
>    Initial patches from ira-improv branch:
>
>    2010-08-13  Vladimir Makarov 
>
>    * ira-build.c: (ira_create_object): Remove initialization of
>    OBJECT_PROFITABLE_HARD_REGS.  Initialize OBJECT_ADD_DATA.
>    (ira_create_allocno): Remove initialization of
>    ALLOCNO_MEM_OPTIMIZED_DEST, ALLOCNO_MEM_OPTIMIZED_DEST_P,
>    ALLOCNO_SOMEWHERE_RENAMED_P, ALLOCNO_CHILD_RENAMED_P,
>    ALLOCNO_IN_GRAPH_P, ALLOCNO_MAY_BE_SPILLED_P, ALLOCNO_COLORABLE_P,
>    ALLOCNO_NEXT_BUCKET_ALLOCNO, ALLOCNO_PREV_BUCKET_ALLOCNO,
>    ALLOCNO_FIRST_COALESCED_ALLOCNO, ALLOCNO_NEXT_COALESCED_ALLOCNO.
>    Initialize ALLOCNO_ADD_DATA.
>    (copy_info_to_removed_store_destinations): Use ALLOCNO_EMIT_DATA
>    and allocno_emit_reg instead of ALLOCNO_MEM_OPTIMIZED_DEST_P and
>    ALLOCNO_REG.
>    (ira_flattening): Ditto.  Use ALLOCNO_EMIT_DATA instead of
>    ALLOCNO_MEM_OPTIMIZED_DEST and ALLOCNO_SOMEWHERE_RENAMED_P.
>
>    * ira.c (ira_reallocate): Remove.
>    (setup_pressure_classes): Call
>    ira_init_register_move_cost_if_necessary.  Use
>    ira_register_move_cost instead of ira_get_register_move_cost.
>    (setup_allocno_assignment_flags): Use ALLOCNO_EMIT_DATA.
>    (ira): Call ira_initiate_emit_data and ira_finish_emit_data.
>
>    * ira-color.c: Use ALLOCNO_COLOR_DATA instead of
>    ALLOCNO_IN_GRAPH_P, ALLOCNO_MAY_BE_SPILLED_P, ALLOCNO_COLORABLE_P,
>    ALLOCNO_AVAILABLE_REGS_NUM, ALLOCNO_NEXT_BUCKET_ALLOCNO,
>    ALLOCNO_PREV_BUCKET_ALLOCNO. ALLOCNO_TEMP. Use OBJECT_COLOR_DATA
>    instead of OBJECT_PROFITABLE_HARD_REGS, OBJECT_HARD_REGS_NODE,
>    OBJECT_HARD_REGS_SUBNODES_START, OBJECT_HARD_REGS_SUBNODES_NUM.
>    Fix formatting.
>    (object_hard_regs_t, object_hard_regs_node_t): Move from
>    ira-int.h.
>    (struct object_hard_regs, struct object_hard_regs_node): Ditto.
>    (struct allocno_color_data): New.
>    (allocno_color_data_t): New typedef.
>    (allocno_color_data): New definition.
>    (ALLOCNO_COLOR_DATA): New macro.
>    (struct object_color_data): New.
>    (object_color_data_t): New typedef.
>    (object_color_data): New definition.
>    (OBJECT_COLOR_DATA): New macro.
>    (update_copy_costs, calculate_allocno_spill_cost): Call
>    ira_init_register_move_cost_if_necessary.  Use
>    ira_register_move_cost instead of ira_get_register_move_cost.
>    (move_spill_restore, update_curr_costs): Ditto.
>    (allocno_spill_priority): Make it inline.
>    (color_pass): Allocate and free allocno_color_dat and
>    object_color_data.
>    (struct coalesce_data, coalesce_data_t): New.
>    (allocno_coalesce_data): New definition.
>    (ALLOCNO_COALESCE_DATA): New macro.
>    (merge_allocnos, coalesced_allocno_conflict_p): Use
>    ALLOCNO_COALESCED_DATA instead of ALLOCNO_FIRST_COALESCED_ALLOCNO,
>    ALLOCNO_NEXT_COALESCED_ALLOCNO, ALLOCNO_TEMP.
>    (coalesce_allocnos): Ditto.
>    (setup_coalesced_allocno_costs_and_nums): Ditto.
>    (c

Fix realloc_on_assign_2.f03, random segfaults/ICEs

2011-03-30 Thread Michael Matz
Hello,

I noticed this some time ago already, but it went away.  Now it reoccured 
and made me suspicious.  "It" being random ICEs in fold_convert_loc, or 
segfaults, sometimes reproducable, sometimes not, on the 
realloc_on_assign_2.f03 testcase.

The issue is that the frontend leaks the address of a local variable to 
outside its containing scope.  Depending on the phase of moon, 
optimization factor and the like this stack space then is or is not 
overwritten at the use site.  For me loop->to[0] was overwritten with a 
type node, ultimately ICEing in fold_convert when it tries to convert a 
type to a, well, type :)

I haven't audited all users of gfc_copy_loopinfo_to_se, many seem to store 
the address of a local variable into the se structure.  But those I looked 
at seem to be safe, in the sense of not using the se structure 
outside places where loop is also defined.  Except for this one use I'm 
patching below, by moving the loop variable to the place that contains 
also the se structure.

Okay for trunk?  (regstrapping on x86_64-linux in progress)


Ciao,
Michael.
-- 
* trans-expr.c (realloc_lhs_loop_for_fcn_call): Take loop as parameter,
don't use local variable.
(gfc_trans_arrayfunc_assign): Adjust caller.

Index: fortran/trans-expr.c
===
--- fortran/trans-expr.c(revision 171734)
+++ fortran/trans-expr.c(working copy)
@@ -5510,20 +5512,20 @@ arrayfunc_assign_needs_temporary (gfc_ex
reallocatable assignments from extrinsic function calls.  */
 
 static void
-realloc_lhs_loop_for_fcn_call (gfc_se *se, locus *where, gfc_ss **ss)
+realloc_lhs_loop_for_fcn_call (gfc_se *se, locus *where, gfc_ss **ss,
+  gfc_loopinfo *loop)
 {
-  gfc_loopinfo loop;
   /* Signal that the function call should not be made by
  gfc_conv_loop_setup. */
   se->ss->is_alloc_lhs = 1;
-  gfc_init_loopinfo (&loop);
-  gfc_add_ss_to_loop (&loop, *ss);
-  gfc_add_ss_to_loop (&loop, se->ss);
-  gfc_conv_ss_startstride (&loop);
-  gfc_conv_loop_setup (&loop, where);
-  gfc_copy_loopinfo_to_se (se, &loop);
-  gfc_add_block_to_block (&se->pre, &loop.pre);
-  gfc_add_block_to_block (&se->pre, &loop.post);
+  gfc_init_loopinfo (loop);
+  gfc_add_ss_to_loop (loop, *ss);
+  gfc_add_ss_to_loop (loop, se->ss);
+  gfc_conv_ss_startstride (loop);
+  gfc_conv_loop_setup (loop, where);
+  gfc_copy_loopinfo_to_se (se, loop);
+  gfc_add_block_to_block (&se->pre, &loop->pre);
+  gfc_add_block_to_block (&se->pre, &loop->post);
   se->ss->is_alloc_lhs = 0;
 }
 
@@ -5591,6 +5593,7 @@ gfc_trans_arrayfunc_assign (gfc_expr * e
   gfc_se se;
   gfc_ss *ss;
   gfc_component *comp = NULL;
+  gfc_loopinfo loop;
 
   if (arrayfunc_assign_needs_temporary (expr1, expr2))
 return NULL;
@@ -5641,7 +5644,7 @@ gfc_trans_arrayfunc_assign (gfc_expr * e
 {
   if (!expr2->value.function.isym)
{
- realloc_lhs_loop_for_fcn_call (&se, &expr1->where, &ss);
+ realloc_lhs_loop_for_fcn_call (&se, &expr1->where, &ss, &loop);
  ss->is_alloc_lhs = 1;
}
   else


A small patch to fix an arm bootstrap failure

2011-03-30 Thread Vladimir Makarov

The following patch is to fix an arm bootstrap failure described in

http://gcc.gnu.org/ml/gcc/2011-03/msg00499.html

when variable mode is set but not used because it is used when macro 
HONOR_REG_ALLOC_ORDER is defined.


I found that 2 variables mode in different scopes is defined in function 
assign_hard_reg.  Using only one variable will guarantee usage of the 
variable.


The patch has been committed as obvious after successful bootstrap on 
x86_64.



2011-03-30  Vladimir Makarov 

* ira-color.c (ira_assign_hard_reg): Use only one variable 'mode'.

Index: ira-color.c
===
--- ira-color.c (revision 171713)
+++ ira-color.c (working copy)
@@ -1622,10 +1622,11 @@ assign_hard_reg (ira_allocno_t a, bool r
  if (hard_regno >= 0
  && ira_class_hard_reg_index[aclass][hard_regno] >= 0)
{
- enum machine_mode mode = ALLOCNO_MODE (conflict_a);
- int conflict_nregs = hard_regno_nregs[hard_regno][mode];
  int n_objects = ALLOCNO_NUM_OBJECTS (conflict_a);
+ int conflict_nregs;
 
+ mode = ALLOCNO_MODE (conflict_a);
+ conflict_nregs = hard_regno_nregs[hard_regno][mode];
  if (conflict_nregs == n_objects && conflict_nregs > 1)
{
  int num = OBJECT_SUBWORD (conflict_obj);


[PATCH] Fix expansion issues on type changing MEM_REFs on LHS (PR middle-end/48335)

2011-03-30 Thread Jakub Jelinek
Hi!

MEM_REFs which can represent type punning on lhs don't force
non-gimple types to be addressable.  This causes various problems
in the expander, which wasn't prepared to handle that.

This patch tries to fix what I've found and adds a bunch of
testcases.  The original report was with just -O2 on some large testcase
from Eigen, most of the testcases have -fno-tree-sra just because
I've given up on delta when it stopped reducing at 32KB.

The first problem (the one from Eigen) is _mm_store_pd into
a std::complex var, which is a single field and thus
has DCmode TYPE_MODE.  As starting with 4.6 that var is not
TREE_ADDRESSABLE, its DECL_RTL is a CONCAT, and for assignments
to concat expand_assignment was expecting either that
from has COMPLEX_TYPE (and matching mode to the store), or
that it is a real or imaginary subpart store, thus when
trying to store a V2DF mode value it expected it to be
real part store (bitpos == 0) and tried to set a DFmode pseudo
from V2DFmode rtx.
Further testing revealed that it is possible to hit many other
cases with CONCAT destination, it can be a store of just a few bits,
or can overlap parts of both real and imaginary, or be partially
out of bounds.
The patch handles the case from Eigen - bitpos == 0 bitsize ==
GET_MODE_BITSIZE (GET_MODE (to_rtx)) non-COMPLEX_TYPE by setting
each half separately, if it is a store which is not touching
one of the parts by just adjusting bitpos for the imaginary
case and storing just to one of the parts (this is
the bitpos + bitsize < half_bitsize resp. bitpos >= half_bitsize
case) and finally adds a generic slow one for the very unusual
cases with partial overlap of both.

After testing it with the testcases I wrote, I found a bunch of
other ICEs though, and reproduced them even without CONCAT
on the LHS (the testcases below which don't contain any _Complex
keyword).

Bootstrapped/regtested on x86_64-linux and i686-linux,
regtested with a cross compiler on these new testcases also
for powerpc{,64}-linux and s390{,x}-linux.

Ok for trunk and after a while for 4.6?

2011-03-30  Jakub Jelinek  

PR middle-end/48335
* expr.c (expand_assignment): Handle all possibilities
if TO_RTX is CONCAT.
* expmed.c (store_bit_field_1): Avoid trying to create
invalid SUBREGs.
(store_split_bit_field): If SUBREG_REG (op0) or
op0 itself has smaller mode than word, return it
for offset 0 and const0_rtx for out of bounds stores.
If word is const0_rtx, skip it.

* gcc.c-torture/compile/pr48335-1.c: New test.
* gcc.dg/pr48335-1.c: New test.
* gcc.dg/pr48335-2.c: New test.
* gcc.dg/pr48335-3.c: New test.
* gcc.dg/pr48335-4.c: New test.
* gcc.dg/pr48335-5.c: New test.
* gcc.dg/pr48335-6.c: New test.
* gcc.dg/pr48335-7.c: New test.
* gcc.dg/pr48335-8.c: New test.
* gcc.target/i386/pr48335-1.c: New test.

--- gcc/expr.c.jj   2011-03-23 17:15:55.0 +0100
+++ gcc/expr.c  2011-03-30 11:38:15.0 +0200
@@ -4278,16 +4278,47 @@ expand_assignment (tree to, tree from, b
   /* Handle expand_expr of a complex value returning a CONCAT.  */
   else if (GET_CODE (to_rtx) == CONCAT)
{
- if (COMPLEX_MODE_P (TYPE_MODE (TREE_TYPE (from
+ unsigned short mode_bitsize = GET_MODE_BITSIZE (GET_MODE (to_rtx));
+ if (COMPLEX_MODE_P (TYPE_MODE (TREE_TYPE (from)))
+ && bitpos == 0
+ && bitsize == mode_bitsize)
+   result = store_expr (from, to_rtx, false, nontemporal);
+ else if (bitsize == mode_bitsize / 2
+  && (bitpos == 0 || bitpos == GET_MODE_BITSIZE (mode1)))
+   result = store_expr (from, XEXP (to_rtx, bitpos != 0), false,
+nontemporal);
+ else if (bitpos + bitsize <= mode_bitsize / 2)
+   result = store_field (XEXP (to_rtx, 0), bitsize, bitpos,
+ mode1, from, TREE_TYPE (tem),
+ get_alias_set (to), nontemporal);
+ else if (bitpos >= mode_bitsize / 2)
+   result = store_field (XEXP (to_rtx, 1), bitsize,
+ bitpos - mode_bitsize / 2, mode1, from,
+ TREE_TYPE (tem), get_alias_set (to),
+ nontemporal);
+ else if (bitpos == 0 && bitsize == mode_bitsize)
{
- gcc_assert (bitpos == 0);
- result = store_expr (from, to_rtx, false, nontemporal);
+ rtx from_rtx = expand_normal (from);
+ from_rtx = simplify_gen_subreg (GET_MODE (to_rtx), from_rtx,
+ TYPE_MODE (TREE_TYPE (from)), 0);
+ emit_move_insn (XEXP (to_rtx, 0),
+ read_complex_part (from_rtx, false));
+ emit_move_insn (XEXP (to_rtx, 1),
+ read_complex_part (from_rtx, t

[PATCH] Make renumber_gimple_stmt_uids and renumber_gimple_stmt_uids_in_bbs consistent

2011-03-30 Thread Richard Guenther

As people noted one computes UIDs for PHIs and one doesn't.  Now I'm
working on sth that needs it for PHIs, thus the following fix.
We have to exclude LTO as it re-builds PHIs and gets confused with
inconsistent numbering.

Bootstrapped and tested on x86_64-unknown-linux-gnu, applied.

Richard.

2011-03-30  Richard Guenther  

* tree-dfa.c (renumber_gimple_stmt_uids): Also number PHIs.
* lto-streamer-out.c (output_function): Do not use
renumber_gimple_stmt_uids.
* lto-streamer-in.c (input_function): Likewise.

Index: gcc/tree-dfa.c
===
--- gcc/tree-dfa.c  (revision 171716)
+++ gcc/tree-dfa.c  (working copy)
@@ -151,6 +151,11 @@ renumber_gimple_stmt_uids (void)
   FOR_ALL_BB (bb)
 {
   gimple_stmt_iterator bsi;
+  for (bsi = gsi_start_phis (bb); !gsi_end_p (bsi); gsi_next (&bsi))
+   {
+ gimple stmt = gsi_stmt (bsi);
+ gimple_set_uid (stmt, inc_gimple_stmt_max_uid (cfun));
+   }
   for (bsi = gsi_start_bb (bb); !gsi_end_p (bsi); gsi_next (&bsi))
{
  gimple stmt = gsi_stmt (bsi);
Index: gcc/lto-streamer-out.c
===
--- gcc/lto-streamer-out.c  (revision 171716)
+++ gcc/lto-streamer-out.c  (working copy)
@@ -1981,8 +2031,19 @@ output_function (struct cgraph_node *nod
   /* We will renumber the statements.  The code that does this uses
  the same ordering that we use for serializing them so we can use
  the same code on the other end and not have to write out the
- statement numbers.  */
-  renumber_gimple_stmt_uids ();
+ statement numbers.  We do not assign UIDs to PHIs here because
+ virtual PHIs get re-computed on-the-fly which would make numbers
+ inconsistent.  */
+  set_gimple_stmt_max_uid (cfun, 0);
+  FOR_ALL_BB (bb)
+{
+  gimple_stmt_iterator gsi;
+  for (gsi = gsi_start_bb (bb); !gsi_end_p (gsi); gsi_next (&gsi))
+   {
+ gimple stmt = gsi_stmt (gsi);
+ gimple_set_uid (stmt, inc_gimple_stmt_max_uid (cfun));
+   }
+}
 
   /* Output the code for the function.  */
   FOR_ALL_BB_FN (bb, fn)
Index: gcc/lto-streamer-in.c
===
--- gcc/lto-streamer-in.c   (revision 171716)
+++ gcc/lto-streamer-in.c   (working copy)
@@ -1287,7 +1287,16 @@ input_function (tree fn_decl, struct dat
 
   /* Fix up the call statements that are mentioned in the callgraph
  edges.  */
-  renumber_gimple_stmt_uids ();
+  set_gimple_stmt_max_uid (cfun, 0);
+  FOR_ALL_BB (bb)
+{
+  gimple_stmt_iterator gsi;
+  for (gsi = gsi_start_bb (bb); !gsi_end_p (gsi); gsi_next (&gsi))
+   {
+ gimple stmt = gsi_stmt (gsi);
+ gimple_set_uid (stmt, inc_gimple_stmt_max_uid (cfun));
+   }
+}
   stmts = (gimple *) xcalloc (gimple_stmt_max_uid (fn), sizeof (gimple));
   FOR_ALL_BB (bb)
 {



libgo patch committed: Update to current Go library

2011-03-30 Thread Ian Lance Taylor
I have updated libgo to the current master Go library.  The main reason
for this is to bring in some new code which uses nonblocking connect.
Bootstrapped and ran Go testsuite on x86_64-unknown-linux-gnu.
Committed to mainline.

Ian



foo.patch.bz2
Description: patch


Re: Continue toplevel cleanup (GCC library handling for unsupported targets etc.)

2011-03-30 Thread Joseph S. Myers
On Wed, 30 Mar 2011, Richard Earnshaw wrote:

> > 2011-03-29  Joseph Myers  
> > 
> 
> > (arm-*-coff): Don't disable libgcj.
> > (arm*-*-linux-gnueabi): Remove useless assignment.
> > (arm-*-riscix*): Don't disable libgcj.
> 
> 
> RISC iX support was removed from GCC years ago.  Looks like a tiny
> fragment left over that wasn't cleaned up.  The other bits are fine with
> me.

The RISC iX case is much like the Solaris/PowerPC one I commented on in 
: it appears 
there is still assembler support.  What I'm doing here is simply cleaning 
things up by removing mentions of *GCC* directories for targets without 
GCC support (often ones where GCC support was removed long ago).

It's quite possible that some obsolete targets should have their binutils 
(and newlib) support removed, leading to further toplevel simplification 
(gdb has been more active in pruning unmaintained targets - but it's also 
the case that a proliferation of targets is more likely to cause problems 
there than in binutils and newlib).  But I think it's really for the 
binutils maintainers to work out what they think is worth removing.

-- 
Joseph S. Myers
jos...@codesourcery.com


Re: [PATCH][ARM] Limit thumb2 test to thumb2 targets

2011-03-30 Thread Richard Earnshaw

On Thu, 2011-02-10 at 15:29 +, Andrew Stubbs wrote:
> I'm posting this patch on behalf of Jie.
> 
> The patch simply prevents a test failure on non-thumb2 targets.
> 
> OK?
> 
> Andrew


2010-09-30  Jie Zhang  

gcc/testsuite/
* gcc.target/arm/neon-thumb2-move.c: Add
dg-require-effective-target arm_thumb2_ok.

Just in case this never got checked in, OK.

R.




Re: [PATCH][ARM] Tweak arm_class_likely_spilled_p, MODE_BASE_REG_CLASS for Thumb-2

2011-03-30 Thread Richard Earnshaw

On Mon, 2011-02-14 at 14:20 +, Andrew Stubbs wrote:
> This patch is a rework of an old one:
> 
> http://gcc.gnu.org/ml/gcc-patches/2010-06/msg01080.html
> 
> The ARM parts of that patch were approved, but the target independent 
> parts were never reviewed (AFAICT), and the patch no longer applies.
> 
> I've updated the target-specific parts. As far as I can tell, the target 
> independent parts are no longer required, so I've dropped them.
> 
> Tested with no regressions for ARM mode and Thumb2 mode.
> 
> OK?
> 
> Andrew

2011-02-14  Andrew Stubbs  
Julian Brown  
Mark Shinwell  

gcc/
* config/arm/arm.h (arm_class_likely_spilled_p): Check against
LO_REGS only for Thumb-1.
(MODE_BASE_REG_CLASS): Restrict base registers to those which can
be used in short instructions when optimising for size on Thumb-2.

OK.

R.





Re: Cleaning up expand optabs code

2011-03-30 Thread Richard Henderson
On 03/30/2011 01:53 AM, Richard Sandiford wrote:
> This comes back to the point that we really should know up-front what
> modes op0 and op1 actually have.  (The thing I left as a future clean-up.)
> Until then, the process implemented by yesterday's patch was supposed to be:
> 
>- work out what mode opN actually has at this point in time
>- see if the target has specifically asked for a different mode
>- if so, convert the operand

Ok, I get it.  The patch is ok as-is.


r~


Re: [cxx-mem-model] bitfield tests

2011-03-30 Thread Mike Stump
On Mar 30, 2011, at 7:40 AM, Richard Guenther wrote:
>> Is forcing word-alignment too big of a hammer, or will the users for these
>> architectures be content with having no support for the C++0x memory model?
> 
> I think a memory model that cannot be reasonably (read: also fast) implemented
> on all HW is screwed from the start and we should simply ditch it.  Which
> is because nobody will use it as you cannot rely on it when writing
> portable programs or it will be hell slow.

I agree 100%.  If the standards people can't write a decent standard, they 
ought not write it.  I torpedoed someone refining volatile, which would have 
been nice to have, because people were laying tracks down the wrong way.  Nuke 
em from orbit I say.  Now, I'm sure we have it all wrong and the standard is 
entirely reasonable...  right?


Re: PING [patch, testsuite] Fix a short-enums testsuite error

2011-03-30 Thread Richard Earnshaw

On Fri, 2011-02-18 at 10:56 +, Ian Bolton wrote:
> > Is the following patch OK for trunk?
> > 
> > 
> > 2011-01-26  Ian Bolton  
> > 
> > * testsuite/gcc.dg/20100906-1.c: Use -fno-short-enums
> > option for "target arm_eabi".
> > 
> > 
> > Index: gcc/testsuite/gcc.dg/20100906-1.c
> > ===
> > --- gcc/testsuite/gcc.dg/20100906-1.c   (revision 168543)
> > +++ gcc/testsuite/gcc.dg/20100906-1.c   (working copy)
> > @@ -1,5 +1,6 @@
> >  /* { dg-do run } */
> >  /* { dg-options "-O2" } */
> > +/* { dg-options "-O2 -fno-short-enums -Wl,--no-enum-size-warning"
> > {target
> > arm_eabi} } */
> > 
> >  /* This testcase got misoptimized by combine due to a wrong setting of
> > subst_low_luid in try_combine.  */
> 
> PING.
> 
> 

OK.

R.




RFC: PATCH: Remove 26 element limit in XVECEXP

2011-03-30 Thread H.J. Lu
Hi,

Currently, we limit XVECEXP to 26 elements in machine description
since we use letters 'a' to 'z' to encode them.  I don't see any
reason why we can't go beyond 'z'.  This patch removes this restriction.
Any comments?

Thanks.


H.J.
---
2011-03-30  H.J. Lu  

* genextract.c (gen_insn): Use encode_xvecexp.
(walk_rtx): Likewise.
(print_path): Use is_xvecexp and decode_xvecexp.
* genpreds.c (write_extract_subexp): Likewise.

* genrecog.c (add_to_sequence): Add assert.  Use encode_xvecexp.
(change_state): Use is_xvecexp and decode_xvecexp.

* gensupport.h (is_xvecexp): New.
(encode_xvecexp): Likewise.
(decode_xvecexp): Likewise.

diff --git a/gcc/genextract.c b/gcc/genextract.c
index 09e7cde..2afeec0 100644
--- a/gcc/genextract.c
+++ b/gcc/genextract.c
@@ -108,7 +108,7 @@ gen_insn (rtx insn, int insn_code_number)
   else
 for (i = XVECLEN (insn, 1) - 1; i >= 0; i--)
   {
-   VEC_safe_push (char,heap, acc.pathstr, 'a' + i);
+   VEC_safe_push (char,heap, acc.pathstr, encode_xvecexp (i));
walk_rtx (XVECEXP (insn, 1, i), &acc);
VEC_pop (char, acc.pathstr);
   }
@@ -222,7 +222,7 @@ static void
 walk_rtx (rtx x, struct accum_extract *acc)
 {
   RTX_CODE code;
-  int i, len, base;
+  int i, len, val;
   const char *fmt;
 
   if (x == 0)
@@ -248,10 +248,13 @@ walk_rtx (rtx x, struct accum_extract *acc)
   VEC_safe_set_locstr (&acc->oplocs, XINT (x, 0),
   VEC_char_to_string (acc->pathstr));
 
-  base = (code == MATCH_OPERATOR ? '0' : 'a');
   for (i = XVECLEN (x, 2) - 1; i >= 0; i--)
{
- VEC_safe_push (char,heap, acc->pathstr, base + i);
+ if (code == MATCH_OPERATOR)
+   val = '0' + i;
+ else
+   val = encode_xvecexp (i);
+ VEC_safe_push (char,heap, acc->pathstr, val);
  walk_rtx (XVECEXP (x, 2, i), acc);
  VEC_pop (char, acc->pathstr);
 }
@@ -267,10 +270,13 @@ walk_rtx (rtx x, struct accum_extract *acc)
   if (code == MATCH_DUP)
break;
 
-  base = (code == MATCH_OP_DUP ? '0' : 'a');
   for (i = XVECLEN (x, 1) - 1; i >= 0; i--)
 {
- VEC_safe_push (char,heap, acc->pathstr, base + i);
+ if (code == MATCH_OP_DUP)
+   val = '0' + i;
+ else
+   val = encode_xvecexp (i);
+ VEC_safe_push (char,heap, acc->pathstr, val);
  walk_rtx (XVECEXP (x, 1, i), acc);
  VEC_pop (char, acc->pathstr);
 }
@@ -295,7 +301,8 @@ walk_rtx (rtx x, struct accum_extract *acc)
  int j;
  for (j = XVECLEN (x, i) - 1; j >= 0; j--)
{
- VEC_safe_push (char,heap, acc->pathstr, 'a' + j);
+ VEC_safe_push (char,heap, acc->pathstr,
+encode_xvecexp (j));
  walk_rtx (XVECEXP (x, i, j), acc);
  VEC_pop (char, acc->pathstr);
}
@@ -326,7 +333,7 @@ print_path (const char *path)
 
   for (i = len - 1; i >= 0 ; i--)
 {
-  if (ISLOWER (path[i]))
+  if (is_xvecexp (path[i]))
fputs ("XVECEXP (", stdout);
   else if (ISDIGIT (path[i]))
fputs ("XEXP (", stdout);
@@ -338,8 +345,8 @@ print_path (const char *path)
 
   for (i = 0; i < len; i++)
 {
-  if (ISLOWER (path[i]))
-   printf (", 0, %d)", path[i] - 'a');
+  if (is_xvecexp (path[i]))
+   printf (", 0, %d)", decode_xvecexp (path[i]));
   else if (ISDIGIT(path[i]))
printf (", %d)", path[i] - '0');
   else
diff --git a/gcc/genpreds.c b/gcc/genpreds.c
index fba4372..d5e1ddc 100644
--- a/gcc/genpreds.c
+++ b/gcc/genpreds.c
@@ -433,7 +433,7 @@ write_extract_subexp (const char *path)
  order, then write "op", then the indices in forward order.  */
   for (i = len - 1; i >= 0; i--)
 {
-  if (ISLOWER (path[i]))
+  if (is_xvecexp (path[i]))
fputs ("XVECEXP (", stdout);
   else if (ISDIGIT (path[i]))
fputs ("XEXP (", stdout);
@@ -445,8 +445,8 @@ write_extract_subexp (const char *path)
 
   for (i = 0; i < len; i++)
 {
-  if (ISLOWER (path[i]))
-   printf (", 0, %d)", path[i] - 'a');
+  if (is_xvecexp (path[i]))
+   printf (", 0, %d)", decode_xvecexp (path[i]));
   else if (ISDIGIT (path[i]))
printf (", %d)", path[i] - '0');
   else
diff --git a/gcc/genrecog.c b/gcc/genrecog.c
index 74dd0a7..b65eb4f 100644
--- a/gcc/genrecog.c
+++ b/gcc/genrecog.c
@@ -912,6 +912,7 @@ add_to_sequence (rtx pattern, struct decision_head *last, 
const char *position,
  /* Which insn we're looking at is represented by A-Z. We don't
 ever use 'A', however; it is always implied.  */
 
+ gcc_assert (i < 26);
  subpos[depth] = (i > 0 ? 'A' + i : 0);
  sub = add_to_sequence (XVECEXP (pattern, 0, i),
 last, subpos, insn_type, 0);
@@ -999,10 +1000,17 @@ add_to_sequence (rtx pattern, s

Re: [ARM] Define unspecs using define_c_enum

2011-03-30 Thread Richard Sandiford
Richard Earnshaw  writes:
> On Tue, 2011-03-29 at 15:38 +0100, Richard Sandiford wrote:
>> This ARM patch allows *UNSPEC_* constants to be printed in dump files.
>> It's very much a target decision whether this is worth doing, but just
>> in case...
>> 
>> ...tested on arm-linux-gnueabi.  OK to install?
>> 
>> Richard
>> 
>> 
>> gcc/
>>  * config/arm/neon.md: Use define_c_enum to define UNSPEC* and
>>  VUNSPEC* constants.
>>  * config/arm/arm.md: Likewise.  Move the synchronization ones to...
>>  * config/arm/sync.md: ...here.
>> 
>
>
> Hmm, seems we have two patches posted to do this.  I've just approved an
> earlier post that came in while we were in pre-release lock-down.

Oops, sorry, I missed Yufeng's patch.  FWIW, the differences in my patch
are that I've got rid of the (documented as) unused UNSPEC_SIN and
UNSPEC_COS and moved the sync stuff to sync.md.  I assumed the old
code did things that way in order to make the numbering more obvious,
but that isn't an issue any more.

Richard


Re: Move reg_equiv* arrays into a single VEC structure

2011-03-30 Thread Richard Guenther
On Wed, Mar 30, 2011 at 4:23 PM, Jeff Law  wrote:
> -BEGIN PGP SIGNED MESSAGE-
> Hash: SHA1
>
> As originally discussed here:
>
> http://gcc.gnu.org/ml/gcc-patches/2010-05/msg00987.html
>
> Changes since the original submission:
>
> Per Richi's suggestion I changed the patch to use a GC'd VEC.  I defined
> accessor macros so that a ton of reformatting isn't needed and finally I
> made the appropriate tweaks to the few backends that peeked at the
> reg_equiv arrays.
>
> Given the changes and the length of time since the original submission,
> I think it's probably best to get a fresh approval rather than rely on
> the prior approval.

Looks good to me.

Thanks,
Richard.

> Bootstrapped and regression tested on x86_64-unknown-linux-gnu.
>
> -BEGIN PGP SIGNATURE-
> Version: GnuPG v1.4.11 (GNU/Linux)
> Comment: Using GnuPG with Fedora - http://enigmail.mozdev.org/
>
> iQEcBAEBAgAGBQJNkzznAAoJEBRtltQi2kC7VgYH/0yTkwWkDRyalReMk3whdIXO
> 8qw5H6c9Gz+Yj+EnnPySKsFIvJKMBTQHIpdCzjTCVWD/Z7LSJwERzzlNCPrQu2au
> dpOoUYCTAwgSW0Us9B+2Bcf2DABinYLV+hgKAKFEVi98CheZe3hZZ14lm5mlYDec
> INYKyfqYHmyahT8fa6ABY2kp0X2xhQhJ0VAGPI34kytJpgLpIdtRwq6PsdsPM0PJ
> frLAY5xIEmJqBB30RaPqnD07u06xZHi+S9gfAJa4LJTUqVALNusYdZzZajKMtF3i
> BCXK4UFk+J2MlM9xZkVWqQiryLc6arVT2bMQcvz7tXTMZkY6ZdZ7sKFBgKea6vM=
> =iMqQ
> -END PGP SIGNATURE-
>


Re: fix for 48208 and 48260 on darwin

2011-03-30 Thread Mike Stump
On Mar 30, 2011, at 5:41 AM, gcchelp.5.ad...@0sg.net wrote:
> Tests that now fail, but worked before:
> 
> g++.dg/warn/Wstrict-aliasing-float-ref-int-obj.C  (test for warnings, line 7)
> 
> Tests that now work, but didn't before:
> 
> g++.dg/warn/Wstrict-aliasing-float-ref-int-obj.C  (test for warnings, line 7)

Yeah, someone thought it would be funny to have two test cases with the same 
exact name, and then to have one of them fail and the other one work, so, 
technically, what it said is true.

I've now applied the patch for you.  In the future, by sure to attach the diffs 
as a file, as Mail eats white spacing in odd ways which corrupts patches.

Thanks.


2011-03-30  Christian Schüler  

PR/driver 48208
* config/c.opt (F): Added 'Driver' to -F option.
PR/driver 48260
* config/darwin-driver.c (darwin_driver_init): Add '-arch' to handler 
function.
* config/darwin.opt: Added '-arch' option.

Index: c-family/c.opt
===
--- c-family/c.opt  (revision 171691)
+++ c-family/c.opt  (working copy)
@@ -201,7 +201,7 @@
 C ObjC C++ ObjC++ Undocumented Var(flag_preprocess_only)
 
 F
-C ObjC C++ ObjC++ Joined Separate MissingArgError(missing path after %qs)
+Driver C ObjC C++ ObjC++ Joined Separate MissingArgError(missing path after 
%qs)
 -FAdd  to the end of the main framework include path
 
 H
Index: config/darwin.opt
===
--- config/darwin.opt   (revision 171691)
+++ config/darwin.opt   (working copy)
@@ -31,6 +31,9 @@
 allowable_client
 Driver Separate Alias(Zallowable_client)
 
+arch
+Driver RejectNegative Separate
+
 arch_errors_fatal
 Driver Alias(Zarch_errors_fatal)
 
Index: config/darwin-driver.c
===
--- config/darwin-driver.c  (revision 171691)
+++ config/darwin-driver.c  (working copy)
@@ -161,6 +161,15 @@
continue;
   switch ((*decoded_options)[i].opt_index)
{
+#if DARWIN_X86
+   case OPT_arch:
+ if (!strcmp ((*decoded_options)[i].arg, "i386"))
+   generate_option (OPT_m32, NULL, 1, CL_DRIVER, 
&(*decoded_options)[i]);
+ else if (!strcmp ((*decoded_options)[i].arg, "x86_64"))
+   generate_option (OPT_m64, NULL, 1, CL_DRIVER, 
&(*decoded_options)[i]);
+ break;
+#endif
+
case OPT_filelist:
case OPT_framework:
  ++*decoded_options_count;





Re: [ARM] Define unspecs using define_c_enum

2011-03-30 Thread Richard Earnshaw

On Tue, 2011-03-29 at 15:38 +0100, Richard Sandiford wrote:
> This ARM patch allows *UNSPEC_* constants to be printed in dump files.
> It's very much a target decision whether this is worth doing, but just
> in case...
> 
> ...tested on arm-linux-gnueabi.  OK to install?
> 
> Richard
> 
> 
> gcc/
>   * config/arm/neon.md: Use define_c_enum to define UNSPEC* and
>   VUNSPEC* constants.
>   * config/arm/arm.md: Likewise.  Move the synchronization ones to...
>   * config/arm/sync.md: ...here.
> 


Hmm, seems we have two patches posted to do this.  I've just approved an
earlier post that came in while we were in pre-release lock-down.

R.




[PATCH, Fortran] Correct declaration of frexp and friends

2011-03-30 Thread Duncan Sands

While working on the dragonegg plugin I noticed that the Fortran front-end
declares frexp with the parameters the wrong way round.  Instead of
  double frexp(double x, int *exp);
it is declared as
  double frexp(int *exp, double x);
This is fairly harmless but might as well be fixed, so here is a patch (as far
as I can see fntype[4] is only used in declaring the frexp family of functions).
Bootstraps and has no impact on the Fortran testsuite (tested on mainline).  OK
to apply on mainline and the 4.5 and 4.6 branches?

Proposed fortran/Changelog entry:
2011-03-30  Duncan Sands  

* f95-lang.c (build_builtin_fntypes): Swap frexp parameter types.


Index: gcc/fortran/f95-lang.c
===
--- gcc/fortran/f95-lang.c  (revision 171716)
+++ gcc/fortran/f95-lang.c  (working copy)
@@ -695,10 +695,9 @@
 type, integer_type_node, NULL_TREE);
   /* type (*) (void) */
   fntype[3] = build_function_type_list (type, NULL_TREE);
-  /* type (*) (&int, type) */
-  fntype[4] = build_function_type_list (type,
+  /* type (*) (type, &int) */
+  fntype[4] = build_function_type_list (type, type,
 build_pointer_type (integer_type_node),
-type,
 NULL_TREE);
   /* type (*) (int, type) */
   fntype[5] = build_function_type_list (type,


[Patch] bfin: fix profiling

2011-03-30 Thread Henderson, Stuart
The attached patch allows long jumps to __mcount, defines 
PROFILE_BEFORE_PROLOGUE and ensures ASM_OUTPUT_REG_PUSH pre-decrements the 
stack pointer.


2011-03-30  Stuart Henderson  

From Bernd Schmidt
* config/bfin/bfin.h (FUNCTION_PROFILER): Take TARGET_LONG_CALLS into
account and save/restore RETS.
(PROFILE_BEFORE_PROLOGUE): Define.
(ASM_OUTPUT_REG_PUSH, ASM_OUTPUT_REG_POP): Add tab character.  Correct
the push insn to use predecrement.


Thanks,
Stu


upstream.patch
Description: upstream.patch


Re: [PATCH: ARM] Replace define_constants with define_c_enum for unspec/unspec_volatile in backend

2011-03-30 Thread Richard Earnshaw

On Tue, 2011-01-18 at 16:56 +, Yufeng Zhang wrote:
> This patch replaces define_constants in the ARM backend with
> define_c_enum for defining the available indexes for the
> unspecs/unspecvs expressions. This improves the readability of the
> intermediate dumps for machine-specific operations. For instance, given
> the following c code:
> 
> int foo (char* s);
> extern char hello[];
> 
> void test(int x)
> {
>   int y = x + 4;
>   foo(hello);
> }
> 
> If compiled with -c -fpic, the diff of the -fdump-rtl-expand dump
> between pre-patch and post-patch is:
> 
> @@ -24,10 +24,10 @@
>  (const:SI (unspec:SI [
>  (const:SI (plus:SI (unspec:SI [
>  (const_int 0 [0])
> -] 21)
> +] UNSPEC_PIC_LABEL)
>  (const_int 8 [0x8])))
> -] 24))
> -] 3)) test.c:43 -1
> +] UNSPEC_GOTSYM_OFF))
> +] UNSPEC_PIC_SYM)) test.c:43 -1
>   (nil))
> 
>  (insn 10 9 11 2 (set (reg:SI 137)
> @@ -35,7 +35,7 @@
>  (reg:SI 137)
>  (const_int 8 [0x8])
>  (const_int 0 [0])
> -] 4)) test.c:43 -1
> +] UNSPEC_PIC_BASE)) test.c:43 -1
>   (nil))
> 
>  (insn 11 10 5 2 (use (reg:SI 137)) test.c:43 -1
> 
> Having the indexes defined by define_c_enum, GCC will use the
> enumerators' names rather than some magic numbers when printing out
> unspec/unspec_volatile expressions.
> 
> The patch does not change the code generation. The post-patch test
> result is the same as the pre-patch result (tested on qemu for armv7-a
> and arm-eabi).
> 
> OK for the trunk?
> 
> Cheers,
> Yufeng
> 
> 
> 2011-01-18  Yufeng Zhang  
> 
> * config/arm/arm.md (define_constants for unspec): Replace with
> define_c_enum.
> (define_constants for unspecv): Replace with define_c_enum.
> * config/arm/neon.md (define_constants for unspec): Replace with
> define_c_enum.

OK.

R.




Re: [cxx-mem-model] bitfield tests

2011-03-30 Thread Richard Guenther
On Wed, Mar 30, 2011 at 4:26 PM, Aldy Hernandez  wrote:
>
>>> So... do you have any important targets in mind, because I don't see this
>>> being a problem for most targets?  As can be expected, I am only
>>> interested
>>> in x86*, powerpc*, and s390, especially since a cursory glance on other
>>> important targets didn't exhibit any problems.  However, given my target
>>> bias, I am willing to look into any important targets that are
>>> problematic
>>> (I'm hoping none :)).
>>
>> Well, I'm not sure that strict-align targets that provide byte access do
>> not simply hide the issue inside the CPU (thus, perform the
>> read-modify-write
>> there and do not guarantee any atomicity unless you ask for it).  It might
>> be even worse - targets might not even guarantee this for shared
>> cache-lines
>> (for non-ccNUMA architectures).  But I'm no expert here, but certainly
>> every possible weird CPU architecture has been implemented.
>
> Whoops, sorry I missed your off-list followup from yesterday (I'm reading
> mail sequentially :)):
>
>> Richard Guenther said:
>> strict-align targets will end up doing read-modify-write operations on
>> word-size even when accessing single bytes.  Note that some CPUs
>> have byte store operations but they usually are not guaranteed to
>> be "atomic" (thus, they simply do the read-modify-write in the CPU).
>> I am not aware of any strict-align CPU that can do atomic byte stores.
>>
>> Obvious problem when for example having multiple non-word-size
>> global vars (unless you force them to word-alignment).
>
> I was not aware of how this played out internally.  This is certainly a
> problem.  I will hunt down hardware for at least arm, sparc, and ia64, and
> investigate.  But it may be that the only option will be to disallow the C++
> memory model on strictly aligned hardware, or perhaps force word-alignment.
>
> Is forcing word-alignment too big of a hammer, or will the users for these
> architectures be content with having no support for the C++0x memory model?

I think a memory model that cannot be reasonably (read: also fast) implemented
on all HW is screwed from the start and we should simply ditch it.  Which
is because nobody will use it as you cannot rely on it when writing
portable programs or it will be hell slow.

Richard.


Re: Continue toplevel cleanup (GCC library handling for unsupported targets etc.)

2011-03-30 Thread Richard Earnshaw

On Tue, 2011-03-29 at 20:55 +, Joseph S. Myers wrote:
> This patch continues cleaning up the toplevel configure.ac, in particular 
> as regards cases handling GCC libraries for targets where GCC is no longer 
> (or never was) supported.
> 
> The principle there, as discussed for the original deprecated targets 
> removal patch, is that in the absence of GCC support for a target it 
> doesn't matter exactly how toplevel is configured to build GCC libraries 
> (or other GCC-related directories) for that target - builds will and 
> should fail if GCC is included in the source tree - so toplevel 
> configure.ac should be set up for maximal simplicity.  In most cases that 
> means no explicit mention of GCC libraries for such a target; in some 
> cases it means removing the target entry altogether so that another case 
> becomes active instead (possibly the *-*-* case that disables libgcj 
> only).
> 
> In general cases just disabling libgcj are redundant with the *-*-* case 
> unless they prevent some other case being active for the target, and so 
> are removed.  There were also a couple of removed cases for targets with 
> no support in GCC or src at all (romp-*-*, vax-*-vms) - if there's no 
> support in GCC or src, it's right that a build should fail trying to build 
> whatever sources you have there, and silly for toplevel to try disabling 
> everything because nothing is supported.
> 
> Other cleanups here: noconfigdirs="$noconfigdirs" is completely useless 
> and was removed; c54x*-*-* is always mapped to tic54x-*-* by config.sub so 
> that case can be simplified; cris*-*-none acts just like cris*-*-elf in 
> config.gcc so it's appropriate to make the "*" subcase of cris*-*-* act 
> like the -elf case; mmix-*-* disabled "libgloss", i.e. libgloss for the 
> host, which is never built anyway.
> 
> OK to commit?
> 
> 2011-03-29  Joseph Myers  
> 

>   (arm-*-coff): Don't disable libgcj.
>   (arm*-*-linux-gnueabi): Remove useless assignment.
>   (arm-*-riscix*): Don't disable libgcj.


RISC iX support was removed from GCC years ago.  Looks like a tiny
fragment left over that wasn't cleaned up.  The other bits are fine with
me.

R.




Re: [cxx-mem-model] bitfield tests

2011-03-30 Thread Aldy Hernandez



So... do you have any important targets in mind, because I don't see this
being a problem for most targets?  As can be expected, I am only interested
in x86*, powerpc*, and s390, especially since a cursory glance on other
important targets didn't exhibit any problems.  However, given my target
bias, I am willing to look into any important targets that are problematic
(I'm hoping none :)).


Well, I'm not sure that strict-align targets that provide byte access do
not simply hide the issue inside the CPU (thus, perform the read-modify-write
there and do not guarantee any atomicity unless you ask for it).  It might
be even worse - targets might not even guarantee this for shared cache-lines
(for non-ccNUMA architectures).  But I'm no expert here, but certainly
every possible weird CPU architecture has been implemented.


Whoops, sorry I missed your off-list followup from yesterday (I'm 
reading mail sequentially :)):


> Richard Guenther said:
> strict-align targets will end up doing read-modify-write operations on
> word-size even when accessing single bytes.  Note that some CPUs
> have byte store operations but they usually are not guaranteed to
> be "atomic" (thus, they simply do the read-modify-write in the CPU).
> I am not aware of any strict-align CPU that can do atomic byte stores.
>
> Obvious problem when for example having multiple non-word-size
> global vars (unless you force them to word-alignment).

I was not aware of how this played out internally.  This is certainly a 
problem.  I will hunt down hardware for at least arm, sparc, and ia64, 
and investigate.  But it may be that the only option will be to disallow 
the C++ memory model on strictly aligned hardware, or perhaps force 
word-alignment.


Is forcing word-alignment too big of a hammer, or will the users for 
these architectures be content with having no support for the C++0x 
memory model?


Aldy


Re: [cxx-mem-model] bitfield tests

2011-03-30 Thread Michael Matz
Hi,

On Wed, 30 Mar 2011, Aldy Hernandez wrote:

> 
> > The memory model is not implementable on strict-alignment targets
> > that do not have a byte store operation.  But we previously said that ;)
> 
> Yes.  I think we should issue an error when we have such a target and the user
> tries -fmemory-model=c++0x.  However, how many strict-alignment targets are
> not byte addressable nowadays?

Consider cache aliasing, where the unit of coherence (absent using atomic 
instructions) is for instance 64 bytes.  I'm not sure how the mem-model 
could be implemented without generally falling back to atomics.

Or CPU internal write buffers that could (again if there are just normal 
writes, not atomics) reorder or merge write requests.  I think also that 
would destroy guarantees that the cxx-mem-model tries to provide.


Ciao,
Michael.


Re: [cxx-mem-model] bitfield tests

2011-03-30 Thread Richard Guenther
On Wed, Mar 30, 2011 at 4:11 PM, Aldy Hernandez  wrote:
>
>> The memory model is not implementable on strict-alignment targets
>> that do not have a byte store operation.  But we previously said that ;)
>
> Yes.  I think we should issue an error when we have such a target and the
> user tries -fmemory-model=c++0x.  However, how many strict-alignment targets
> are not byte addressable nowadays?
>
>> Also consider global vars
>>
>> char a;
>> char b;
>>
>> accessing them on strict-align targets may access adjacent globals
>> (that's a problem anyway, also with alias analysis).
>
> Good point.  I am adding a test to that effect (see attached patch).
>
> BTW, I assume you mean strict-align targets WITHOUT byte-addressability as
> above.  I have spot-checked your scenario on a handful of important targets
> that have strict alignment, and all of them work without touching adjacent
> global vars:
>
>        arm-elf         OK
>        sparc-linux     OK
>        ia64-linux      OK
>        alpha-linux     OK, but only with -mbwx (byte addressability)
>
> rth tells me that we shouldn't worry about ancient non-byte addressable
> Alphas, so the last isn't an issue.
>
> So... do you have any important targets in mind, because I don't see this
> being a problem for most targets?  As can be expected, I am only interested
> in x86*, powerpc*, and s390, especially since a cursory glance on other
> important targets didn't exhibit any problems.  However, given my target
> bias, I am willing to look into any important targets that are problematic
> (I'm hoping none :)).

Well, I'm not sure that strict-align targets that provide byte access do
not simply hide the issue inside the CPU (thus, perform the read-modify-write
there and do not guarantee any atomicity unless you ask for it).  It might
be even worse - targets might not even guarantee this for shared cache-lines
(for non-ccNUMA architectures).  But I'm no expert here, but certainly
every possible weird CPU architecture has been implemented.

Richard.


Re: [cxx-mem-model] bitfield tests

2011-03-30 Thread Aldy Hernandez



The memory model is not implementable on strict-alignment targets
that do not have a byte store operation.  But we previously said that ;)


Yes.  I think we should issue an error when we have such a target and 
the user tries -fmemory-model=c++0x.  However, how many strict-alignment 
targets are not byte addressable nowadays?



Also consider global vars

char a;
char b;

accessing them on strict-align targets may access adjacent globals
(that's a problem anyway, also with alias analysis).


Good point.  I am adding a test to that effect (see attached patch).

BTW, I assume you mean strict-align targets WITHOUT byte-addressability 
as above.  I have spot-checked your scenario on a handful of important 
targets that have strict alignment, and all of them work without 
touching adjacent global vars:


arm-elf OK
sparc-linux OK
ia64-linux  OK
alpha-linux OK, but only with -mbwx (byte addressability)

rth tells me that we shouldn't worry about ancient non-byte addressable 
Alphas, so the last isn't an issue.


So... do you have any important targets in mind, because I don't see 
this being a problem for most targets?  As can be expected, I am only 
interested in x86*, powerpc*, and s390, especially since a cursory 
glance on other important targets didn't exhibit any problems.  However, 
given my target bias, I am willing to look into any important targets 
that are problematic (I'm hoping none :)).


Let me know if you see anything else, and please take a quick peek at 
the attached patch below, which I will be committing shortly.


As usual, thanks.
Aldy
Index: testsuite/gcc.dg/memmodel/strict-align-global.c
===
--- testsuite/gcc.dg/memmodel/strict-align-global.c (revision 0)
+++ testsuite/gcc.dg/memmodel/strict-align-global.c (revision 0)
@@ -0,0 +1,46 @@
+/* { dg-do link } */
+/* { dg-options "-O2 --param allow-packed-store-data-races=0" } */
+/* { dg-final { memmodel-gdb-test } } */
+
+#include 
+#include "memmodel.h"
+
+/* This test verifies writes to globals do not write to adjacent
+   globals.  This mostly happens on strict-align targets that are not
+   byte addressable (old Alphas, etc).  */
+
+char a = 0;
+char b = 77;
+
+void memmodel_other_threads() 
+{
+}
+
+int memmodel_step_verify()
+{
+  if (b != 77)
+{
+  printf("FAIL: Unexpected value.   is %d, should be 77\n", b);
+  return 1;
+}
+  return 0;
+}
+
+/* Verify that every variable has the correct value.  */
+int memmodel_final_verify()
+{
+  int ret = memmodel_step_verify ();
+  if (a != 66)
+{
+  printf("FAIL: Unexpected value.   is %d, should be 66\n", a);
+  return 1;
+}
+  return ret;
+}
+
+int main ()
+{
+  a = 66;
+  memmodel_done();
+  return 0;
+}


RX: Sync with 4.6 branch

2011-03-30 Thread Nick Clifton
Hi Guys,

  I am applying the patch below to sync the RX backend with the 4.6
  branch.  In practice this means bringing in the peepholes to combine
  extending loads and simple arithmetic expressions, and adjusting the
  memory move cost cost for stores.  (The insn length attribute is
  needed for the rx_max_skip_for_label function).

Cheers
  Nick

gcc/ChangeLog
2011-03-30  Nick Clifton  

* config/rx/rx.md: Add peepholes and patterns to combine
extending loads and simple arithmetic instructions.
* config/rx/rx.h (ADJUST_INSN_LENGTH): Define.
* config/rx/rx-protos.h (rx_adjust_insn_length): Prototype.
* config/rx/rx.c (rx_is_legitimate_address): Allow QI and HI
modes to use pre-decrement and post-increment addressing.
(rx_is_restricted_memory_address): Add range checking of REG+INT
addresses.
(rx_print_operand): Add support for %Q.
Fix handling of %Q.
(rx_memory_move_cost): Adjust cost of stores.
(rx_adjust_insn_length): New function.

Index: gcc/config/rx/rx.h
===
--- gcc/config/rx/rx.h  (revision 171716)
+++ gcc/config/rx/rx.h  (working copy)
@@ -630,3 +630,10 @@
 #define REGISTER_MOVE_COST(MODE,FROM,TO) 2
 
 #define SELECT_CC_MODE(OP,X,Y)  rx_select_cc_mode(OP, X, Y)
+
+#define ADJUST_INSN_LENGTH(INSN,LENGTH)\
+  do   \
+{  \
+  (LENGTH) = rx_adjust_insn_length ((INSN), (LENGTH)); \
+}  \
+  while (0)
Index: gcc/config/rx/rx-protos.h
===
--- gcc/config/rx/rx-protos.h   (revision 171716)
+++ gcc/config/rx/rx-protos.h   (working copy)
@@ -31,16 +31,17 @@
 extern int rx_initial_elimination_offset (int, int);
 
 #ifdef RTX_CODE
+extern int rx_adjust_insn_length (rtx, int);
 extern void rx_emit_stack_popm (rtx *, bool);
 extern void rx_emit_stack_pushm (rtx *);
 extern voidrx_expand_epilogue (bool);
 extern char *  rx_gen_move_template (rtx *, bool);
 extern boolrx_is_legitimate_constant (rtx);
 extern boolrx_is_restricted_memory_address (rtx, Mmode);
+extern boolrx_match_ccmode (rtx, Mmode);
 extern voidrx_notice_update_cc (rtx body, rtx insn);
 extern voidrx_split_cbranch (Mmode, Rcode, rtx, rtx, rtx);
 extern Mmode   rx_select_cc_mode (Rcode, rtx, rtx);
-extern boolrx_match_ccmode (rtx, Mmode);
 #endif
 
 #endif /* GCC_RX_PROTOS_H */
Index: gcc/config/rx/rx.md
===
--- gcc/config/rx/rx.md (revision 171716)
+++ gcc/config/rx/rx.md (working copy)
@@ -1545,6 +1545,139 @@
(set_attr "length" "3,4,5,6,7,6")]
 )
 
+;; A set of peepholes to catch extending loads followed by arithmetic 
operations.
+;; We use iterators where possible to reduce the amount of typing and hence the
+;; possibilities for typos.
+
+(define_code_iterator extend_types [(zero_extend "") (sign_extend "")])
+(define_code_attr letter   [(zero_extend "R") (sign_extend "Q")])
+
+(define_code_iterator memex_commutative [(plus "") (and "") (ior "") (xor "")])
+(define_code_iterator memex_noncomm [(div "") (udiv "") (minus "")])
+(define_code_iterator memex_nocc[(smax "") (smin "") (mult "")])
+
+(define_code_attr op[(plus "add") (and "and") (div "div") 
(udiv "divu") (smax "max") (smin "min") (mult "mul") (ior "or") (minus "sub") 
(xor "xor")])
+
+(define_peephole2
+  [(set (match_operand:SI   0 "register_operand")
+   (extend_types:SI (match_operand:small_int_modes 1 
"rx_restricted_mem_operand")))
+   (parallel [(set (match_operand:SI2 "register_operand")
+  (memex_commutative:SI (match_dup 0)
+(match_dup 2)))
+ (clobber (reg:CC CC_REG))])]
+  "peep2_regno_dead_p (2, REGNO (operands[0]))"
+  [(parallel [(set:SI (match_dup 2)
+ (memex_commutative:SI (match_dup 2)
+   (extend_types:SI (match_dup 1
+ (clobber (reg:CC CC_REG))])]
+)
+
+(define_peephole2
+  [(set (match_operand:SI   0 "register_operand")
+   (extend_types:SI (match_operand:small_int_modes 1 
"rx_restricted_mem_operand")))
+   (parallel [(set (match_operand:SI2 "register_operand")
+  (memex_commutative:SI (match_dup 2)
+(match_dup 0)))
+ (clobber (reg:CC CC_REG))])]
+  "peep2_regno_dead_p (2, REGNO (operands[0]))"
+  [(parallel [(set:SI (match_dup 2)
+ (memex_commutativ

[Patch] Bfin: prevent stack checking clobbering P2

2011-03-30 Thread Henderson, Stuart
The attached patch prevents P2 getting clobbered by stack check code.


2011-03-30  Stuart Henderson  

From Jie Zhang
* config/bfin/bfin.c (bfin_expand_prologue): Don't clobber P2.


Thanks,
Stu


upstream.patch
Description: upstream.patch


Re: mt-mep using EXTRA_TARGET_HOST_ALL_MODULES?

2011-03-30 Thread Tom Tromey
> "Paolo" == Paolo Bonzini  writes:

I saw this thread due to Joseph's recent post about top-level configury
changes...

Paolo> * May in principle be in use (or be put to some use for libelf):
Paolo> fastjar libelf

I think fastjar is dead.

Paolo> * Probably present only for historical reasons (Cygnus distributions):
Paolo> ash bash bzip2 diff dosutils findutils gawk gettext gzip hello indent
Paolo> libiconv make patch rcs recode sed tar time uudecode wdiff expect
Paolo> guile

libiconv is still used by some people building for some gdb hosts.

Paolo> * Doesn't use Autoconf, so it's already broken:
Paolo> zip perl

We used zip for libjava before fastjar.  I don't recall whether it was
in-tree or not.

Random historical note -- back in the day, I wrote a bunch of Metaconfig
changes to make it so perl could be built in an autoconf tree :-).  This
was for a Cygnus product.  That is why perl appears in here.

Paolo> * What's this?
Paolo> gnuserv prms release send-pr target-examples target-qthreads

prms is the old name of gnats; send-pr is a bug-submitting script for
gnats.  Both dead.

target-qthreads was from a very early port of gcj.  It is a cooperative
threads package based on longjmp.  I think this was dead before libjava
was imported into the public GCC tree.

Tom


Re: [RFC][patch] If-conversion of COMPONENT_REFs

2011-03-30 Thread Ira Rosen
On 30 March 2011 14:41, Richard Guenther  wrote:
> On Wed, Mar 30, 2011 at 2:22 PM, Ira Rosen  wrote:
>> On 30 March 2011 12:59, Richard Guenther  wrote:
>>> On Wed, Mar 30, 2011 at 11:13 AM, Ira Rosen  wrote:
 Hi,

 With this patch a data-ref is marked as unconditionally read or
 written also if its adjacent field is read or written unconditionally
 in the loop.
 My concern is that this is not safe enough, even though the fields
 have to be non-pointers and non-aggregates, and this optimization is
 applied only with -ftree-loop-if-convert-stores flag.

 Bootstrapped on powerpc64-suse-linux and tested on x86_64-suse-linux.

 OK for trunk?
>>>
>>> The restrictions do not make too much sense to me.  For the C++
>>> memory model we can't do speculative stores at all, but for the
>>> rest I'd say just checking if the data-refs access the same object
>>> is enough, thus, instead of same_data_refs (a, b) simply check
>>> operand_equal_p (DR_BASE_ADDRESS (a), DR_BASE_ADDRESS (b), 0)
>>> or operand_equal_p (get_base_address (DR_REF (a)), get_base_address
>>> (DR_REF (b)), 0), whatever makes more sense (I always confuse what
>>> the contents of the various DR fields are).
>>
>> But what about
>>
>> int a[10], b[100], c[100];
>> for (i = 0; i < 100; i++)
>>  {
>>     if (i < 10)
>>       t = a[i];
>>    else
>>       t = b[i];
>>
>>    c[i] = t + a[1];
>>   }
>>
>> We can't transform this to
>>
>> int a[10], b[100], c[100];
>> for (i = 0; i < 100; i++)
>>  {
>>     t1 = a[i];
>>     t2 = b[i];
>>     t = (i < 10) ? t1 : t2;
>>     c[i] = t + a[1];
>>   }
>>
>> since we create accesses to a[10:100], right?
>
> That's correct - we may only ever create "valid" accesses, but any
> valid access to a is ok after we see an unconditional access to a.
>
> So we probably have to restrict ourselves to DECL_P (get_base_address ())
> objects and make sure we don't access past it.
>
> Thus, I see what you were trying to do - may I suggest sth like
>
>  ref_base_a = DR_REF (a);
>  while (TREE_CODE (ref_base_a) == COMPONENT_REF
>           || TREE_CODE (ref_base_a) == IMAGPART_EXPR
>           || TREE_CODE (ref_base_a) == REALPART_EXPR)
>    ref_base_a = TREE_OPERAND (ref_base_a, 0);
>
> ... same for DR_REF (b) ...
>
>  if (operand_equal_p (ref_base_a, ref_base_b, 0))
>    ok
>
> that is, strip all non-variable offset handled_component_refs and compare
> the remaining base of the two accesses.  If they are equal we are ok.
>
> Any hole in that?

I don't see any :) I'll test your version.

Thanks,
Ira

>
> Thanks,
> Richard.
>
>> Thanks,
>> Ira
>>
>>>
>>> Thanks,
>>> Richard.
>>>
 Thanks,
 Ira


 ChangeLog:

         * tree-if-conv.c (memrefs_read_or_written_unconditionally): Return 
 true
         if an adjacent field of the data-ref is accessed unconditionally.

 testsuite/ChangeLog:

         * gcc.dg/vect/if-cvt-stores-vect-ifcvt-18.c: New test.
         * gcc.dg/vect/vect.exp: Run if-cvt-stores-vect* tests with
         -ftree-loop-if-convert-stores.

>>>
>>
>


Negative option alias facility

2011-03-30 Thread Joseph S. Myers
This patch adds a NegativeAlias .opt facility, for one option to be
marked as an alias for the negative form of another option.  There are
quite a lot of options like that, and this is of particular use where
it allows option handler code to be removed because the aliases can
now be handled purely in the .opt file (as opposed to the case where
both positive and negative options are fully described by their Var
settings).

In particular, this patch also converts the legacy rs6000 options
-mvrsave=, -misel=, -mspe= to aliases for the standard forms of the
options (via listing the =yes and =no forms of each option as separate
aliases), so simplifying the rs6000 option handling code.

Bootstrapped with no regressions on x86_64-unknown-linux-gnu, and
tested building cc1 and xgcc for cross to powerpc-eabispe.  Will
commit to trunk in the absence of target maintainer objections.

2011-03-30  Joseph Myers  

* doc/options.texi (NegativeAlias): Document.
(Alias): Mention NegativeAlias.
* opt-functions.awk: Handle NegativeAlias.
* optc-gen.awk: Disallow NegativeAlias with multiple Alias
arguments.
* opts-common.c (decode_cmdline_option): Handle CL_NEGATIVE_ALIAS.
* opts.h (CL_NEGATIVE_ALIAS): Define.
* config/rs6000/rs6000.c (rs6000_parse_yes_no_option): Remove.
(rs6000_handle_option): Don't handle OPT_mvrsave_, OPT_misel_ and
OPT_mspe_.
* config/rs6000/rs6000.opt (mvrsave=, misel=, mspe=): Replace with
Alias entries.
* config/rs6000/t-spe (MULTILIB_OPTIONS, MULTILIB_EXCEPTIONS): Use
mno-spe and mno-isel instead of mspe=no and -misel=no.

Index: gcc/doc/options.texi
===
--- gcc/doc/options.texi(revision 171701)
+++ gcc/doc/options.texi(working copy)
@@ -364,7 +364,8 @@ for later processing.
 @item Alias(@var{opt})
 @itemx Alias(@var{opt}, @var{arg})
 @itemx Alias(@var{opt}, @var{posarg}, @var{negarg})
-The option is an alias for @option{-@var{opt}}.  In the first form,
+The option is an alias for @option{-@var{opt}} (or the negative form
+of that option, depending on @code{NegativeAlias}).  In the first form,
 any argument passed to the alias is considered to be passed to
 @option{-@var{opt}}, and @option{-@var{opt}} is considered to be
 negated if the alias is used in negated form.  In the second form, the
@@ -387,6 +388,13 @@ not need to handle it and no @samp{OPT_}
 for it; only the canonical form of the option will be seen in those
 places.
 
+@item NegativeAlias
+For an option marked with @code{Alias(@var{opt})}, the option is
+considered to be an alias for the positive form of @option{-@var{opt}}
+if negated and for the negative form of @option{-@var{opt}} if not
+negated.  @code{NegativeAlias} may not be used with the forms of
+@code{Alias} taking more than one argument.
+
 @item Ignore
 This option is ignored apart from printing any warning specified using
 @code{Warn}.  The option will not be seen by specs and no @samp{OPT_}
Index: gcc/opts-common.c
===
--- gcc/opts-common.c   (revision 171701)
+++ gcc/opts-common.c   (working copy)
@@ -507,6 +507,7 @@ decode_cmdline_option (const char **argv
{
  gcc_assert (option->alias_arg != NULL);
  gcc_assert (arg == NULL);
+ gcc_assert (!(option->flags & CL_NEGATIVE_ALIAS));
  if (value)
arg = option->alias_arg;
  else
@@ -517,9 +518,13 @@ decode_cmdline_option (const char **argv
{
  gcc_assert (value == 1);
  gcc_assert (arg == NULL);
+ gcc_assert (!(option->flags & CL_NEGATIVE_ALIAS));
  arg = option->alias_arg;
}
 
+ if (option->flags & CL_NEGATIVE_ALIAS)
+   value = !value;
+
  opt_index = new_opt_index;
  option = new_option;
 
Index: gcc/opts.h
===
--- gcc/opts.h  (revision 171701)
+++ gcc/opts.h  (working copy)
@@ -115,6 +115,7 @@ extern const unsigned int cl_lang_count;
 #define CL_MISSING_OK  (1U << 28) /* Missing argument OK (joined).  */
 #define CL_UINTEGER(1U << 29) /* Argument is an integer >=0.  */
 #define CL_UNDOCUMENTED(1U << 30) /* Do not output with 
--help.  */
+#define CL_NEGATIVE_ALIAS  (1U << 31) /* Alias to negative form of option. 
 */
 
 /* Flags for an enumerated option argument.  */
 #define CL_ENUM_CANONICAL  (1 << 0) /* Canonical for this value.  */
Index: gcc/optc-gen.awk
===
--- gcc/optc-gen.awk(revision 171701)
+++ gcc/optc-gen.awk(working copy)
@@ -362,6 +362,11 @@ for (i = 0; i < n_opts; i++) {
print "#error Alias with single argument " \
"allow

Re: fix for 48208 and 48260 on darwin

2011-03-30 Thread Christian Schüler

Ok so I needed to install dejagnu from macports.
The comparison of the test results turned out empty except one output 
that made no sense:

--

Tests that now fail, but worked before:

g++.dg/warn/Wstrict-aliasing-float-ref-int-obj.C  (test for warnings, 
line 7)


Tests that now work, but didn't before:

g++.dg/warn/Wstrict-aliasing-float-ref-int-obj.C  (test for warnings, 
line 7)


---
There was also this:

sed: 1: "/#if[ \t]*!defined(__cpl ...": command c expects \ followed by text
sed: stdout: Broken pipe

Missing header fix:  complex.h

There were fixinclude test FAILURES

---
Otherwise the summary is this:

=== gcc Summary ===

# of expected passes62994
# of unexpected failures49
# of expected failures211
# of unsupported tests1606

=== g++ Summary ===

# of expected passes25521
# of expected failures160
# of unsupported tests365

=== libstdc++ Summary ===

# of expected passes6984
# of expected failures81
# of unsupported tests661

=== libgomp Summary ===

# of expected passes1025
# of unsupported tests2


On 28.03.11 19:58, Mike Stump - mikest...@comcast.net wrote:

On Mar 27, 2011, at 8:55 PM, Christian Schüler wrote:

please review the following patch (and besides, bear with me as this is the
first patch proposal from me). For gcc 4.5 and earlier it was possible to
configure xcode via a modified xcplugin to use the newer gcc directly (yes,
the -arch flag was ignored during link time, but since I had -m32/-m64
anyways, this didn't matter). With the new argument checking of version 4.6
during the linking stage this is no longer possible. The proposed patch makes
gcc on darwin recognize the -arch flag to the extent necessary to reinstate
the old functionality. Besides, it fixes the already existing support for the
-F flag.

Ok (when regression tested).  Did you run the gcc regression testsuite on it?  
To install it, we'd need a testsuite run on it.  Roughy, run the testsuite 
before your patch, save off the .sum file, run it after your patch, then run 
contrib/compare_tests before.sum after.sum and see if there is any output.  You 
might need expect/tcl/dejagnu installed to do that.



If you want to contribute longer term (larger patches), you'll want to get 
started on your paper work, if you've not done that already.  And, welcome.  
Before you say, but...  let me say, what took so long, 10 years is a long 
while?  :-)



Re: [RFC][patch] If-conversion of COMPONENT_REFs

2011-03-30 Thread Richard Guenther
On Wed, Mar 30, 2011 at 2:22 PM, Ira Rosen  wrote:
> On 30 March 2011 12:59, Richard Guenther  wrote:
>> On Wed, Mar 30, 2011 at 11:13 AM, Ira Rosen  wrote:
>>> Hi,
>>>
>>> With this patch a data-ref is marked as unconditionally read or
>>> written also if its adjacent field is read or written unconditionally
>>> in the loop.
>>> My concern is that this is not safe enough, even though the fields
>>> have to be non-pointers and non-aggregates, and this optimization is
>>> applied only with -ftree-loop-if-convert-stores flag.
>>>
>>> Bootstrapped on powerpc64-suse-linux and tested on x86_64-suse-linux.
>>>
>>> OK for trunk?
>>
>> The restrictions do not make too much sense to me.  For the C++
>> memory model we can't do speculative stores at all, but for the
>> rest I'd say just checking if the data-refs access the same object
>> is enough, thus, instead of same_data_refs (a, b) simply check
>> operand_equal_p (DR_BASE_ADDRESS (a), DR_BASE_ADDRESS (b), 0)
>> or operand_equal_p (get_base_address (DR_REF (a)), get_base_address
>> (DR_REF (b)), 0), whatever makes more sense (I always confuse what
>> the contents of the various DR fields are).
>
> But what about
>
> int a[10], b[100], c[100];
> for (i = 0; i < 100; i++)
>  {
>     if (i < 10)
>       t = a[i];
>    else
>       t = b[i];
>
>    c[i] = t + a[1];
>   }
>
> We can't transform this to
>
> int a[10], b[100], c[100];
> for (i = 0; i < 100; i++)
>  {
>     t1 = a[i];
>     t2 = b[i];
>     t = (i < 10) ? t1 : t2;
>     c[i] = t + a[1];
>   }
>
> since we create accesses to a[10:100], right?

That's correct - we may only ever create "valid" accesses, but any
valid access to a is ok after we see an unconditional access to a.

So we probably have to restrict ourselves to DECL_P (get_base_address ())
objects and make sure we don't access past it.

Thus, I see what you were trying to do - may I suggest sth like

  ref_base_a = DR_REF (a);
  while (TREE_CODE (ref_base_a) == COMPONENT_REF
   || TREE_CODE (ref_base_a) == IMAGPART_EXPR
   || TREE_CODE (ref_base_a) == REALPART_EXPR)
ref_base_a = TREE_OPERAND (ref_base_a, 0);

... same for DR_REF (b) ...

  if (operand_equal_p (ref_base_a, ref_base_b, 0))
ok

that is, strip all non-variable offset handled_component_refs and compare
the remaining base of the two accesses.  If they are equal we are ok.

Any hole in that?

Thanks,
Richard.

> Thanks,
> Ira
>
>>
>> Thanks,
>> Richard.
>>
>>> Thanks,
>>> Ira
>>>
>>>
>>> ChangeLog:
>>>
>>>         * tree-if-conv.c (memrefs_read_or_written_unconditionally): Return 
>>> true
>>>         if an adjacent field of the data-ref is accessed unconditionally.
>>>
>>> testsuite/ChangeLog:
>>>
>>>         * gcc.dg/vect/if-cvt-stores-vect-ifcvt-18.c: New test.
>>>         * gcc.dg/vect/vect.exp: Run if-cvt-stores-vect* tests with
>>>         -ftree-loop-if-convert-stores.
>>>
>>
>


Re: [RFC][patch] If-conversion of COMPONENT_REFs

2011-03-30 Thread Ira Rosen
On 30 March 2011 12:59, Richard Guenther  wrote:
> On Wed, Mar 30, 2011 at 11:13 AM, Ira Rosen  wrote:
>> Hi,
>>
>> With this patch a data-ref is marked as unconditionally read or
>> written also if its adjacent field is read or written unconditionally
>> in the loop.
>> My concern is that this is not safe enough, even though the fields
>> have to be non-pointers and non-aggregates, and this optimization is
>> applied only with -ftree-loop-if-convert-stores flag.
>>
>> Bootstrapped on powerpc64-suse-linux and tested on x86_64-suse-linux.
>>
>> OK for trunk?
>
> The restrictions do not make too much sense to me.  For the C++
> memory model we can't do speculative stores at all, but for the
> rest I'd say just checking if the data-refs access the same object
> is enough, thus, instead of same_data_refs (a, b) simply check
> operand_equal_p (DR_BASE_ADDRESS (a), DR_BASE_ADDRESS (b), 0)
> or operand_equal_p (get_base_address (DR_REF (a)), get_base_address
> (DR_REF (b)), 0), whatever makes more sense (I always confuse what
> the contents of the various DR fields are).

But what about

int a[10], b[100], c[100];
for (i = 0; i < 100; i++)
  {
 if (i < 10)
   t = a[i];
else
   t = b[i];

c[i] = t + a[1];
   }

We can't transform this to

int a[10], b[100], c[100];
for (i = 0; i < 100; i++)
  {
 t1 = a[i];
 t2 = b[i];
 t = (i < 10) ? t1 : t2;
 c[i] = t + a[1];
   }

since we create accesses to a[10:100], right?

Thanks,
Ira

>
> Thanks,
> Richard.
>
>> Thanks,
>> Ira
>>
>>
>> ChangeLog:
>>
>>         * tree-if-conv.c (memrefs_read_or_written_unconditionally): Return 
>> true
>>         if an adjacent field of the data-ref is accessed unconditionally.
>>
>> testsuite/ChangeLog:
>>
>>         * gcc.dg/vect/if-cvt-stores-vect-ifcvt-18.c: New test.
>>         * gcc.dg/vect/vect.exp: Run if-cvt-stores-vect* tests with
>>         -ftree-loop-if-convert-stores.
>>
>


Re: [PATCH, ARM] PR47855 Compute attr "length" for some thumb2 insns

2011-03-30 Thread Richard Earnshaw

On Wed, 2011-03-30 at 12:38 +0800, Chung-Lin Tang wrote:
> On 2011/3/30 06:35 AM, Ramana Radhakrishnan wrote:
> > Hi Carrot,
> > 
> > On 26/03/11 15:14, Carrot Wei wrote:
> >> Index: arm.md
> >> ===
> >> --- arm.md(revision 171337)
> >> +++ arm.md(working copy)
> >> @@ -7115,7 +7115,18 @@
> >> "@
> >>  cmp%?\\t%0, %1
> >>  cmn%?\\t%0, #%n1"
> >> -  [(set_attr "conds" "set")]
> >> +  [(set_attr "conds" "set")
> >> +   (set (attr "length")
> >> + (if_then_else
> >> +   (and (and (ne (symbol_ref "TARGET_THUMB2") (const_int 0))
> >> + (eq (symbol_ref "which_alternative") (const_int 0)))
> >> +(ior (ne (symbol_ref "REG_P (operands[1])") (const_int 0))
> >> +(and (ne (symbol_ref "CONST_INT_P (operands[1])") (const_int 0))
> >> + (and (ge (symbol_ref "INTVAL (operands[1])") (const_int 0))
> >> +  (le (symbol_ref "INTVAL (operands[1])")
> >> +  (const_int 255))
> >> +   (const_int 2)
> >> +   (const_int 4)))]
> >>   )
> > 
> > How about adding an alternative only enabled for T2 that uses the
> > `l' constraint and inventing new constraints for some of the constant
> > values that are valid for 16 bit instructions since the `I' and `L'
> > constraints have different meanings depending on whether TARGET_32BIT is
> > valid or not ? We could then set the value of the length attribute
> > accordingly. I don't think we can change the meaning of the I and L
> > constraints very easily given the amount of churn that might be needed
> > 
> > 
> >>
> >>   (define_insn "*cmpsi_shiftsi"
> >> @@ -7286,7 +7297,14 @@
> >> return \"b%d1\\t%l0\";
> >> "
> >> [(set_attr "conds" "use")
> >> -   (set_attr "type" "branch")]
> >> +   (set_attr "type" "branch")
> >> +   (set (attr "length")
> >> +(if_then_else
> >> +   (and (ne (symbol_ref "TARGET_THUMB2") (const_int 0))
> >> +(and (ge (minus (match_dup 0) (pc)) (const_int -250))
> >> + (le (minus (match_dup 0) (pc)) (const_int 256
> >> +   (const_int 2)
> >> +   (const_int 4)))]
> >>   )
> > 
> > I don't think this and the other conditional branch instruction
> > changes are correct. This could end up being the last instruction in an
> > IT block and will automatically end up getting the 32 bit encoding and
> > end up causing trouble with the length calculations. Remember the 16 bit
> > encoding for the conditional instruction can't be used as the last
> > instruction in an IT block.
> > 
> > 
> >> @@ -10256,7 +10288,26 @@
> >>
> >>   return \"\";
> >> }"
> >> -  [(set_attr "type" "store4")]
> >> +  [(set_attr "type" "store4")
> >> +   (set (attr "length")
> >> +(if_then_else
> >> +   (and (ne (symbol_ref "TARGET_THUMB2") (const_int 0))
> >> +(ne (symbol_ref "{
> >> +int i, regno, hi_reg;
> >> +int num_saves = XVECLEN (operands[2], 0);
> >> +regno = REGNO (operands[1]);
> >> +hi_reg = (REGNO_REG_CLASS (regno) == HI_REGS)
> >> +&&  (regno != LR_REGNUM);
> >> +for (i = 1; i<  num_saves; i++)
> > 
> > 
> > (i < num_saves && (hi_reg == 0)) - what's the point of going through
> > the register list when hi_reg != 0 in this case ? A comment to explain
> > that the length calculation *must* be kept in sync with the template
> > above might be useful.
> 
> I suggest abstracting all of this Thumb-2 insn length knowledge into a
> arm.c function, say "thumb2_16bit_insn_p()" to test for 2-byte patterns.
>  Adding more of these quite large pieces of logic into arm.md does not
> look very pretty to me, and centralizing it into a C function will also
> make it easier to manage the cases.
> 
> Chung-Lin
> 

I think Ramana's suggestion should avoid the need for any arithmetic at
all here.  If the constraints precisely match a 16-bit instruction (and
that alternative is disbled for ARM) then we can say unconditionally
that the length of the insn is 2.

R.




Re: RFA: patch to solve IRA PR48336, PR48342, PR48345

2011-03-30 Thread Hans-Peter Nilsson
On Tue, 29 Mar 2011, Hans-Peter Nilsson wrote:
> FWIW, I have a five regressions for cris-elf too appearing at
> that RA change, but as they're wrong-code and noticed only at
> execution, I'm going to wait analyzing further until this is
> committed and caught by my autotester.  Just a heads-up.

Lest you lose sleep over this, it was fixed at r171716. ;)

brgds, H-P


Re: [patch] Fix PR48183, NEON ICE in emit-rtl.c:immed_double_const() under -g

2011-03-30 Thread Julian Brown
On Tue, 29 Mar 2011 11:45:58 +0100
Richard Sandiford  wrote:

> Julian Brown  writes:
> > On Thu, 24 Mar 2011 10:57:06 +
> > Richard Sandiford  wrote:
> >
> >> Chung-Lin Tang  writes:
> >> > PR48183 is a case where ARM NEON instrinsics, under -O -g,
> >> > produce debug insns that tries to expand OImode (32-byte
> >> > integer) zero constants, much too large to represent as two
> >> > HOST_WIDE_INTs; as the internals manual indicates, such large
> >> > constants are not supported in general, and ICEs on the
> >> > GET_MODE_BITSIZE(mode) == 2*HOST_BITS_PER_WIDE_INT assertion.
> >> >
> >> > This patch allows the cases where the large integer constant is
> >> > still representable using a single CONST_INT, such as zero(0).
> >> > Bootstrapped and tested on i686 and x86_64, cross-tested on ARM,
> >> > all without regressions. Okay for trunk?
> >> >
> >> > Thanks,
> >> > Chung-Lin
> >> >
> >> > 2011-03-20  Chung-Lin Tang  
> >> >
> >> >  * emit-rtl.c (immed_double_const): Allow wider than
> >> >  2*HOST_BITS_PER_WIDE_INT mode constants when they are
> >> >  representable as a single const_int RTX.
> >> 
> >> I realise this might be seen as a good expedient fix, but it makes
> >> me a bit uneasy.  Not a very constructive rationale, sorry.
> >
> > FWIW I also had a "fix" for this issue, which is equivalent to
> > Chung-Lin's patch apart from only allowing constant-zero (attached).
> > That's not really a vote from me for this approach, but maybe
> > limiting the extent to which we pretend to support wide-integer
> > constants like this is sensible, if we do go that way.
> 
> So that each suggestion has a patch, here's mine.  I've snipped the
> regenerated file from the testcase, but to get a flavour, the changes
> are like this:

The O'Caml changes look fine, FWIW (I can't officially approve them, of
course) -- and I certainly don't have a problem with avoiding my or
Chung-Lin's hacks, though of course it'd be nice to get rid of the
wide-integer constant "problem" properly at some point.

Cheers,

Julian


[4.4] Fix memory clobber in c_lex_with_flags

2011-03-30 Thread Andreas Krebbel
Hi,

with GCC 4.4 compiling a testcase like "%:%:" on s390x triggers the
stack smashing protector.

Uli fixed this for GCC 4.5 with:
http://gcc.gnu.org/ml/gcc-patches/2009-08/msg00755.html

Ok to apply that patch on 4.4 branch?

Bye,

-Andreas-


Re: [RFC][patch] If-conversion of COMPONENT_REFs

2011-03-30 Thread Richard Guenther
On Wed, Mar 30, 2011 at 11:13 AM, Ira Rosen  wrote:
> Hi,
>
> With this patch a data-ref is marked as unconditionally read or
> written also if its adjacent field is read or written unconditionally
> in the loop.
> My concern is that this is not safe enough, even though the fields
> have to be non-pointers and non-aggregates, and this optimization is
> applied only with -ftree-loop-if-convert-stores flag.
>
> Bootstrapped on powerpc64-suse-linux and tested on x86_64-suse-linux.
>
> OK for trunk?

The restrictions do not make too much sense to me.  For the C++
memory model we can't do speculative stores at all, but for the
rest I'd say just checking if the data-refs access the same object
is enough, thus, instead of same_data_refs (a, b) simply check
operand_equal_p (DR_BASE_ADDRESS (a), DR_BASE_ADDRESS (b), 0)
or operand_equal_p (get_base_address (DR_REF (a)), get_base_address
(DR_REF (b)), 0), whatever makes more sense (I always confuse what
the contents of the various DR fields are).

Thanks,
Richard.

> Thanks,
> Ira
>
>
> ChangeLog:
>
>         * tree-if-conv.c (memrefs_read_or_written_unconditionally): Return 
> true
>         if an adjacent field of the data-ref is accessed unconditionally.
>
> testsuite/ChangeLog:
>
>         * gcc.dg/vect/if-cvt-stores-vect-ifcvt-18.c: New test.
>         * gcc.dg/vect/vect.exp: Run if-cvt-stores-vect* tests with
>         -ftree-loop-if-convert-stores.
>


PATCH: PR target/48349: FLOAT_SSE_REGS typo in i386.h

2011-03-30 Thread H.J. Lu
Hi,

I checked this pre-approved patch into trunk/4.6/4.5/4.4.


H.J.
---
Index: ChangeLog
===
--- ChangeLog   (revision 171717)
+++ ChangeLog   (working copy)
@@ -1,3 +1,9 @@
+2011-03-30  H.J. Lu  
+
+   PR target/48349
+   * config/i386/i386.h (REG_CLASS_CONTENTS): Fix a typo in
+   FLOAT_SSE_REGS.
+
 2011-03-30  Joseph Myers  
Rainer Orth  
 
Index: config/i386/i386.h
===
--- config/i386/i386.h  (revision 171717)
+++ config/i386/i386.h  (working copy)
@@ -1272,7 +1272,7 @@ enum reg_class
 { 0xe000,0x1f },   /* MMX_REGS */  \
 { 0x1fe00100,0x1fe000 },   /* FP_TOP_SSE_REG */\
 { 0x1fe00200,0x1fe000 },   /* FP_SECOND_SSE_REG */ \
-{ 0x1fe0ff00,0x3fe000 },   /* FLOAT_SSE_REGS */\
+{ 0x1fe0ff00,0x1fe000 },   /* FLOAT_SSE_REGS */\
{ 0x1,  0x1fe0 },   /* FLOAT_INT_REGS */\
 { 0x1fe100ff,0x1fffe0 },   /* INT_SSE_REGS */  \
 { 0x1fe1,0x1fffe0 },   /* FLOAT_INT_SSE_REGS */\


Re: RFA: patch to solve IRA PR48336, PR48342, PR48345

2011-03-30 Thread Richard Sandiford
Hi Vlad,

First, I want to echo H-P's thanks for tackling this area.  I just wondered:

Vladimir Makarov  writes:
> The following patch is to solve PR48336, PR48342, PR48345.  The 
> profitable hard regs exclude hard regs which are prohibited for the 
> corresponding allocno mode. It is true for primary allocation and it is 
> important for better colorability criteria.  Function assign_hard_reg is 
> also based on this assumption.  Unfortunately, it is not true for 
> secondary allocation (after IRA IR flattening or during reload).  The 
> following patch solves this problem.
>
> The patch should be very safe but I am still testing it on x86/x86-64 
> bootstrap.
[...]
> Index: ira-color.c
> ===
> --- ira-color.c   (revision 171699)
> +++ ira-color.c   (working copy)
> @@ -1447,7 +1447,9 @@ update_conflict_hard_regno_costs (int *c
>  }
>  
>  /* Set up conflicting and profitable regs (through CONFLICT_REGS and
> -   PROFITABLE_REGS) for each object of allocno A.  */
> +   PROFITABLE_REGS) for each object of allocno A.  Remember that the
> +   profitable regs exclude hard regs which can not hold value of mode
> +   of allocno A.  */
>  static inline void
>  setup_conflict_profitable_regs (ira_allocno_t a, bool retry_p,
>   HARD_REG_SET *conflict_regs,
> @@ -1463,8 +1465,13 @@ setup_conflict_profitable_regs (ira_allo
>COPY_HARD_REG_SET (conflict_regs[i],
>OBJECT_TOTAL_CONFLICT_HARD_REGS (obj));
>if (retry_p)
> - COPY_HARD_REG_SET (profitable_regs[i],
> -reg_class_contents[ALLOCNO_CLASS (a)]);
> + {
> +   COPY_HARD_REG_SET (profitable_regs[i],
> +  reg_class_contents[ALLOCNO_CLASS (a)]);
> +   AND_COMPL_HARD_REG_SET (profitable_regs[i],
> +   ira_prohibited_class_mode_regs
> +   [ALLOCNO_CLASS (a)][ALLOCNO_MODE (a)]);
> + }
>else
>   COPY_HARD_REG_SET (profitable_regs[i],
>  OBJECT_COLOR_DATA (obj)->profitable_hard_regs);

ira_prohibited_class_mode_regs is partly based on HARD_REGNO_MODE_OK,
which is really a property of the first register in a multi-register
group (rather than of every register in that group).  So is
ira_prohibited_class_mode_regs defined in the same way?
At the moment, check_hard_reg_p and setup_allocno_available_regs_num
test profitability for every register in the group:

/* Checking only profitable hard regs.  */
if (TEST_HARD_REG_BIT (conflict_regs[k], hard_regno + j)
|| ! TEST_HARD_REG_BIT (profitable_regs[k], hard_regno + j))
  break;
[...]
  for (j = 0; j < nregs; j++)
{
[...]
  /* Checking only profitable hard regs.  */
  if (TEST_HARD_REG_BIT (OBJECT_TOTAL_CONFLICT_HARD_REGS (obj),
 hard_regno + j)
  || ! TEST_HARD_REG_BIT (obj_data->profitable_hard_regs,
  hard_regno + j))
break;

So if you have a target in which double-word values have to start in
even registers, I think every odd-numbered bit of profitable_hard_regs
will be clear, and no register will seem profitable.  (I'm seeing this
on ARM with some VFP tests.)

Restricting the test to the first register fixes things for me,
but do we need to check something else for the j != 0 case?

Richard


gcc/
* ira-color.c (check_hard_reg_p): Restrict the profitability check
to the first register.
(setup_allocno_available_regs_num): Likewise.

Index: gcc/ira-color.c
===
--- gcc/ira-color.c (revision 171653)
+++ gcc/ira-color.c (working copy)
@@ -1497,7 +1504,8 @@ check_hard_reg_p (ira_allocno_t a, int h
   for (k = set_to_test_start; k < set_to_test_end; k++)
/* Checking only profitable hard regs.  */
if (TEST_HARD_REG_BIT (conflict_regs[k], hard_regno + j)
-   || ! TEST_HARD_REG_BIT (profitable_regs[k], hard_regno + j))
+   || (j == 0
+   && ! TEST_HARD_REG_BIT (profitable_regs[k], hard_regno)))
  break;
   if (k != set_to_test_end)
break;
@@ -2226,8 +2234,9 @@ setup_allocno_available_regs_num (ira_al
  /* Checking only profitable hard regs.  */
  if (TEST_HARD_REG_BIT (OBJECT_TOTAL_CONFLICT_HARD_REGS (obj),
 hard_regno + j)
- || ! TEST_HARD_REG_BIT (obj_data->profitable_hard_regs,
- hard_regno + j))
+ || (j == 0
+ && ! TEST_HARD_REG_BIT (obj_data->profitable_hard_regs,
+ hard_regno)))
break;
}
  if (k != set_to_test_end)


Re: [cxx-mem-model] bitfield tests

2011-03-30 Thread Richard Guenther
On Tue, Mar 29, 2011 at 7:00 PM, Aldy Hernandez  wrote:
> [Language lawyers, please correct me if I have mis-interpreted the upcoming
> standard in any way.]
>
> In the C++ memory model, contiguous bitfields comprise a single memory
> location, so it's fair game to bit twiddle them when setting them.  For
> example:
>
>        struct {
>                unsigned int a : 4;
>                unsigned int b : 4;
>                unsigned int c : 4;
>        };
>
> In the above example, you can touch  and  while setting .  No race
> there.
>
> However, non contiguous bitfields are a different story:
>
>        struct {
>                unsigned int a : 4;
>                char b;
>                unsigned int c : 6;
>        };
>
> Here we have 3 distinct memory locations, so you can't touch  or 
> while setting .  No bit twiddling allowed.
>
> Similarly for bitfields separated by a zero-length bitfield:
>
>        struct {
>                unsigned int a : 4;
>                int : 0;
>                unsigned int c : 6;
>        };
>
> In the above example,  and  are distinct memory locations.
>
> Also, a structure/union boundary will also separate previously contiguous
> bit sequences:
>
>        struct {
>                unsigned int a : 4;
>                struct { unsigned int b : 4 } BBB;
>                unsigned int c : 4;
>        };
>
> Here we have 3 distinct memory locations, so again, we can't clobber  or
>  while setting .
>
> The patch below adds a non-contiguous bit test (bitfields-2.C) which passes
> on x86, but upon assembly inspection, fails on PPC64, s390, and Alpha.
>  These 3 architectures bit-twiddle their way out of the problem.

The memory model is not implementable on strict-alignment targets
that do not have a byte store operation.  But we previously said that ;)

Also consider global vars

char a;
char b;

accessing them on strict-align targets may access adjacent globals
(that's a problem anyway, also with alias analysis).

Richard.

> There is also a similar test already in the testsuite (bitfields.C) which is
> similar except one field is a volatile.  This test fails on x86-64 as well
> and is the subject of PR48128 which Jakub is currently tackling.
>
> As soon as Jakub finishes with PR48128, I will be working on getting these
> bitfield tests working in a C++ memory model fashion.
>
> Committing to branch.
>


Re: [PATCH] Another EQ/NE folder type fix (PR c/48305)

2011-03-30 Thread Richard Guenther
On Tue, Mar 29, 2011 at 5:53 PM, Jakub Jelinek  wrote:
> Hi!
>
> I've left these 4 out because I thought they should be fine given
> that the other operands are equal.  But the testcase shows I was wrong.
>
> Fixed thusly, bootstrapped/regtested on x86_64-linux and i686-linux, ok
> for trunk?

Ok.

Thanks,
Richard.

> 2011-03-29  Jakub Jelinek  
>
>        PR c/48305
>        * fold-const.c (fold_binary_loc) : Make sure
>        arg10/arg11 in (X ^ Y) == (Z ^ W) are always fold converted to
>        matching arg00/arg01 types.
>
>        * gcc.c-torture/compile/pr48305.c: New test.
>
> --- gcc/fold-const.c.jj 2011-03-25 07:56:32.0 +0100
> +++ gcc/fold-const.c    2011-03-28 12:45:58.0 +0200
> @@ -12625,13 +12625,21 @@ fold_binary_loc (location_t loc,
>             operand_equal_p guarantees no side-effects so we don't need
>             to use omit_one_operand on Z.  */
>          if (operand_equal_p (arg01, arg11, 0))
> -           return fold_build2_loc (loc, code, type, arg00, arg10);
> +           return fold_build2_loc (loc, code, type, arg00,
> +                                   fold_convert_loc (loc, TREE_TYPE (arg00),
> +                                                     arg10));
>          if (operand_equal_p (arg01, arg10, 0))
> -           return fold_build2_loc (loc, code, type, arg00, arg11);
> +           return fold_build2_loc (loc, code, type, arg00,
> +                                   fold_convert_loc (loc, TREE_TYPE (arg00),
> +                                                     arg11));
>          if (operand_equal_p (arg00, arg11, 0))
> -           return fold_build2_loc (loc, code, type, arg01, arg10);
> +           return fold_build2_loc (loc, code, type, arg01,
> +                                   fold_convert_loc (loc, TREE_TYPE (arg01),
> +                                                     arg10));
>          if (operand_equal_p (arg00, arg10, 0))
> -           return fold_build2_loc (loc, code, type, arg01, arg11);
> +           return fold_build2_loc (loc, code, type, arg01,
> +                                   fold_convert_loc (loc, TREE_TYPE (arg01),
> +                                                     arg11));
>
>          /* Optimize (X ^ C1) op (Y ^ C2) as (X ^ (C1 ^ C2)) op Y.  */
>          if (TREE_CODE (arg01) == INTEGER_CST
> --- gcc/testsuite/gcc.c-torture/compile/pr48305.c.jj    2011-03-28 
> 12:48:02.0 +0200
> +++ gcc/testsuite/gcc.c-torture/compile/pr48305.c       2011-03-28 
> 12:47:49.0 +0200
> @@ -0,0 +1,7 @@
> +/* PR c/48305 */
> +
> +int
> +foo (int x)
> +{
> +  return (x ^ 1) == (x ^ 1U);
> +}
>
>        Jakub
>


Re: [patch, ARM] Fix PR48250, adjust DImode reload address legitimizing

2011-03-30 Thread Richard Earnshaw

On Wed, 2011-03-30 at 15:35 +0800, Chung-Lin Tang wrote:
> On 2011/3/30 上午 12:23, Richard Earnshaw wrote:
> > 
> > On Tue, 2011-03-29 at 22:53 +0800, Chung-Lin Tang wrote:
> >> On 2011/3/29 下午 10:26, Richard Earnshaw wrote:
> >>> On Tue, 2011-03-29 at 18:25 +0800, Chung-Lin Tang wrote:
>  On 2011/3/24 06:51 PM, Richard Earnshaw wrote:
> >
> > On Thu, 2011-03-24 at 12:56 +0900, Chung-Lin Tang wrote:
> >> Hi,
> >> PR48250 happens under TARGET_NEON, where DImode is included within the
> >> valid NEON modes. This turns the range of legitimate constant indexes 
> >> to
> >> step-4 (coproc load/store), thus arm_legitimize_reload_address() when
> >> trying to decompose the [reg+index] reload address into
> >> [(reg+index_high)+index_low], can cause an ICE later when 'index_low'
> >> part is not aligned to 4.
> >>
> >> I'm not sure why the current DImode index is computed as:
> >> low = ((val & 0xf) ^ 0x8) - 0x8;  the sign-extending into negative
> >> values, then subtracting back, actually creates further off indexes.
> >> e.g. in the supplied testcase, [sp+13] was turned into [(sp+16)-3].
> >>
> >
> > Hysterical Raisins... the code there was clearly written for the days
> > before we had LDRD in the architecture.  At that time the most efficient
> > way to load a 64-bit object was to use the LDM{ia,ib,da,db}
> > instructions.  The computation here was (I think), intended to try and
> > make the most efficient use of an add/sub instruction followed by
> > LDM/STM offsetting.  At that time the architecture had no unaligned
> > access either, so dealing with 64-bit that were less than 32-bit aligned
> > (in those days 32-bit was the maximum alignment) probably wasn't
> > considered, or couldn't even get through to reload.
> >
> 
>  I see it now. The code in output_move_double() returning assembly for
>  ldm/stm(db/da/ib) for offsets -8/-4/+4 seems to confirm this.
> 
>  I have changed the patch to let the new code handle the TARGET_LDRD case
>  only.  The pre-LDRD case is still handled by the original code, with an
>  additional & ~0x3 for aligning the offset to 4.
> 
>  I've also added a comment for the pre-TARGET_LDRD case. Please see if
>  the description is accurate enough.
> 
> >> My patch changes the index decomposing to a more straightforward way; 
> >> it
> >> also sort of outlines the way the other reload address indexes are
> >> broken by using and-masks, is not the most effective.  The address is
> >> computed by addition, subtracting away the parts to obtain low+high
> >> should be the optimal way of giving the largest computable index range.
> >>
> >> I have included a few Thumb-2 bits in the patch; I know currently
> >> arm_legitimize_reload_address() is only used under TARGET_ARM, but I
> >> guess it might eventually be turned into TARGET_32BIT.
> >>
> >
> > I think this needs to be looked at carefully on ARMv4/ARMv4T to check
> > that it doesn't cause regressions there when we don't have LDRD in the
> > instruction set.
> 
>  I'll be testing the modified patch under an ARMv4/ARMv4T configuration.
>  Okay for trunk if no regressions?
> 
>  Thanks,
>  Chung-Lin
> 
>   PR target/48250
>   * config/arm/arm.c (arm_legitimize_reload_address): Adjust
>   DImode constant index decomposing under TARGET_LDRD. Clear
>   lower two bits for NEON, Thumb-2, and !TARGET_LDRD. Add
>   comment for !TARGET_LDRD case.
> >>>
> >>> This looks technically correct, but I can't help feeling that trying to
> >>> deal with the bottom two bits being non-zero is not really worthwhile.
> >>> This hook is an optimization that's intended to generate better code
> >>> than the default mechanisms that reload provides.  It is allowed to
> >>> fail, but it must say so if it does (by returning false).
> >>>
> >>> What this hook is trying to do for DImode is to take an address of the
> >>> form (reg + TOO_BIG_CONST) and turn it into two instructions that are
> >>> legitimate:
> >>>
> >>>   tmp = (REG + LEGAL_BIG_CONST)
> >>>   some_use_of (mem (tmp + SMALL_LEGAL_CONST))
> >>>
> >>> The idea behind the optimization is that LEGAL_BIG_CONST will need fewer
> >>> instructions to generate than TOO_BIG_CONST.  It's unlikely (probably
> >>> impossible in ARM state) that pushing the bottom two bits of the address
> >>> into LEGAL_BIG_CONST part of the offset is going to lead to a better
> >>> code sequence.  
> >>>
> >>> So I think it would be more sensible to just return false if we have a
> >>> DImode address with the bottom 2 bits non-zero and then let the normal
> >>> reload recovery mechanism take over.
> >>
> >> It is supposed to provide better utilization of the full range of
> >> LEGAL_BIG_CONST+SMALL_LEGAL_CONST.  I am not sure, but I guess reload
> >> will resolve it with th

[RFC][patch] If-conversion of COMPONENT_REFs

2011-03-30 Thread Ira Rosen
Hi,

With this patch a data-ref is marked as unconditionally read or
written also if its adjacent field is read or written unconditionally
in the loop.
My concern is that this is not safe enough, even though the fields
have to be non-pointers and non-aggregates, and this optimization is
applied only with -ftree-loop-if-convert-stores flag.

Bootstrapped on powerpc64-suse-linux and tested on x86_64-suse-linux.

OK for trunk?

Thanks,
Ira


ChangeLog:

 * tree-if-conv.c (memrefs_read_or_written_unconditionally): Return true
 if an adjacent field of the data-ref is accessed unconditionally.

testsuite/ChangeLog:

 * gcc.dg/vect/if-cvt-stores-vect-ifcvt-18.c: New test.
 * gcc.dg/vect/vect.exp: Run if-cvt-stores-vect* tests with
 -ftree-loop-if-convert-stores.
Index: testsuite/gcc.dg/vect/if-cvt-stores-vect-ifcvt-18.c
===
--- testsuite/gcc.dg/vect/if-cvt-stores-vect-ifcvt-18.c (revision 0)
+++ testsuite/gcc.dg/vect/if-cvt-stores-vect-ifcvt-18.c (revision 0)
@@ -0,0 +1,69 @@
+/* { dg-require-effective-target vect_int } */
+
+#include 
+#include "tree-vect.h"
+
+#define N 50
+
+typedef struct {
+  short a;
+  short b;
+} data;
+
+data in1[N], in2[N], out[N];
+short result[N*2] = 
{10,-7,11,-6,12,-5,13,-4,14,-3,15,-2,16,-1,17,0,18,1,19,2,20,3,21,4,22,5,23,6,24,7,25,8,26,9,27,10,28,11,29,12,30,13,31,14,32,15,33,16,34,17,35,18,36,19,37,20,38,21,39,22,40,23,41,24,42,25,43,26,44,27,45,28,46,29,47,30,48,31,49,32,50,33,51,34,52,35,53,36,54,37,55,38,56,39,57,40,58,41,59,42};
+short out1[N], out2[N];
+
+__attribute__ ((noinline)) void
+foo ()
+{
+  int i;
+  short c, d;
+
+  for (i = 0; i < N; i++)
+{
+  c = in1[i].b;
+  d = in2[i].b;
+
+  if (c >= d)
+{
+  out[i].b = in1[i].a;
+  out[i].a = d + 5;
+}
+  else
+{
+  out[i].b = d - 12;
+  out[i].a = in2[i].a + d;
+}
+}
+}
+
+int
+main (void)
+{
+  int i;
+
+  check_vect ();
+
+  for (i = 0; i < N; i++)
+{
+  in1[i].a = i;
+  in1[i].b = i + 2;
+  in2[i].a = 5;
+  in2[i].b = i + 5;
+  __asm__ volatile ("");
+}
+
+  foo ();
+
+  for (i = 0; i < N; i++)
+{
+  if (out[i].a != result[2*i] || out[i].b != result[2*i+1])
+abort ();
+}
+
+  return 0;
+}
+
+/* { dg-final { scan-tree-dump-times "vectorized 1 loops" 1 "vect"  { xfail { 
vect_no_align || {! vect_strided } } } } } */
+/* { dg-final { cleanup-tree-dump "vect" } } */
Index: testsuite/gcc.dg/vect/vect.exp
===
--- testsuite/gcc.dg/vect/vect.exp  (revision 171716)
+++ testsuite/gcc.dg/vect/vect.exp  (working copy)
@@ -210,6 +210,12 @@ lappend DEFAULT_VECTCFLAGS "--param" "gg
 dg-runtest [lsort [glob -nocomplain $srcdir/$subdir/ggc-*.\[cS\]]]  \
 "" $DEFAULT_VECTCFLAGS
 
+# -ftree-loop-if-convert-stores
+set DEFAULT_VECTCFLAGS $SAVED_DEFAULT_VECTCFLAGS
+lappend DEFAULT_VECTCFLAGS "-ftree-loop-if-convert-stores"
+dg-runtest [lsort [glob -nocomplain 
$srcdir/$subdir/if-cvt-stores-vect-*.\[cS\]]]  \
+"" $DEFAULT_VECTCFLAGS
+
 # With -O3.
 # Don't allow IPA cloning, because it throws our counts out of whack.
 set DEFAULT_VECTCFLAGS $SAVED_DEFAULT_VECTCFLAGS
Index: tree-if-conv.c
===
--- tree-if-conv.c  (revision 171716)
+++ tree-if-conv.c  (working copy)
@@ -461,11 +461,11 @@ struct ifc_dr {
 #define DR_WRITTEN_AT_LEAST_ONCE(DR) (IFC_DR (DR)->written_at_least_once)
 #define DR_RW_UNCONDITIONALLY(DR) (IFC_DR (DR)->rw_unconditionally)
 
-/* Returns true when the memory references of STMT are read or written
-   unconditionally.  In other words, this function returns true when
-   for every data reference A in STMT there exist other accesses to
-   the same data reference with predicates that add up (OR-up) to the
-   true predicate: this ensures that the data reference A is touched
+/* Returns true when the memory references of STMT (or their adjacent fields)
+   are read or writtenunconditionally.  In other words, this function
+   returns true when for every data reference A in STMT there exist other
+   accesses to the same data reference with predicates that add up (OR-up) to
+   the true predicate: this ensures that the data reference A is touched
(read or written) on every iteration of the if-converted loop.  */
 
 static bool
@@ -479,7 +479,7 @@ memrefs_read_or_written_unconditionally 
   for (i = 0; VEC_iterate (data_reference_p, drs, i, a); i++)
 if (DR_STMT (a) == stmt)
   {
-   bool found = false;
+   bool found = false, same_comp_ref = false;
int x = DR_RW_UNCONDITIONALLY (a);
 
if (x == 0)
@@ -489,22 +489,45 @@ memrefs_read_or_written_unconditionally 
  continue;
 
for (j = 0; VEC_iterate (data_reference_p, drs, j, b); j++)
- if (DR_STMT (b) != stmt
- && sam

Re: Some remodelling of the ARM vld and vst patterns

2011-03-30 Thread Richard Sandiford
Richard Sandiford  writes:
>   The ??? is saying that the V8QI-derived MEM is really a 3-byte access,
>   not a 4-byte (SI) access, and so on.  The comment makes the mode sound
>   like a representational niceity, but really, there's no such thing as
>   a "conservatively wrong" memory size here.  If a store's mode is too
>   small, dependent loads could be deleted as dead.  If it's too big,
>   unrelated live loads could be deleted as dead.

In case it isn't obvious, I meant "unrelated live stores".

Richard


Re: Cleaning up expand optabs code

2011-03-30 Thread Richard Sandiford
Richard Henderson  writes:
> On 03/29/2011 06:21 AM, Richard Sandiford wrote:
>> -  enum machine_mode mode0 = insn_data[(int) icode].operand[1].mode;
>> -  enum machine_mode mode1 = insn_data[(int) icode].operand[2].mode;
>> -  enum machine_mode tmp_mode;
>> +  enum machine_mode xmode0 = insn_data[(int) icode].operand[1].mode;
>> +  enum machine_mode xmode1 = insn_data[(int) icode].operand[2].mode;
>> +  enum machine_mode mode0, mode1, tmp_mode;
> ...
>> -  if (GET_MODE (xop0) != mode0 && mode0 != VOIDmode)
>> -xop0 = convert_modes (mode0,
>> -  GET_MODE (xop0) != VOIDmode
>> -  ? GET_MODE (xop0)
>> -  : mode,
>> -  xop0, unsignedp);
>> -
>> -  if (GET_MODE (xop1) != mode1 && mode1 != VOIDmode)
>> -xop1 = convert_modes (mode1,
>> -  GET_MODE (xop1) != VOIDmode
>> -  ? GET_MODE (xop1)
>> -  : mode,
>> -  xop1, unsignedp);
>> +  mode0 = GET_MODE (xop0) != VOIDmode ? GET_MODE (xop0) : mode;
>> +  if (xmode0 != VOIDmode && xmode0 != mode0)
>> +{
>> +  xop0 = convert_modes (xmode0, mode0, xop0, unsignedp);
>> +  mode0 = xmode0;
>> +}
>> +
>> +  mode1 = GET_MODE (xop1) != VOIDmode ? GET_MODE (xop1) : mode;
>> +  if (xmode1 != VOIDmode && xmode1 != mode1)
>> +{
>> +  xop1 = convert_modes (xmode1, mode1, xop1, unsignedp);
>> +  mode1 = xmode1;
>> +}
>
> This smells like a target bug, but I can't quite put my finger
> on exactly what's wrong, and I haven't debugged the original.
>
> The backend has said xmode[01] is what it expects.  The only
> reason you wouldn't write xmode[01] in the create_input_operand
> line is if xmode[01] are VOIDmode.

Right.  Sorry, I should have said, but the failure was actually from:

case EXPAND_INPUT:
input:
  gcc_assert (mode != VOIDmode);

expand_binop_direct passed the match_operand's mode to create_input_operand,
which was really the opposite of the intention.  The caller should be
passing the mode of the rtx, which it should always "know".

> That seems like mistake number one, particularly for a binop,
> but I'll accept that for the nonce.  Which pattern triggered
> the problem in the target?

It was ashrdi3:

(define_expand "ashrdi3"
  [(set (match_operand:DI 0 "register_operand" "")
(ashiftrt:DI (match_operand:DI 1 "register_operand" "")
 (match_operand 2 "const_int_operand" "")))]
  "!TARGET_COLDFIRE"

Which seems reasonable in this context, since the shift count isn't
really DImode.

> Then we've got the conditionalization in the init of mode[01],
> which is presumably to handle CONST_INT as an input.
>
> What about something like
>
>   xmode0 = insn_data...
>   if (xmode0 == VOIDmode)
> xmode0 = mode;
>
>   mode0 = GET_MODE (xop0);
>   if (mode0 == VOIDmode)
> mode0 = mode;
>
>   if (xmode0 != mode0)
> convert_modes
>
>   create_input_operand(..., xmode0)
>
> ?  That seems more obvious than what you have.  And I *think*
> it should produce the same results.  If it doesn't, then this
> whole block of code needs a lot more explanation.

The problem is that a VOIDmode match_operand doesn't necessary imply
that "mode" is the right mode.  VOIDmode register-accepting operands
are only a warning, not an error, and are sometimes needed for
flag-specific variations.  They've traditionally not forced a conversion
to "mode".  E.g. if you have something like this on a 32-bit target:

  unsigned long long foo (unsigned long long x, int y)
  {
return x >>= y;
  }

then op1 will be (reg:SI y).  Having a VOIDmode match_operand shouldn't
force that to be converted to (reg:DI y), whereas I think the sequence
above would.

Or to put it another way: as things stand, "mode" is only trustworthy
for CONST_INT opNs.  A VOIDmode match_operand doesn't imply that the
opN argument to expand_binop_directly is a CONST_INT, or even that
only CONST_INTs are acceptable to the target.

This comes back to the point that we really should know up-front what
modes op0 and op1 actually have.  (The thing I left as a future clean-up.)
Until then, the process implemented by yesterday's patch was supposed to be:

   - work out what mode opN actually has at this point in time
   - see if the target has specifically asked for a different mode
   - if so, convert the operand

This is directly equivalent to what create_convert_operand_from does:

case EXPAND_CONVERT_FROM:
  if (GET_MODE (op->value) != VOIDmode)
mode = GET_MODE (op->value);
  else
/* The caller must tell us what mode this value has.  */
gcc_assert (mode != VOIDmode);

  imode = insn_data[(int) icode].operand[opno].mode;
  if (imode != VOIDmode && imode != mode)
{
  op->value = convert_modes (imode, mode, op->value, op->unsigned_p);
  mode = imode;
}

But we have to break the flow there (rathe

Re: [patch, ARM] Fix PR48250, adjust DImode reload address legitimizing

2011-03-30 Thread Chung-Lin Tang
On 2011/3/30 上午 12:23, Richard Earnshaw wrote:
> 
> On Tue, 2011-03-29 at 22:53 +0800, Chung-Lin Tang wrote:
>> On 2011/3/29 下午 10:26, Richard Earnshaw wrote:
>>> On Tue, 2011-03-29 at 18:25 +0800, Chung-Lin Tang wrote:
 On 2011/3/24 06:51 PM, Richard Earnshaw wrote:
>
> On Thu, 2011-03-24 at 12:56 +0900, Chung-Lin Tang wrote:
>> Hi,
>> PR48250 happens under TARGET_NEON, where DImode is included within the
>> valid NEON modes. This turns the range of legitimate constant indexes to
>> step-4 (coproc load/store), thus arm_legitimize_reload_address() when
>> trying to decompose the [reg+index] reload address into
>> [(reg+index_high)+index_low], can cause an ICE later when 'index_low'
>> part is not aligned to 4.
>>
>> I'm not sure why the current DImode index is computed as:
>> low = ((val & 0xf) ^ 0x8) - 0x8;  the sign-extending into negative
>> values, then subtracting back, actually creates further off indexes.
>> e.g. in the supplied testcase, [sp+13] was turned into [(sp+16)-3].
>>
>
> Hysterical Raisins... the code there was clearly written for the days
> before we had LDRD in the architecture.  At that time the most efficient
> way to load a 64-bit object was to use the LDM{ia,ib,da,db}
> instructions.  The computation here was (I think), intended to try and
> make the most efficient use of an add/sub instruction followed by
> LDM/STM offsetting.  At that time the architecture had no unaligned
> access either, so dealing with 64-bit that were less than 32-bit aligned
> (in those days 32-bit was the maximum alignment) probably wasn't
> considered, or couldn't even get through to reload.
>

 I see it now. The code in output_move_double() returning assembly for
 ldm/stm(db/da/ib) for offsets -8/-4/+4 seems to confirm this.

 I have changed the patch to let the new code handle the TARGET_LDRD case
 only.  The pre-LDRD case is still handled by the original code, with an
 additional & ~0x3 for aligning the offset to 4.

 I've also added a comment for the pre-TARGET_LDRD case. Please see if
 the description is accurate enough.

>> My patch changes the index decomposing to a more straightforward way; it
>> also sort of outlines the way the other reload address indexes are
>> broken by using and-masks, is not the most effective.  The address is
>> computed by addition, subtracting away the parts to obtain low+high
>> should be the optimal way of giving the largest computable index range.
>>
>> I have included a few Thumb-2 bits in the patch; I know currently
>> arm_legitimize_reload_address() is only used under TARGET_ARM, but I
>> guess it might eventually be turned into TARGET_32BIT.
>>
>
> I think this needs to be looked at carefully on ARMv4/ARMv4T to check
> that it doesn't cause regressions there when we don't have LDRD in the
> instruction set.

 I'll be testing the modified patch under an ARMv4/ARMv4T configuration.
 Okay for trunk if no regressions?

 Thanks,
 Chung-Lin

PR target/48250
* config/arm/arm.c (arm_legitimize_reload_address): Adjust
DImode constant index decomposing under TARGET_LDRD. Clear
lower two bits for NEON, Thumb-2, and !TARGET_LDRD. Add
comment for !TARGET_LDRD case.
>>>
>>> This looks technically correct, but I can't help feeling that trying to
>>> deal with the bottom two bits being non-zero is not really worthwhile.
>>> This hook is an optimization that's intended to generate better code
>>> than the default mechanisms that reload provides.  It is allowed to
>>> fail, but it must say so if it does (by returning false).
>>>
>>> What this hook is trying to do for DImode is to take an address of the
>>> form (reg + TOO_BIG_CONST) and turn it into two instructions that are
>>> legitimate:
>>>
>>> tmp = (REG + LEGAL_BIG_CONST)
>>> some_use_of (mem (tmp + SMALL_LEGAL_CONST))
>>>
>>> The idea behind the optimization is that LEGAL_BIG_CONST will need fewer
>>> instructions to generate than TOO_BIG_CONST.  It's unlikely (probably
>>> impossible in ARM state) that pushing the bottom two bits of the address
>>> into LEGAL_BIG_CONST part of the offset is going to lead to a better
>>> code sequence.  
>>>
>>> So I think it would be more sensible to just return false if we have a
>>> DImode address with the bottom 2 bits non-zero and then let the normal
>>> reload recovery mechanism take over.
>>
>> It is supposed to provide better utilization of the full range of
>> LEGAL_BIG_CONST+SMALL_LEGAL_CONST.  I am not sure, but I guess reload
>> will resolve it with the reg+LEGAL_BIG_CONST part only, using only (mem
>> (reg)) for the load/store (correct me if I'm wrong).
>>
>> Also, the new code slighty improves the reloading, for example currently
>> [reg+64] is broken into [(reg+72)-8], creating an additional unneeded