[arm-embedded] Request to backport thumb1 far jump patch to embedded 4.8 branch

2013-08-05 Thread Terry Guo
Hello Joey,

The thumb1 far jump patch is about an optimization to avoid unnecessary lr
save instruction. It is now in trunk. Is it OK to back port it to embedded
4.8 branch?

BR,
Terry

gcc/ChangeLog.arm

 2013-08-05  Terry Guo  
 
Backport from mainline r197956
2013-04-15  Joey Ye  

* config/arm/arm.c (thumb1_final_prescan_insn): Assert lr save
for real far jump.
(thumb_far_jump_used_p): Count instruction size and set
far_jump_used.

gcc/testsuite/ChangeLog.arm

2013-08-05  Terry Guo  

Backport from mainline r197956
2013-04-15  Joey Ye  

* gcc.target/arm/thumb1-far-jump-1.c: New test.
* gcc.target/arm/thumb1-far-jump-2.c: New test.Index: gcc/ChangeLog.arm
===
--- gcc/ChangeLog.arm   (revision 201517)
+++ gcc/ChangeLog.arm   (working copy)
@@ -1,5 +1,15 @@
 2013-08-05  Terry Guo  
 
+   Backport from mainline r197956
+   2013-04-15  Joey Ye  
+
+   * config/arm/arm.c (thumb1_final_prescan_insn): Assert lr save
+   for real far jump.
+   (thumb_far_jump_used_p): Count instruction size and set
+   far_jump_used.
+
+2013-08-05  Terry Guo  
+
Backport from mainline r197153
2013-03-27  Terry Guo  
 
Index: gcc/testsuite/gcc.target/arm/thumb1-far-jump-1.c
===
--- gcc/testsuite/gcc.target/arm/thumb1-far-jump-1.c(revision 0)
+++ gcc/testsuite/gcc.target/arm/thumb1-far-jump-1.c(working copy)
@@ -0,0 +1,34 @@
+/* Check for thumb1 far jump. Shouldn't save lr for small leaf functions
+ * even with a branch in it.  */
+/* { dg-options "-Os" } */
+/* { dg-skip-if "" { ! { arm_thumb1 } } } */
+
+void f()
+{
+  for (;;);
+}
+
+volatile int g;
+void f2(int i)
+{
+  if (i) g=0;
+}
+
+void f3(int i)
+{
+  if (i) {
+g=0;
+g=1;
+g=2;
+g=3;
+g=4;
+g=5;
+g=6;
+g=7;
+g=8;
+g=9;
+  }
+}
+
+/* { dg-final { scan-assembler-not "push.*lr" } } */
+
Index: gcc/testsuite/gcc.target/arm/thumb1-far-jump-2.c
===
--- gcc/testsuite/gcc.target/arm/thumb1-far-jump-2.c(revision 0)
+++ gcc/testsuite/gcc.target/arm/thumb1-far-jump-2.c(working copy)
@@ -0,0 +1,57 @@
+/* Check for thumb1 far jump. This is the extreme case that far jump
+ * will be used with minimum number of instructions. By passing this case
+ * it means the heuristic of saving lr for far jump meets the most extreme
+ * requirement.  */
+/* { dg-options "-Os" } */
+/* { dg-skip-if "" { ! { arm_thumb1 } } } */
+
+volatile register r4 asm("r4");
+void f3(int i)
+{
+#define GO(n) \
+  extern volatile int g_##n; \
+  r4=(int)&g_##n;
+
+#define GO8(n) \
+  GO(n##_0) \
+  GO(n##_1) \
+  GO(n##_2) \
+  GO(n##_3) \
+  GO(n##_4) \
+  GO(n##_5) \
+  GO(n##_6) \
+  GO(n##_7)
+
+#define GO64(n) \
+  GO8(n##_0) \
+  GO8(n##_1) \
+  GO8(n##_2) \
+  GO8(n##_3) \
+  GO8(n##_4) \
+  GO8(n##_5) \
+  GO8(n##_6) \
+  GO8(n##_7) \
+
+#define GO498(n) \
+  GO64(n##_0) \
+  GO64(n##_1) \
+  GO64(n##_2) \
+  GO64(n##_3) \
+  GO64(n##_4) \
+  GO64(n##_5) \
+  GO64(n##_6) \
+  GO8(n##_0) \
+  GO8(n##_1) \
+  GO8(n##_2) \
+  GO8(n##_3) \
+  GO8(n##_4) \
+  GO8(n##_5) \
+  GO(n##_0) \
+  GO(n##_1) \
+
+  if (i) {
+GO498(0);
+  }
+}
+
+/* { dg-final { scan-assembler "push.*lr" } } */
Index: gcc/testsuite/ChangeLog.arm
===
--- gcc/testsuite/ChangeLog.arm (revision 0)
+++ gcc/testsuite/ChangeLog.arm (working copy)
@@ -0,0 +1,7 @@
+2013-08-05  Terry Guo  
+
+   Backport from mainline r197956
+   2013-04-15  Joey Ye  
+
+   * gcc.target/arm/thumb1-far-jump-1.c: New test.
+   * gcc.target/arm/thumb1-far-jump-2.c: New test.
Index: gcc/config/arm/arm.c
===
--- gcc/config/arm/arm.c(revision 201517)
+++ gcc/config/arm/arm.c(working copy)
@@ -22577,6 +22577,11 @@
   else if (conds != CONDS_NOCOND)
cfun->machine->thumb1_cc_insn = NULL_RTX;
 }
+
+/* Check if unexpected far jump is used.  */
+if (cfun->machine->lr_save_eliminated
+&& get_attr_far_jump (insn) == FAR_JUMP_YES)
+  internal_error("Unexpected thumb1 far jump");
 }
 
 int
@@ -22602,6 +22607,8 @@
 thumb_far_jump_used_p (void)
 {
   rtx insn;
+  bool far_jump = false;
+  unsigned int func_size = 0;
 
   /* This test is only important for leaf functions.  */
   /* assert (!leaf_function_p ()); */
@@ -22657,6 +22664,26 @@
  && get_attr_far_jump (insn) == FAR_JUMP_YES
  )
{
+ far_jump = true;
+   }
+  func_size += get_attr_length (insn);
+}
+
+  /* Attribute far_jump will always be true for thumb1 before
+ shorten_branch pass.  So checking far_jump attribute before
+ shorten_branch isn't much useful.
+
+ Following heuristic tries to estimate more accr

[PATCH] Convert more passes to new dump framework

2013-08-05 Thread Teresa Johnson
This patch ports messages to the new dump framework, specifically those
involving missing/mismatched/corrupted profile data, indirect call
promotions performed, and inlines. For the inline messages, I ported
Dehao's patch from the google branches that enables printing call chain
information.

I also fixed some issues with the dump_loc support for dumping out
the new messages with location data, specifically to emit column numbers
and to ensure the notes start on a new line.

Finally, I made a couple enhancements to the new dump infrastructure
to enable messages emitted from other parts of the compiler. There
were two issues:
1) The coverage_init routine happens very early, before entering the pass
manager that sets up the dumping. This isn't an issue in trunk yet,
since there currently aren't any messages emitted during coverage_init
(this showed up on the google branch where we emit messages for
LIPO-specific profiles during this part of the compile), but it could be
in the future. This was addressed by enabling dumping within coverage_init,
and is modeled on how dumping is enabled after the pass manager
during finish_optimization_passes.
2) Dumping was only enabled for passes marked with an optinfo_flag
that isn't OPTGROUP_NONE. Currently optgroup flags are only setup for
optimization groups in the categories IPA, LOOP, INLINE and VEC.
What this means is that any dump messages added to a pass that isn't
in one of these groups will silently be suppressed. The OPTGROUP
setting is specified in opt_pass struct, and is also automatically set
to OPTGROUP_IPA for any pass starting with "ipa-". What I did was
to add a new optgroup macro, OPTGROUP_OTHER. This is enabled only
under -fopt-info-optall, which is also the default for -fopt-info.
That way dump messages can be emitted without the requirement that
the pass be part of a group that can be emitted under a specific
optimization group subset. When setting up the dumps, any pass that
has OPTGROUP_NONE after examining the opt_pass struct and the pass
name will use OPTGROUP_OTHER. This doesn't mean that the list of
optgroups shouldn't be expanded, but rather adds a catch-all for
passes that don't currently have or need to be emitted on their own
as part of a new optgroup.

Bootstrapped and tested on x86-64-unknown-linux-gnu.

Ok for trunk?

Thanks,
Teresa


2013-08-06  Teresa Johnson  
Dehao Chen  

* dumpfile.c (dump_loc): Add column number to output, make newlines
consistent.
* dumpfile.h (OPTGROUP_OTHER): Add and enable under OPTGROUP_ALL.
* ipa-inline-transform.c (clone_inlined_nodes):
(cgraph_node_opt_info): New function.
(cgraph_node_call_chain): Ditto.
(dump_inline_decision): Ditto.
(inline_call): Invoke dump_inline_decision.
* doc/invoke.texi: Document optall -fopt-info flag.
* profile.c (read_profile_edge_counts): Use new dump framework.
(compute_branch_probabilities): Ditto.
* passes.c (pass_manager::register_one_dump_file): Use OPTGROUP_OTHER
when pass not in any opt group.
* value-prof.c (check_counter): Use new dump framework.
(find_func_by_funcdef_no): Ditto.
(check_ic_target): Ditto.
* coverage.c (get_coverage_counts): Ditto.
(coverage_init): Setup new dump framework.
* ipa-inline.c (inline_small_functions): Set is_in_ipa_inline.
* ipa-inline.h (is_in_ipa_inline): Declare.

* testsuite/gcc.dg/pr40209.c: Use -fopt-info.
* testsuite/gcc.dg/pr26570.c: Ditto.
* testsuite/gcc.dg/pr32773.c: Ditto.
* testsuite/g++.dg/tree-ssa/dom-invalid.C (struct C): Ditto.

Index: dumpfile.c
===
--- dumpfile.c  (revision 201461)
+++ dumpfile.c  (working copy)
@@ -257,16 +257,18 @@ dump_open_alternate_stream (struct dump_file_info
 void
 dump_loc (int dump_kind, FILE *dfile, source_location loc)
 {
-  /* Currently vectorization passes print location information.  */
   if (dump_kind)
 {
+  /* Ensure dump message starts on a new line.  */
+  fprintf (dfile, "\n");
   if (LOCATION_LOCUS (loc) > BUILTINS_LOCATION)
-fprintf (dfile, "\n%s:%d: note: ", LOCATION_FILE (loc),
- LOCATION_LINE (loc));
+fprintf (dfile, "%s:%d:%d: note: ", LOCATION_FILE (loc),
+ LOCATION_LINE (loc), LOCATION_COLUMN (loc));
   else if (current_function_decl)
-fprintf (dfile, "\n%s:%d: note: ",
+fprintf (dfile, "%s:%d:%d: note: ",
  DECL_SOURCE_FILE (current_function_decl),
- DECL_SOURCE_LINE (current_function_decl));
+ DECL_SOURCE_LINE (current_function_decl),
+ DECL_SOURCE_COLUMN (current_function_decl));
 }
 }

Index: dumpfile.h
===
--- dumpfile.h  (revision 201461)
+++ dumpfile.h  (working copy)
@@ -97,8 +97,9 @@ enum tree

RE: [arm-embedded] Request to back port Cortex-R7 option support patch

2013-08-05 Thread Joey Ye
OK to embedded 4.8 branch.

> -Original Message-
> From: Terry Guo
> Sent: Tuesday, August 06, 2013 11:59
> To: Joey Ye
> Cc: gcc-patches@gcc.gnu.org
> Subject: [arm-embedded] Request to back port Cortex-R7 option support
> patch
> 
> Hi Joey,
> 
> Attached patch is a backport to support cortex-r7 in gcc command line.
> Tested and it works.
> 
> Is it OK to commit?
> 
> BR,
> Terry
> 
> 2013-08-05  Terry Guo  
> 
>   Backport from mainline r197153
>   2013-03-27  Terry Guo  
> 
>   * config/arm/arm-cores.def: Added core cortex-r7.
>   * config/arm/arm-tune.md: Regenerated.
>   * config/arm/arm-tables.opt: Regenerated.
>   * doc/invoke.texi: Added entry for core cortex-r7.






[arm-embedded] Request to back port Cortex-R7 option support patch

2013-08-05 Thread Terry Guo
Hi Joey,

Attached patch is a backport to support cortex-r7 in gcc command line.
Tested and it works.

Is it OK to commit?

BR,
Terry

2013-08-05  Terry Guo  

Backport from mainline r197153
2013-03-27  Terry Guo  

* config/arm/arm-cores.def: Added core cortex-r7.
* config/arm/arm-tune.md: Regenerated.
* config/arm/arm-tables.opt: Regenerated.
* doc/invoke.texi: Added entry for core cortex-r7.
Index: gcc/config/arm/arm-tables.opt
===
--- gcc/config/arm/arm-tables.opt   (revision 201479)
+++ gcc/config/arm/arm-tables.opt   (working copy)
@@ -259,6 +259,9 @@
 Enum(processor_type) String(cortex-r5) Value(cortexr5)
 
 EnumValue
+Enum(processor_type) String(cortex-r7) Value(cortexr7)
+
+EnumValue
 Enum(processor_type) String(cortex-m4) Value(cortexm4)
 
 EnumValue
Index: gcc/config/arm/arm-cores.def
===
--- gcc/config/arm/arm-cores.def(revision 201479)
+++ gcc/config/arm/arm-cores.def(working copy)
@@ -132,6 +132,7 @@
 ARM_CORE("cortex-r4",cortexr4, 7R,  
FL_LDSCHED, cortex)
 ARM_CORE("cortex-r4f",   cortexr4f,7R,  
FL_LDSCHED, cortex)
 ARM_CORE("cortex-r5",cortexr5, 7R,  
FL_LDSCHED | FL_ARM_DIV, cortex)
+ARM_CORE("cortex-r7",cortexr7, 7R,  
FL_LDSCHED | FL_ARM_DIV, cortex)
 ARM_CORE("cortex-m4",cortexm4, 7EM, 
FL_LDSCHED, cortex)
 ARM_CORE("cortex-m3",cortexm3, 7M,  
FL_LDSCHED, cortex)
 ARM_CORE("cortex-m1",cortexm1, 6M,  
FL_LDSCHED, v6m)
Index: gcc/config/arm/arm-tune.md
===
--- gcc/config/arm/arm-tune.md  (revision 201479)
+++ gcc/config/arm/arm-tune.md  (working copy)
@@ -1,5 +1,5 @@
 ;; -*- buffer-read-only: t -*-
 ;; Generated automatically by gentune.sh from arm-cores.def
 (define_attr "tune"
-   
"arm2,arm250,arm3,arm6,arm60,arm600,arm610,arm620,arm7,arm7d,arm7di,arm70,arm700,arm700i,arm710,arm720,arm710c,arm7100,arm7500,arm7500fe,arm7m,arm7dm,arm7dmi,arm8,arm810,strongarm,strongarm110,strongarm1100,strongarm1110,fa526,fa626,arm7tdmi,arm7tdmis,arm710t,arm720t,arm740t,arm9,arm9tdmi,arm920,arm920t,arm922t,arm940t,ep9312,arm10tdmi,arm1020t,arm9e,arm946es,arm966es,arm968es,arm10e,arm1020e,arm1022e,xscale,iwmmxt,iwmmxt2,fa606te,fa626te,fmp626,fa726te,arm926ejs,arm1026ejs,arm1136js,arm1136jfs,arm1176jzs,arm1176jzfs,mpcorenovfp,mpcore,arm1156t2s,arm1156t2fs,genericv7a,cortexa5,cortexa7,cortexa8,cortexa9,cortexa15,cortexr4,cortexr4f,cortexr5,cortexm4,cortexm3,cortexm1,cortexm0,cortexm0plus,marvell_pj4"
+   
"arm2,arm250,arm3,arm6,arm60,arm600,arm610,arm620,arm7,arm7d,arm7di,arm70,arm700,arm700i,arm710,arm720,arm710c,arm7100,arm7500,arm7500fe,arm7m,arm7dm,arm7dmi,arm8,arm810,strongarm,strongarm110,strongarm1100,strongarm1110,fa526,fa626,arm7tdmi,arm7tdmis,arm710t,arm720t,arm740t,arm9,arm9tdmi,arm920,arm920t,arm922t,arm940t,ep9312,arm10tdmi,arm1020t,arm9e,arm946es,arm966es,arm968es,arm10e,arm1020e,arm1022e,xscale,iwmmxt,iwmmxt2,fa606te,fa626te,fmp626,fa726te,arm926ejs,arm1026ejs,arm1136js,arm1136jfs,arm1176jzs,arm1176jzfs,mpcorenovfp,mpcore,arm1156t2s,arm1156t2fs,genericv7a,cortexa5,cortexa7,cortexa8,cortexa9,cortexa15,cortexr4,cortexr4f,cortexr5,cortexr7,cortexm4,cortexm3,cortexm1,cortexm0,cortexm0plus,marvell_pj4"
(const (symbol_ref "((enum attr_tune) arm_tune)")))
Index: gcc/ChangeLog.arm
===
--- gcc/ChangeLog.arm   (revision 201479)
+++ gcc/ChangeLog.arm   (working copy)
@@ -1,3 +1,13 @@
+2013-08-05  Terry Guo  
+
+   Backport from mainline r197153
+   2013-03-27  Terry Guo  
+
+   * config/arm/arm-cores.def: Added core cortex-r7.
+   * config/arm/arm-tune.md: Regenerated.
+   * config/arm/arm-tables.opt: Regenerated.
+   * doc/invoke.texi: Added entry for core cortex-r7.
+
 2013-07-24  Terry Guo  
 
* configure.ac (with_multilib_list): Export its value.
Index: gcc/doc/invoke.texi
===
--- gcc/doc/invoke.texi (revision 201479)
+++ gcc/doc/invoke.texi (working copy)
@@ -11264,7 +11264,7 @@
 @samp{arm1156t2-s}, @samp{arm1156t2f-s}, @samp{arm1176jz-s}, 
@samp{arm1176jzf-s},
 @samp{cortex-a5}, @samp{cortex-a7}, @samp{cortex-a8}, @samp{cortex-a9}, 
 @samp{cortex-a15}, @samp{cortex-r4}, @samp{cortex-r4f}, @samp{cortex-r5},
-@samp{cortex-m4}, @samp{cortex-m3},
+@samp{cortex-r7}, @samp{cortex-m4}, @samp{cortex-m3},
 @samp{cortex-m1},
 @samp{cortex-m0},
 @samp{cortex-m0plus},

Re: [PATCH] Sanitize block partitioning under -freorder-blocks-and-partition

2013-08-05 Thread Teresa Johnson
On Mon, Aug 5, 2013 at 7:57 AM, Teresa Johnson  wrote:
> On Mon, Aug 5, 2013 at 7:11 AM, Jan Hubicka  wrote:
>> The patch looks OK to me in general (I can not approve it).
>> Still have one question...
>>> +
>>> +/* Ensure that no cold bbs dominate hot bbs along the dominance or
>>> +   post-dominance DIR, for example as a result of edge weight insanities.
>>> +   Returns the updated value of COLD_BB_COUNT and adds newly-hot bbs
>>> +   to BBS_IN_HOT_PARTITION.  */
>>> +
>>> +static unsigned int
>>> +sanitize_dominator_hotness (enum cdi_direction dir, unsigned int 
>>> cold_bb_count,
>>> +vec *bbs_in_hot_partition)
>>> +{
>>> +  /* Callers check this.  */
>>> +  gcc_checking_assert (cold_bb_count);
>>> +
>>> +  bool dom_calculated_here = !dom_info_available_p (dir);
>>> +
>>> +  if (dom_calculated_here)
>>> +calculate_dominance_info (dir);
>>> +
>>> +  /* Keep examining hot bbs while we still have some left to check
>>> + and there are remaining cold bbs.  */
>>> +  vec hot_bbs_to_check = bbs_in_hot_partition->copy ();
>>> +  while (! hot_bbs_to_check.is_empty ()
>>> + && cold_bb_count)
>>> +{
>>> +  basic_block bb = hot_bbs_to_check.pop ();
>>> +  basic_block dom_bb = get_immediate_dominator (dir, bb);
>>> +
>>> +  /* If bb's immediate dominator is also hot (or unpartitioned,
>>> + e.g. the entry block) then it is ok. If it is cold, it
>>> + needs to be adjusted.  */
>>> +  if (BB_PARTITION (dom_bb) != BB_COLD_PARTITION)
>>> +continue;
>>
>> What will hapepn on
>>
>> if (t)
>>   something
>> else
>>   something else
>> for (i=0;i<100;i++)
>>   something else2
>>
>> I would expect if/something and something else to be cold by profile 
>> feedback.
>> Your dominator code will bring the if into hot partition but both paths
>> of conditional will be cold, so the number of crossings will actually grow.
>
> You are right, this case will not be handled well.
>
>>
>> If we want to have at least some path to hot blocks in the hot region, I 
>> suspect
>> we could walk back from hot regions to entry and keep those in hot regions 
>> rather
>> than relying on the dominator tree...
>> But I am sure such things can be dealt with incrementally.
>
> I am going to fix this and will resend the patch. Rather than look at
> the immediate dominator of each hot block, we need to ensure that at
> least one pred bb is hot. In your example, if that was a 50-50 branch,
> then IMO both preds should be marked hot.

New patch below that walks the preds of each hot bb instead of the dominators.

Bootstrapped and tested on x86-64-unknown-linux-gnu. Also ensured that
a profiledbootstrap passed with -freorder-blocks-and-partition enabled
still works.

Ok for trunk?

Thanks,
Teresa


2013-08-05  Teresa Johnson  
Steven Bosscher  

* cfgrtl.c (fixup_new_cold_bb): New routine.
(commit_edge_insertions): Invoke fixup_partitions.
(find_partition_fixes): New routine.
(fixup_partitions): Ditto.
(verify_hot_cold_block_grouping): Update comments.
(rtl_verify_edges): Invoke find_partition_fixes.
(rtl_verify_bb_pointers): Update comments.
(rtl_verify_bb_layout): Ditto.
* basic-block.h (fixup_partitions): Declare.
* cfgcleanup.c (try_optimize_cfg): Invoke fixup_partitions.
* bb-reorder.c (sanitize_hot_paths): New function.
(find_rarely_executed_basic_blocks_and_crossing_edges): Invoke
sanitize_hot_paths.

Index: cfgrtl.c
===
--- cfgrtl.c(revision 201461)
+++ cfgrtl.c(working copy)
@@ -1341,6 +1341,43 @@ fixup_partition_crossing (edge e)
 }
 }

+/* Called when block BB has been reassigned to the cold partition,
+   because it is now dominated by another cold block,
+   to ensure that the region crossing attributes are updated.  */
+
+static void
+fixup_new_cold_bb (basic_block bb)
+{
+  edge e;
+  edge_iterator ei;
+
+  /* This is called when a hot bb is found to now be dominated
+ by a cold bb and therefore needs to become cold. Therefore,
+ its preds will no longer be region crossing. Any non-dominating
+ preds that were previously hot would also have become cold
+ in the caller for the same region. Any preds that were previously
+ region-crossing will be adjusted in fixup_partition_crossing.  */
+  FOR_EACH_EDGE (e, ei, bb->preds)
+{
+  fixup_partition_crossing (e);
+}
+
+  /* Possibly need to make bb's successor edges region crossing,
+ or remove stale region crossing.  */
+  FOR_EACH_EDGE (e, ei, bb->succs)
+{
+  /* We can't have fall-through edges across partition boundaries.
+ Note that force_nonfallthru will do any necessary partition
+ boundary fixup by calling fixup_partition_crossing itself.  */
+  if ((e->flags & EDGE_FALLTHRU)
+  && BB_PARTITION (bb) != BB_PARTITION (e->dest)
+  &

Re: [C++ Patch / RFC] PR 46206

2013-08-05 Thread Jason Merrill

On 08/05/2013 06:46 PM, Paolo Carlini wrote:

and after this comment, both pairs of qualify_lookup are called in that
order. Thus I started seriously suspecting that something may be wrong
in the if-else above, that is, that we really want something with
iter->type *before* iter->value there too: the attached patchlet p works
for the testcase and passes bootstrap & test. Does it make sense to you?


Yes.


Final observation: in many cases, like for example, variants of the
testcase with one less data member, what happens is that iter->type and
iter->value are *both* the same variant of the TYPE_DECL Bar, the one
which is fine, has DECL_IMPLICIT_TYPEDEF_P set.


That's strange.  I would expect that to mean that we don't properly give 
an error for a Bar data member declared after the typedef.


Jason



Re: [PATCH/Merge Request] Vtable Verification feature.

2013-08-05 Thread David Malcolm
On Mon, 2013-08-05 at 15:24 -0700, Diego Novillo wrote:
> This looks almost ready to commit.  Some comments below:

[...]

> > +/* Definition of this optimization pass.  */
> > +
> > +struct gimple_opt_pass pass_vtable_verify =
> > +{
> > + {
> > +  GIMPLE_PASS,
> > +  "vtable-verify",  /* name */
> > +  OPTGROUP_NONE,/* optinfo_flags */
> > +  gate_tree_vtable_verify,  /* gate */
> > +  vtable_verify_main,   /* execute */
> > +  NULL, /* sub */
> > +  NULL, /* next */
> > +  0,/* static_pass_number */
> > +  TV_VTABLE_VERIFICATION,   /* tv_id */
> > +  PROP_cfg | PROP_ssa,  /* properties_required */
> > +  0,/* properties_provided */
> > +  0,/* properties_destroyed */
> > +  0,/* todo_flags_start */
> > +  TODO_update_ssa   /* todo_flags_finish */
> > + }
> > +};
> 
> So, David M (CC'd) has just pulled the rug from under you a little
(Looks like I wasn't CCed, but I happened to see this given we were
chatting on IRC about it; have added myself to CC now)

> bit.  The pass manager now has a different API.  The changes are not
> too drastic, but you'll need to rework the registration of the pass.
> (http://gcc.gnu.org/ml/gcc-patches/2013-08/msg00231.html).

I'm sorry for the inconvenience; a lot changed in trunk with these
changes.

A description of the API change can be seen in this mail:
http://gcc.gnu.org/ml/gcc-patches/2013-07/msg01259.html
and perhaps by reading:
https://github.com/davidmalcolm/gcc-refactoring-scripts/blob/master/test_refactor_passes.py

> David wrote a refactoring tool to use, but this being a single pass it
> should be easy to convert it manually.  See the other passes, and I'm
> sure David will be only too happy to help you if you run into trouble.

The script is "refactor_passes.py" within:
https://github.com/davidmalcolm/gcc-refactoring-scripts

I had a go at running the script on your branch; currently it fails
with:
  File "refactor_passes.py", line 372, in make_replacement2
assert extra != d[extra]
AssertionError
which is an overly terse way of reporting that a callback in an IPA pass
has the same identifier name as the name of the corresponding field,
which leads to the code breaking under the new API: specifically, I
believe the branch is missing the commit that's r201020 on trunk:

2013-07-18  David Malcolm  

* ipa-pure-const.c (generate_summary): Rename to...
(pure_const_generate_summary): ... this.

(see http://gcc.gnu.org/ml/gcc-patches/2013-07/msg00687.html for more
details on this one).

On running it on just vtable-verify.c thusly:
  $ python refactor_passes.py ../src/gcc/vtable-verify.c 
it generated the attached patch, which looks correct to me (I haven't
attempted to remerge the branches to see if it compiles, but it looks
like it should); I kept the generated ChangeLog in the patchfile.

You'll also need to change the extern decl in tree-pass.h:
  extern struct gimple_opt_pass pass_vtable_verify;
to that of the factory function:
  extern gimple_opt_pass *make_pass_vtable_verify (gcc::context *ctxt);
(the refactoring script will also do this, but currently in the branch
there are ~200 other passes that also get touched, so the specific
change would be lost in the noise).

Note that the vtable_verify pass appears to be single-instanced within
the pass tree.  For future reference, if someone wants to run the script
on a *multi-instanced* pass, they'll need to supply a clone method (the
refactor script only adds them for those that it "knows" are
multi-instanced, using a hardcoded list).  Writing a dummy clone method
is trivial, though.

> > Index: gcc/passes.def
> > ===
> > --- gcc/passes.def (revision 201377)
> > +++ gcc/passes.def (working copy)
> > @@ -292,6 +292,7 @@ along with GCC; see the file COPYING3.
> >NEXT_PASS (pass_tm_memopt);
> >NEXT_PASS (pass_tm_edges);
> >POP_INSERT_PASSES ()
> > +  NEXT_PASS (pass_vtable_verify);
> 
> This will also need changes after David's changes.

If I reading this right, I don't think so - I think the above ought to
just work; r201505 updated the meaning of NEXT_PASS (actually for
pass-instances.def) so that it will use make_PASS_NAME i.e. this becomes
a call to "make_pass_vtable_verify" (wrapped in other stuff).

Hope this is helpful; sorry again for the inconvenience.
Dave
diff --git a/gcc/ChangeLog b/gcc/ChangeLog
index f357e85..f1d8aed 100644
--- a/gcc/ChangeLog
+++ b/gcc/ChangeLog
@@ -1,3 +1,14 @@
+2013-08-05  David Malcolm  
+
+	Patch autogenerated by refactor_passes.py from
+	https://github.com/davidmalcolm/gcc-refactoring-scripts
+	revision 03fe39476a4c4ea450b49e087cfa817b5f92021e
+
+	*

Re: [PATCH] Fix illegal cast to rtx (*insn_gen_fn) (rtx, ...)

2013-08-05 Thread Richard Henderson
On 08/05/2013 12:32 PM, Oleg Endo wrote:
> Thanks, committed as rev 201513.
> 4.8 also has the same problem.  The patch applies on 4.8 branch without
> problems and make all-gcc works.
> OK for 4.8, too?

Hum.  I suppose so, since it's relatively self-contained.  I suppose the
out-of-tree openrisc port will thank us...


r~


[C++ Patch / RFC] PR 46206

2013-08-05 Thread Paolo Carlini

Hi,

I have been investigating this very old and very weird issue where we 
wrongly reject:


class Foo
{
int u, v, w, x;
typedef struct Bar { } Bar;
virtual void foo(void) {
struct Bar bar;
}
};

46206.C: In member function ‘virtual void Foo::foo()’:
46206.C:6:12: error: using typedef-name ‘Foo::Bar’ after ‘struct’
46206.C:4:26: note: ‘Foo::Bar’ has a previous declaration here

whereas we don't reject variants with one less data member, say x, or 
non-virtual foo, all sorts of variants corresponding to a smaller Foo!!


I figured out that when we parse "typedef struct Bar { } Bar;" we create 
two TYPE_DECL: first, one marked as DECL_IMPLICIT_TYPEDEF_P in pushtag_1 
(via create_implicit_typedef); then a second, non-implicit, one in 
grokdeclarator, via build_lang_decl (TYPE_DECL... ). When we do lookup 
for "struct Bar bar", it happens that the *second* one is found, thus 
the check in check_elaborated_type_specifier triggers.


The latter function is called by lookup_and_check_tag, with the decl 
returned by lookup_name_prefer_type (name, 2). The latter, in turn, ends 
up calling lookup_name_real_1 which has:


/* If this is the kind of thing we're looking for, we're done. */
if (qualify_lookup (iter->value, flags))
binding = iter->value;
else if ((flags & LOOKUP_PREFER_TYPES)
&& qualify_lookup (iter->type, flags))
binding = iter->type;
else
binding = NULL_TREE;

and it happens that the first qualify_lookup succeeds but with 
iter->value which is the variant of the Bar TYPE_DECL with 
DECL_IMPLICIT_TYPEDEF_P not set. On the other hand, iter->type is the Ok 
variant, that which would not trigger the error.


Then I noticed the following comment in name_lookup.c, around line 4890, 
in lookup_type_scope_1:


We check ITER->TYPE before ITER->VALUE in order to handle
typedef struct C {} C;
correctly. */

and after this comment, both pairs of qualify_lookup are called in that 
order. Thus I started seriously suspecting that something may be wrong 
in the if-else above, that is, that we really want something with 
iter->type *before* iter->value there too: the attached patchlet p works 
for the testcase and passes bootstrap & test. Does it make sense to you?


Final observation: in many cases, like for example, variants of the 
testcase with one less data member, what happens is that iter->type and 
iter->value are *both* the same variant of the TYPE_DECL Bar, the one 
which is fine, has DECL_IMPLICIT_TYPEDEF_P set. Thus the ordering 
doesn't matter. Frankly, at the moment I'm not sure to understand how 
exactly when the class becomes bigger the iter->type and iter->value 
become different and becomes important to handle the former first.


Fi ;)

Thanks!
Paolo.

/
Index: name-lookup.c
===
--- name-lookup.c   (revision 201511)
+++ name-lookup.c   (working copy)
@@ -4740,11 +4740,11 @@ lookup_name_real_1 (tree name, int prefer_type, in
  continue;
 
/* If this is the kind of thing we're looking for, we're done.  */
-   if (qualify_lookup (iter->value, flags))
+   if ((flags & LOOKUP_PREFER_TYPES)
+   && qualify_lookup (iter->type, flags))
+ binding = iter->type;
+   else if (qualify_lookup (iter->value, flags))
  binding = iter->value;
-   else if ((flags & LOOKUP_PREFER_TYPES)
-&& qualify_lookup (iter->type, flags))
- binding = iter->type;
else
  binding = NULL_TREE;
 


Re: [PATCH] Fix illegal cast to rtx (*insn_gen_fn) (rtx, ...)

2013-08-05 Thread Oleg Endo
On Mon, 2013-08-05 at 11:42 -1000, Richard Henderson wrote:
> On 07/27/2013 02:52 AM, Oleg Endo wrote:
> > gcc/ChangeLog:
> > * recog.h (rtx (*insn_gen_fn) (rtx, ...)): Replace typedef with
> > new class insn_gen_fn.
> > * expr.c (move_by_pieces_1, store_by_pieces_2): Replace
> > argument rtx (*) (rtx, ...) with insn_gen_fn.
> > * genoutput.c (output_insn_data): Cast gen_? function pointers
> > to insn_gen_fn::stored_funcptr.  Add initializer braces.
> 
> Ok.
> 

Thanks, committed as rev 201513.
4.8 also has the same problem.  The patch applies on 4.8 branch without
problems and make all-gcc works.
OK for 4.8, too?

Cheers,
Oleg



Re: [PATCH/Merge Request] Vtable Verification feature.

2013-08-05 Thread Diego Novillo
This looks almost ready to commit.  Some comments below:

Once this is committed, you should write a blurb in GCC's home page
describing the contribution.

> ===
> --- libgcc/vtv_start_preinit.c (revision 0)
> +++ libgcc/vtv_start_preinit.c (revision 0)
> @@ -0,0 +1,68 @@
> +/*  Copyright (C) 2012, 2013
> +Free Software Foundation, Inc.
> +
> +This file is part of GCC.
> +
> +GCC is free software; you can redistribute it and/or modify it under
> +the terms of the GNU General Public License as published by the Free
> +Software Foundation; either version 3, or (at your option) any later
> +version.
> +
> +GCC is distributed in the hope that it will be useful, but WITHOUT ANY
> +WARRANTY; without even the implied warranty of MERCHANTABILITY or
> +FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public License
> +for more details.
> +
> +You should have received a copy of the GNU General Public License
> +along with GCC; see the file COPYING3.  If not see
> +.  */

Shouldn't this file have the GPL+exception license?

> +
> +/* This file is part of the vtable verification feature (for a
> +   detailed description of the feature, see comments in
> +   tree-vtable-verify.c).  The vtable verification feature creates

s/tree-vtable-verify.c/vtable-verify.c/

> +GCC is distributed in the hope that it will be useful, but WITHOUT ANY
> +WARRANTY; without even the implied warranty of MERCHANTABILITY or
> +FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public License
> +for more details.

Ditto this one and all the other new files in libgcc/

> +
> +You should have received a copy of the GNU General Public License
> +along with GCC; see the file COPYING3.  If not see
> +.  */
> +
> +/* This file is part of the vtable verification feature (for a
> +   detailed description of the feature, see comments in
> +   tree-vtable-verify.c).  The vtable verification feature creates

s/tree-vtable-verify.c/vtable-verify.c/

> +You should have received a copy of the GNU General Public License
> +along with GCC; see the file COPYING3.  If not see
> +.  */
> +
> +/* This file is part of the vtable verification feature (for a
> +   detailed description of the feature, see comments in
> +   tree-vtable-verify.c).  The vtable verification feature creates

s/tree-vtable-verify.c/vtable-verify.c/

> +You should have received a copy of the GNU General Public License
> +along with GCC; see the file COPYING3.  If not see
> +.  */
> +
> +/* This file is part of the vtable verification feature (for a
> +   detailed description of the feature, see comments in
> +   tree-vtable-verify.c).  The vtable verification feature creates

s/tree-vtable-verify.c/vtable-verify.c/

> +  The vtable verification feature is controlled by the flag
> +  '-fvtable-verify='.  There are three flavors of this:
> +  '-fvtable-verify=std', '-fvtable-verify=preinit', and
> +  '-fvtable-verify=none'.  If the option '-fvtable-verfy=preinit' is
> +  used, then our constructor initialization function gets put into the
> +  preinit array.  This is necessary if there are data sets that need
> +  to be built very early in execution.  If the constructor
> +  initialization function gets put into the preinit array, the we also
> +  add calls to __VLTChangePermission at the beginning and end of the
> +  function.  The call at the beginning sets the permissions on the
> +  data sets and vtable map variables to read/write, and the one at the
> +  end makes them read-only.  If the '-fvtable-verify=std' option is
> +  used, the constructor initialization functions are executed at their
> +  normal time, and the __VLTChangePermission calls are handled
> +  differently (see the comments in libstdc++-v3/libsupc++/vtv_rts.cc).
> +  The option '-fvtable-verify=none' turns off vtable verification.

This part of the documentation, dealing with compiler flags and
user-visible features should also be in doc/invoke.texi.

> +#include "config.h"
> +#include "system.h"
> +#include "coretypes.h"
> +#include "tm.h"
> +#include "tree.h"
> +#include "tm_p.h"
> +#include "basic-block.h"
> +#include "output.h"
> +#include "tree-flow.h"
> +#include "tree-dump.h"
> +#include "tree-pass.h"
> +#include "timevar.h"
> +#include "cfgloop.h"
> +#include "flags.h"
> +#include "tree-inline.h"
> +#include "tree-scalar-evolution.h"
> +#include "diagnostic-core.h"
> +#include "gimple-pretty-print.h"
> +#include "toplev.h"
> +#include "langhooks.h"
> +#include "tree-ssa-propagate.h"

Have you made sure you actually need all these headers?  There are
some that look superfluous (like tree-ssa-propagate.h).

> +
> +#include "vtable-verify.h"
> +
> +unsigned num_vtable_map_nodes = 0;
> +bool any_verification_calls_generated = false;
> +int total_num_virtual_calls = 0;
> +int total_num_verified_vcalls = 0;

I suppose all 

Re: [Patch, Fortran, OOP] PR 57306: ICE on valid with class pointer initialization

2013-08-05 Thread Tobias Burnus

Janus Weil wrote:

Ok for trunk?


Sorry for the belated review.

+  bool ptr = sym->attr.pointer || sym->attr.allocatable
+|| (sym->ts.type == BT_CLASS
+&& CLASS_DATA (sym)->attr.class_pointer);


That looks quite imbalanced. Why do you not take care of 
CLASS_DATA(sym)->attr.allocatable? Actually, shouldn't that always be 
true for BT_CLASS in this context? A BT_CLASS should either be a 
pointer/allocatable or a dummy argument - but the latter is never 
initialized (while being a dummy).


Otherwise, it looks OK to me.

(Don't forget the dg-do compile -> run change.)

From follow-up emails:

type t
   class(*), pointer :: x
end type t
type(t), target :: y
integer,target :: z
type(t) :: x = t(y)
type(t) :: x = t(z)
class(*), pointer :: a => y

Your example yields (with and without the current patch):

type(t) :: x = t(y)
  1
Error: Can't convert TYPE(t) to CLASS(*) at (1)

In fact the patch does not really handle unlimited polymorphics. I
suspect several fixes might be needed to get your test case running.
Is it ok to do this in a follow-up patch?


Seems as if there is more work required ... Yes, a follow-up patch is 
fine, but somewhere the omission should be recorded. (As you did in 
PR49213.)


Talking about the example above, the following is similar and may or may 
not be handled:


integer, target :: tgt
type t2
end type t2
type(t2), target :: tgt2
type t
  class(*), pointer :: a => tgt
  class(*), pointer :: b => tgt2
end type t
type(t) :: x
type(t), SAVE :: y
end


Regarding the example: Does it now work when the target and the pointer are
in the same scoping unit? Or does one still need to place the targets into
the module?

Well, they will work as soon as PR 55207 is fixed (there is a working
patch already, which hopefully can be committed soon).


Good to know that that is already tracked and being fixed.

Tobias



Re: [PATCH] Fix illegal cast to rtx (*insn_gen_fn) (rtx, ...)

2013-08-05 Thread Richard Henderson
On 07/27/2013 02:52 AM, Oleg Endo wrote:
> gcc/ChangeLog:
>   * recog.h (rtx (*insn_gen_fn) (rtx, ...)): Replace typedef with
>   new class insn_gen_fn.
>   * expr.c (move_by_pieces_1, store_by_pieces_2): Replace
>   argument rtx (*) (rtx, ...) with insn_gen_fn.
>   * genoutput.c (output_insn_data): Cast gen_? function pointers
>   to insn_gen_fn::stored_funcptr.  Add initializer braces.

Ok.


r~


[PING] Re: [PATCH] Fix illegal cast to rtx (*insn_gen_fn) (rtx, ...)

2013-08-05 Thread Oleg Endo
Hello,

Any comments?
(patch is here: http://gcc.gnu.org/ml/gcc-patches/2013-07/msg01315.html)

Cheers,
Oleg

On Sat, 2013-07-27 at 14:52 +0200, Oleg Endo wrote:
> Hi,
> 
> On Fri, 2013-07-26 at 08:51 +0200, Uros Bizjak wrote:
> 
> > BTW: I am not c++ expert, but doesn't c++ offer some sort of
> > abstraction to get rid of
> > 
> > +  union {
> > +rtx (*argc0) (void);
> > +rtx (*argc1) (rtx);
> > +rtx (*argc2) (rtx, rtx);
> > +rtx (*argc3) (rtx, rtx, rtx);
> > +rtx (*argc4) (rtx, rtx, rtx, rtx);
> > +rtx (*argc5) (rtx, rtx, rtx, rtx, rtx);
> > +rtx (*argc6) (rtx, rtx, rtx, rtx, rtx, rtx);
> > +rtx (*argc7) (rtx, rtx, rtx, rtx, rtx, rtx, rtx);
> > +rtx (*argc8) (rtx, rtx, rtx, rtx, rtx, rtx, rtx, rtx);
> > +rtx (*argc9) (rtx, rtx, rtx, rtx, rtx, rtx, rtx, rtx, rtx);
> > +rtx (*argc10) (rtx, rtx, rtx, rtx, rtx, rtx, rtx, rtx, rtx, rtx);
> > +rtx (*argc11) (rtx, rtx, rtx, rtx, rtx, rtx, rtx, rtx, rtx, rtx, rtx);
> > +  } genfun;
> > 
> 
> Variadic templates maybe, but we can't use them since that requires C
> ++11.
> 
> Anyway, I think it's better to turn the insn_gen_fn typedef in recog.h
> into a functor.  The change is quite transparent to the users, as the
> attached patch shows.  There really is no need for things like GEN_FCN?,
> nor do we need to fixup all the backends etc.
> 
> I've tested the attached patch on my SH cross setup with 'make all-gcc'
> and it can still produce a valid 'hello world'.  Not sure whether it's
> sufficient.
> 
> Some notes regarding the patch:
> 
> * The whole arg list thing could probably be folded with x-macros or
> something like that, but I don't think it's worth doing it for this
> single case.  If we can use C++11 in GCC at some point in time in the
> future, the insn_gen_fn implementation can be folded with variadic
> templates without touching all the code that uses it.
> 
> * I had to extend the number of max. args to 16, otherwise the SH
> backend's sync.md code wouldn't compile.
> 
> * I don't know whether it's really needed to properly format the code of
> class insn_gen_fn.  After reading the first two or three overloads
> (which do fit into 80 columns) one gets the idea and so I guess nobody
> is going to read that stuff completely anyway.
> 
> * The class insn_gen_fn is a POD, so it can be passed by value without
> any overhead, just like a normal function pointer.  Same goes for the
> function call through the wrapper class.
> 
> * Initially I had overloaded constructors in class insn_gen_fn which
> worked and didn't require the cast in genoutput.c.  However, it
> introduced static initializer functions and defeated the purpose of the
> generated const struct insn_data_d insn_data[].  This is worked around
> by casting the function pointer to insn_gen_fn::stored_funcptr. (Since
> it's C++ and not C it won't implicitly cast to void*).
> 
> * The whole thing will fall apart if the stored pointer to a function
> 'rtx (*)(void)' is different from a stored pointer to e.g. 'rtx
> (*)(rtx)'.  But I guess this is not likely to happen.
> 
> Cheers,
> Oleg
> 
> gcc/ChangeLog:
>   * recog.h (rtx (*insn_gen_fn) (rtx, ...)): Replace typedef with
>   new class insn_gen_fn.
>   * expr.c (move_by_pieces_1, store_by_pieces_2): Replace
>   argument rtx (*) (rtx, ...) with insn_gen_fn.
>   * genoutput.c (output_insn_data): Cast gen_? function pointers
>   to insn_gen_fn::stored_funcptr.  Add initializer braces.




Re: [C++ Patch] PR 58080

2013-08-05 Thread Jason Merrill

OK.

Jason


Re: [PATCH 2/4] [lambda] Support implicit conversion of a stateless generic lambda to a function pointer.

2013-08-05 Thread Jason Merrill

On 08/04/2013 07:45 PM, Adam Butcher wrote:

What should I do about the symtab nullptr issue?
(http://gcc.gnu.org/ml/gcc-patches/2013-07/msg00043.html)  Should I
leave the workaround in my patch set as a standalone commit to be
remedied later or should I try to squash it?  Or is the hack appropriate
for some reason?


Let's fix it.


   0x810670 handle_alias_pairs
   ../../gcc/cgraphunit.c:1014


This suggests that somehow the call operator template wound up in 
alias_pairs.  This is very mysterious; I really don't know why there 
would be any aliases in this testcase, much less one involving the 
operator().  The offending code should be pretty straightforward to find 
with a watchpoint on alias_pairs.


Jason



Passes are now C++ classes (was Re: [PATCH 03/11] Handwritten part of conversion of passes to C++ classes)

2013-08-05 Thread David Malcolm
On Mon, 2013-07-29 at 15:41 -0600, Jeff Law wrote:
> On 07/26/2013 09:04 AM, David Malcolm wrote:
> > This patch is the hand-written part of the conversion of passes from
> > C structs to C++ classes.  It does not work without the subsequent
> > autogenerated part, which is huge.
> [ ... ]
> With the changes from pipeline -> pass_manager this is fine.  As is the 
> follow-up autogenerated patch.

I've committed this and the other patches that needed to go with it to
trunk, with the name fix, having successfully bootstrapped and tested
them on x86_64-unknown-linux-gnu - so opt_pass is now a C++ class
hierarchy, allowing for pass instances to hold pass-specific data
(albeit not GTY refs yet), and the future possibility of multiple pass
managers and pass pipelines (e.g. for embedding GCC for
JIT-compilation).

Specifically, I fixed the names and rebased against master and tested
repeatedly (for the later GC patches, which are still being reviewed).
I refreshed the "03/11" patch (which became r201505) to resolve a
trivial conflict in cgraphunit.c, and then bootstrapped and tested the
series of 5 patches against r201494 on x86_64-unknown-linux-gnu (RHEL
6.4 in fact).

It successfully built all 3 stages, and gave the same testing results as
an unpatched build of r201494.

Given that, I've committed the following to trunk:
  * r201505: this patch (aka 03/11; handwritten part of conversion,
approved above by Jeff)
  * r201506: the zero-initialization of pass_manager (aka 3.1/11;
approved by rth in
http://gcc.gnu.org/ml/gcc-patches/2013-08/msg00147.html )
  * r201508: the big autogenerated patch (aka 04/11; approved above by
Jeff)
  * r201509: adding -fno-rtti when building plugins in the testsuite
(aka 05/11; approved by Jeff in
http://gcc.gnu.org/ml/gcc-patches/2013-07/msg01433.html )
  * r201511: aka 06/11; using pass->clone and fixing pass switch
numbering (approved by rth in
http://gcc.gnu.org/ml/gcc-patches/2013-08/msg00054.html )

Note that r201505 and r201506 don't compile by themselves; they need
r201508 to update all of the passes to the new internal API, and r201509
and r201511 are then needed otherwise you'll see regressions in the
testsuite.

Given all of the above testing I'm reasonably confident that this works.
However this is such a large change [1] that there's a non-zero chance
of at least one glitch - let me know if you see any breakages.  One
glitch that I ran into when applying the patches to svn was that I saw
some end-of-file whitespace changes in some of the gcc/testsuite files:
I'm not sure how those crept in, but I used "svn revert" by hand on them
before committing to ensure that only the files I was expecting to
change were changed.

Some of followup patches in the series are now approved, some aren't, so
I'm next going to bootstrap/test/commit the approved ones as
appropriate.

Dave

[1] 127 files changed, 10227 insertions(+), 4465 deletions(-)



[AArch64] Fixup the vget_lane RTL patterns and intrinsics

2013-08-05 Thread James Greenhalgh

This patch fixes up the vget_lane RTL patterns to better
exploit the behaviour of their target instructions, and
to allow variants keeping the result in the SIMD register file.

We patch up aarch64_get_lane_. These are somewhat
misleading and are not being used in their full capacity.
They currently zero/sign_extend from something of size  to
something of size  which is always going to be a no-op, and
should never be introduced by the combiner. More useful would be
to have these patterns perform the cast they actually perform.
That is to say,  to SI or DI as appropriate.

So, these get renamed to aarch64_get_lane_extend, and
modified such that they return the widened type. Sadly, this means
they cannot be used purely in the SIMD register set as there is no
widen-to-32/64-bit instruction operating on this register file.

So, that leaves the case we had before. If we eliminate the no-op,
we have the same pattern as aarch64_dup_lane_scalar, so eliminate
this - it makes more sense to be called aarch64_get_lane.

And then we fix up arm_neon.h... As these are lane intrinsics we
should be a little careful. We are likely to use the vget_lane
intrinsics in composition with other intrinsics or operations,
but we must guarantee that the constant parameter is actually a
compile time constant. We define some internal wrapper macros
in arm_neon.h, which should be used in preference to calling
the raw compiler builtin.

All of this effort is required to ensure that when we use a vget_lane
intrinsic, the RTL generated is simply a vec_select. This allows us
to begin building other lane intrinsics as composites with
__aarch64_vget_lane.

We must fix the ever-troublesome scalar_intrinsics.c testcase to
teach it the new names for get_lane, but otherwise tested on
aarch64-none-elf with no regressions.

OK?

Thanks,
James

---
gcc/

2013-08-05  James Greenhalgh  

* config/aarch64/aarch64-simd-builtins.def (get_lane_signed): Remove.
(get_lane_unsigned): Likewise.
(dup_lane_scalar): Likewise.
(get_lane): enable for VALL.
* config/aarch64/aarch64-simd.md
(aarch64_dup_lane_scalar): Remove.
(aarch64_get_lane_signed): Likewise.
(aarch64_get_lane_unsigned): Likewise.
(aarch64_get_lane_extend): New.
(aarch64_get_lane_zero_extendsi): Likewise.
(aarch64_get_lane): Enable for all vector modes.
(aarch64_get_lanedi): Remove misleading constraints.
* config/aarch64/arm_neon.h
(__aarch64_vget_lane_any): Define.
(__aarch64_vget_lane_<8,16,32,64>): Likewise.
(vget_lane_<8,16,32,64>): Use __aarch64_vget_lane macros.
(vdup_lane_<8,16,32,64>): Likewise.
* config/aarch64/iterators.md (VDQQH): New.
(VDQQHS): Likewise.
(vwcore): Likewise.

gcc/testsuite/

2013-08-05  James Greenhalgh  

* gcc.target/aarch64/scalar_intrinsics.c: Update expected
output of vdup intrinsics.
diff --git a/gcc/config/aarch64/aarch64-simd-builtins.def b/gcc/config/aarch64/aarch64-simd-builtins.def
index 55dead6..4046d7a 100644
--- a/gcc/config/aarch64/aarch64-simd-builtins.def
+++ b/gcc/config/aarch64/aarch64-simd-builtins.def
@@ -40,10 +40,6 @@
10 - CODE_FOR_.  */
 
   BUILTIN_VD_RE (CREATE, create, 0)
-  BUILTIN_VQ_S (GETLANE, get_lane_signed, 0)
-  BUILTIN_VDQ (GETLANE, get_lane_unsigned, 0)
-  BUILTIN_VDQF (GETLANE, get_lane, 0)
-  VAR1 (GETLANE, get_lane, 0, di)
   BUILTIN_VDC (COMBINE, combine, 0)
   BUILTIN_VB (BINOP, pmul, 0)
   BUILTIN_VDQF (UNOP, sqrt, 2)
@@ -51,6 +47,9 @@
   VAR1 (UNOP, addp, 0, di)
   VAR1 (UNOP, clz, 2, v4si)
 
+  BUILTIN_VALL (GETLANE, get_lane, 0)
+  VAR1 (GETLANE, get_lane, 0, di)
+
   BUILTIN_VD_RE (REINTERP, reinterpretdi, 0)
   BUILTIN_VDC (REINTERP, reinterpretv8qi, 0)
   BUILTIN_VDC (REINTERP, reinterpretv4hi, 0)
@@ -64,7 +63,6 @@
   BUILTIN_VQ (REINTERP, reinterpretv2df, 0)
 
   BUILTIN_VDQ_I (BINOP, dup_lane, 0)
-  BUILTIN_VDQ_I (BINOP, dup_lane_scalar, 0)
   /* Implemented by aarch64_qshl.  */
   BUILTIN_VSDQ_I (BINOP, sqshl, 0)
   BUILTIN_VSDQ_I (BINOP, uqshl, 0)
diff --git a/gcc/config/aarch64/aarch64-simd.md b/gcc/config/aarch64/aarch64-simd.md
index 3c76032..9823730 100644
--- a/gcc/config/aarch64/aarch64-simd.md
+++ b/gcc/config/aarch64/aarch64-simd.md
@@ -357,20 +357,6 @@
(set_attr "simd_mode" "")]
 )
 
-(define_insn "aarch64_dup_lane_scalar"
-  [(set (match_operand: 0 "register_operand" "=w, r")
-	(vec_select:
-	  (match_operand:VDQ 1 "register_operand" "w, w")
-	  (parallel [(match_operand:SI 2 "immediate_operand" "i, i")])
-))]
-  "TARGET_SIMD"
-  "@
-   dup\\t%0, %1.[%2]
-   umov\\t%0, %1.[%2]"
-  [(set_attr "simd_type" "simd_dup, simd_movgp")
-   (set_attr "simd_mode" "")]
-)
-
 (define_insn "aarch64_simd_dup"
   [(set (match_operand:VDQF 0 "register_operand" "=w")
 (vec_duplicate:VDQF (match_operand: 1 "register_operand" "w")))]
@@ -2147,45 +2133,50 @@
   DONE;
 })
 
-(define_insn "aarch64_get_lane_signed"
-  [(set (match_op

Re: [testsuite, Android] Add to pr56407.c

2013-08-05 Thread Maxim Kuvyrkov
On 5/08/2013, at 10:57 PM, Alexander Ivchenko wrote:

> Hi,
> 
> The following test case fails to compile on Android: gcc.dg/torture/pr56407.c
> 
> /tmp/ccA08Isw.o:pr56407.c:function test: error: undefined reference to 'rand'
> collect2: error: ld returned 1 exit status
> 
> Which is not surprising at all, since the testcase has only the
> declarations of abort() and rand()
> and doesn't have any headers included.

It *is* surprising given that the testcase does have declarations of abort() or 
rand() -- which are the only two external functions referenced.

> 
> The following patch adds  to the test.

I don't think this a correct fix.  [In most such cases the real problem can be 
found out by examining why linking a test against glibc works, but against 
bionic -- doesn't.]

How are you linking this testcase?  Please post both the link line from 
testsuite log and verbose output of linking (obtainable by adding "-v" to the 
link line).

Thank you,

--
Maxim Kuvyrkov
www.kugelworks.com




Re: New option to do fine grain control [on|off] of micro-arch specific features : -mtune-ctrl=....

2013-08-05 Thread Xinliang David Li
ok -- makes sense. This can be done as a follow up patch.

thanks,

David

On Mon, Aug 5, 2013 at 10:59 AM, H.J. Lu  wrote:
> On Mon, Aug 5, 2013 at 9:33 AM, Xinliang David Li  wrote:
>> On Mon, Aug 5, 2013 at 8:26 AM, H.J. Lu  wrote:
>>> On Sun, Aug 4, 2013 at 9:23 AM, Xinliang David Li  
>>> wrote:
 On Sun, Aug 4, 2013 at 4:40 AM, Richard Biener
  wrote:
> Xinliang David Li  wrote:
>>Hi, GCC/i386 currently has about 73 boolean parameters/knobs (defined
>>in ix86_tune_features[], indexed by ix86_tune_indices) to perform
>>micro-arch specific performance tuning. However such settings are hard
>>coded (fixed with a given -mtune setting) and is very hard to do
>>performance experiment.
>>
>>The attached patch fixes the problem. The patch introduces a new
>>option -mtune-ctrl=. Its parameter is a comma separated list of
>>feature names to turn on associated features. Feature name can be
>>prefixed by ^ to do the opposite. For instance,
>>
>>  -mtune-ctrl=prologue_using_move,epilogue_using_move,^pad_returns
>>
>>tells the compiler to use move instructions in prologue/epilogue
>>(instead of push/pop), and *not* pad return instructions.
>>
>>To facilitate the change, the feature tuning enums defined in i386.h
>>are moved to a new file x86-tune.def and this file can be used to
>>generate both the enums and names of the features.
>>
>>
>>Ok for trunk?
>>
>
> The patch fails to add documentation. And I am nervous about testing 
> coverage - is this considered a development option only or are random 
> combinations expected to work in all situations? I expect not, thus this 
> looks like a dangerous option?
>

 This option is intended to be used by developers -- otherwise we will
 have to document all possible feature knobs. I saw a couple of
 existing options in i386.opt marked as 'Undocumented' -- is this mark
 used for case like this?   Since this option is for experimental
 purpose, user certainly can shoot their foot with it :)

 If there is support of target specific --params which takes strings as
 args, it might be a better choice to use that.

>>>
>>> I have a similar patch to turn on/off each feature and it is very
>>> useful to fine tune x86 backend.  But mine is not automated.
>>> Anothing I found useful is a command line switch to turn off all
>>> features, like -mno-default.
>>
>>
>> Turn off all features or just toggle the features? What is the use
>> case for -mno-default?
>>
>
> -mno-default tunes off all features.  To turn on only features: f1,
> f2, f3, --,fN.
> we can do
>
> -mno-default -mtune-ctrl=f1,f2,,..,fN
>
> We don't need to check if other features are off since they are
> turned off by -mno-default.
>
>
> --
> H.J.


Re: [PATCH, PR 58041] Make gimple-ssa-strenght-reduction.c create MEM_REFs with alignment information

2013-08-05 Thread Richard Biener
Martin Jambor  wrote:
>Hi,
>
>PR 58041 is a misalignment bug caused by replace_ref in
>gimple-ssa-strength-reduction.c because it does not make sure that the
>MEM_REFs it creates has the proper alignment encoded in them.
>
>I'd like to fix this with the patch below, which basically does the
>same thing SRA does, it is only slightly simpler because we are
>replacing an access to memory with an access to the exact same memory,
>so we don't have to bother computing offset alignments.
>
>I was wondering whether it would make sense to put some common part of
>the code to a function and call it from replace_ref and SRA but such a
>function would probably only contain the align < TYPE_ALIGN check and
>the fold_build_2 call so I was not sure it was worth it.  However, I
>may add it as a followup, it may make future users more aware of the
>need to take care of alignment.
>
>I have bootstrapped and tested the patch below on on x86_64-linux,
>Bill Schmidt did the same on powerpc64-unknown-linux-gnu and it also
>got some testing on ARM.  OK for trunk (and I think also the 4.8
>branch needs it)?

Ok.

Thanks,
Richard.

>Thanks,
>
>Martin
>
>
>2013-08-05  Martin Jambor  
>
>   PR middle-end/58041
>   * gimple-ssa-strength-reduction.c (replace_ref): Make sure built
>   MEM_REF has proper alignment information.
>
>testsuite/
>   * gcc.dg/torture/pr58041.c: New test.
>   * gcc.target/arm/pr58041.c: Likewise.
>
>*** /tmp/5wmq7D_gimple-ssa-strength-reduction.cMon Aug  5 19:05:06
>2013
>--- gcc/gimple-ssa-strength-reduction.cMon Aug  5 18:56:10 2013
>*** dump_incr_vec (void)
>*** 1728,1738 
>  static void
>  replace_ref (tree *expr, slsr_cand_t c)
>  {
>!   tree add_expr = fold_build2 (POINTER_PLUS_EXPR, TREE_TYPE
>(c->base_expr),
>! c->base_expr, c->stride);
>!   tree mem_ref = fold_build2 (MEM_REF, TREE_TYPE (*expr), add_expr,
>!double_int_to_tree (c->cand_type, c->index));
>!   
>/* Gimplify the base addressing expression for the new MEM_REF tree. 
>*/
>gimple_stmt_iterator gsi = gsi_for_stmt (c->cand_stmt);
>TREE_OPERAND (mem_ref, 0)
>--- 1728,1750 
>  static void
>  replace_ref (tree *expr, slsr_cand_t c)
>  {
>!   tree add_expr, mem_ref, acc_type = TREE_TYPE (*expr);
>!   unsigned HOST_WIDE_INT misalign;
>!   unsigned align;
>! 
>!   /* Ensure the memory reference carries the minimum alignment
>!  requirement for the data type.  See PR58041.  */
>!   get_object_alignment_1 (*expr, &align, &misalign);
>!   if (misalign != 0)
>! align = (misalign & -misalign);
>!   if (align < TYPE_ALIGN (acc_type))
>! acc_type = build_aligned_type (acc_type, align);
>! 
>!   add_expr = fold_build2 (POINTER_PLUS_EXPR, TREE_TYPE
>(c->base_expr),
>!c->base_expr, c->stride);
>!   mem_ref = fold_build2 (MEM_REF, acc_type, add_expr,
>!   double_int_to_tree (c->cand_type, c->index));
>! 
>/* Gimplify the base addressing expression for the new MEM_REF tree. 
>*/
>gimple_stmt_iterator gsi = gsi_for_stmt (c->cand_stmt);
>TREE_OPERAND (mem_ref, 0)
>*** /dev/null  Fri Aug  2 16:49:33 2013
>--- gcc/testsuite/gcc.dg/torture/pr58041.c Mon Aug  5 18:56:10 2013
>***
>*** 0 
>--- 1,35 
>+ /* { dg-do run } */
>+ 
>+ typedef long long V
>+   __attribute__ ((vector_size (2 * sizeof (long long)), may_alias));
>+ 
>+ typedef struct S { V v; } P __attribute__((aligned (1)));
>+ 
>+ struct s
>+ {
>+   char u;
>+   V v[2];
>+ } __attribute__((packed,aligned(1)));
>+ 
>+ __attribute__((noinline, noclone))
>+ long long foo(struct s *x, int y, V z)
>+ {
>+   V a = x->v[y];
>+   x->v[y] = z;
>+   return a[1];
>+ }
>+ 
>+ struct s a = {0,{0,0}};
>+ int main()
>+ {
>+   V v1 = {0,1};
>+   V v2 = {0,2};
>+ 
>+   if (foo(&a,0,v1) != 0)
>+ __builtin_abort();
>+   if (foo(&a,0,v2) != 1)
>+ __builtin_abort();
>+   if (foo(&a,1,v1) != 0)
>+ __builtin_abort();
>+   return 0;
>+ }
>*** /dev/null  Fri Aug  2 16:49:33 2013
>--- gcc/testsuite/gcc.target/arm/pr58041.c Mon Aug  5 19:02:04 2013
>***
>*** 0 
>--- 1,30 
>+ /* { dg-do compile } */
>+ /* { dg-options "-Os -mno-unaligned-access" } */
>+ /* { dg-final { scan-assembler "ldrb" } } */
>+ /* { dg-final { scan-assembler "strb" } } */
>+ 
>+ struct s
>+ {
>+   char u;
>+   long long v[2];
>+ } __attribute__((packed,aligned(1)));
>+ 
>+ __attribute__((noinline, noclone))
>+ long long foo(struct s *x, int y, long long z)
>+ {
>+   long long a = x->v[y];
>+   x->v[y] = z;
>+   return a;
>+ }
>+ 
>+ struct s a = {0,{0,0}};
>+ int main()
>+ {
>+   if (foo(&a,0,1) != 0)
>+ __builtin_abort();
>+   if (foo(&a,0,2) != 1)
>+ __builtin_abort();
>+   if (foo(&a,1,1) != 0)
>+ __builtin_abort();
>+   return 0;
>+ }




Re: [PATCH] PR32219, weak hidden reference segfault [PING^2]

2013-08-05 Thread Richard Sandiford
Mike Stump  writes:
> On Aug 4, 2013, at 8:14 AM, Chung-Lin Tang  wrote:
>> On 13/7/15 1:43 AM, Diego Novillo wrote:
>>> Could you please repost the patch with its description?  This thread
>>> is sufficiently old and noisy that I'm not even sure what the patch
>>> does nor why.
>> 
>> Taking the same example in my first post:
>
>> Under -fPIC, the code in rtlanal.c:nonzero_address_p() does not properly
>> recognize the "PIC-reg + " form of load as a weak symbol; it
>> returns 'true' immediately after seeing the pic-reg indexing, and does
>> not test the wrapped symbol for DECL_WEAK.
>
> So, I can't help but think that others would say that looking into an
> unspec is by nature, the wrong way to do it, unless that code is in the
> port.

Yeah.  I didn't want to reply again for fear of prolonging the thread
even more, but that's why I'd suggested removing:

  /* Handle PIC references.  */
  if (XEXP (x, 0) == pic_offset_table_rtx
   && CONSTANT_P (XEXP (x, 1)))
return true;

altogether and using targetm.delegitimize_address instead.  I think that's
our current interface for dealing with this kind of thing.

Thanks,
Richard


Re: New option to do fine grain control [on|off] of micro-arch specific features : -mtune-ctrl=....

2013-08-05 Thread H.J. Lu
On Mon, Aug 5, 2013 at 9:33 AM, Xinliang David Li  wrote:
> On Mon, Aug 5, 2013 at 8:26 AM, H.J. Lu  wrote:
>> On Sun, Aug 4, 2013 at 9:23 AM, Xinliang David Li  wrote:
>>> On Sun, Aug 4, 2013 at 4:40 AM, Richard Biener
>>>  wrote:
 Xinliang David Li  wrote:
>Hi, GCC/i386 currently has about 73 boolean parameters/knobs (defined
>in ix86_tune_features[], indexed by ix86_tune_indices) to perform
>micro-arch specific performance tuning. However such settings are hard
>coded (fixed with a given -mtune setting) and is very hard to do
>performance experiment.
>
>The attached patch fixes the problem. The patch introduces a new
>option -mtune-ctrl=. Its parameter is a comma separated list of
>feature names to turn on associated features. Feature name can be
>prefixed by ^ to do the opposite. For instance,
>
>  -mtune-ctrl=prologue_using_move,epilogue_using_move,^pad_returns
>
>tells the compiler to use move instructions in prologue/epilogue
>(instead of push/pop), and *not* pad return instructions.
>
>To facilitate the change, the feature tuning enums defined in i386.h
>are moved to a new file x86-tune.def and this file can be used to
>generate both the enums and names of the features.
>
>
>Ok for trunk?
>

 The patch fails to add documentation. And I am nervous about testing 
 coverage - is this considered a development option only or are random 
 combinations expected to work in all situations? I expect not, thus this 
 looks like a dangerous option?

>>>
>>> This option is intended to be used by developers -- otherwise we will
>>> have to document all possible feature knobs. I saw a couple of
>>> existing options in i386.opt marked as 'Undocumented' -- is this mark
>>> used for case like this?   Since this option is for experimental
>>> purpose, user certainly can shoot their foot with it :)
>>>
>>> If there is support of target specific --params which takes strings as
>>> args, it might be a better choice to use that.
>>>
>>
>> I have a similar patch to turn on/off each feature and it is very
>> useful to fine tune x86 backend.  But mine is not automated.
>> Anothing I found useful is a command line switch to turn off all
>> features, like -mno-default.
>
>
> Turn off all features or just toggle the features? What is the use
> case for -mno-default?
>

-mno-default tunes off all features.  To turn on only features: f1,
f2, f3, --,fN.
we can do

-mno-default -mtune-ctrl=f1,f2,,..,fN

We don't need to check if other features are off since they are
turned off by -mno-default.


-- 
H.J.


[PATCH, PR 58041] Make gimple-ssa-strenght-reduction.c create MEM_REFs with alignment information

2013-08-05 Thread Martin Jambor
Hi,

PR 58041 is a misalignment bug caused by replace_ref in
gimple-ssa-strength-reduction.c because it does not make sure that the
MEM_REFs it creates has the proper alignment encoded in them.

I'd like to fix this with the patch below, which basically does the
same thing SRA does, it is only slightly simpler because we are
replacing an access to memory with an access to the exact same memory,
so we don't have to bother computing offset alignments.

I was wondering whether it would make sense to put some common part of
the code to a function and call it from replace_ref and SRA but such a
function would probably only contain the align < TYPE_ALIGN check and
the fold_build_2 call so I was not sure it was worth it.  However, I
may add it as a followup, it may make future users more aware of the
need to take care of alignment.

I have bootstrapped and tested the patch below on on x86_64-linux,
Bill Schmidt did the same on powerpc64-unknown-linux-gnu and it also
got some testing on ARM.  OK for trunk (and I think also the 4.8
branch needs it)?

Thanks,

Martin


2013-08-05  Martin Jambor  

PR middle-end/58041
* gimple-ssa-strength-reduction.c (replace_ref): Make sure built
MEM_REF has proper alignment information.

testsuite/
* gcc.dg/torture/pr58041.c: New test.
* gcc.target/arm/pr58041.c: Likewise.

*** /tmp/5wmq7D_gimple-ssa-strength-reduction.c Mon Aug  5 19:05:06 2013
--- gcc/gimple-ssa-strength-reduction.c Mon Aug  5 18:56:10 2013
*** dump_incr_vec (void)
*** 1728,1738 
  static void
  replace_ref (tree *expr, slsr_cand_t c)
  {
!   tree add_expr = fold_build2 (POINTER_PLUS_EXPR, TREE_TYPE (c->base_expr),
!  c->base_expr, c->stride);
!   tree mem_ref = fold_build2 (MEM_REF, TREE_TYPE (*expr), add_expr,
! double_int_to_tree (c->cand_type, c->index));
!   
/* Gimplify the base addressing expression for the new MEM_REF tree.  */
gimple_stmt_iterator gsi = gsi_for_stmt (c->cand_stmt);
TREE_OPERAND (mem_ref, 0)
--- 1728,1750 
  static void
  replace_ref (tree *expr, slsr_cand_t c)
  {
!   tree add_expr, mem_ref, acc_type = TREE_TYPE (*expr);
!   unsigned HOST_WIDE_INT misalign;
!   unsigned align;
! 
!   /* Ensure the memory reference carries the minimum alignment
!  requirement for the data type.  See PR58041.  */
!   get_object_alignment_1 (*expr, &align, &misalign);
!   if (misalign != 0)
! align = (misalign & -misalign);
!   if (align < TYPE_ALIGN (acc_type))
! acc_type = build_aligned_type (acc_type, align);
! 
!   add_expr = fold_build2 (POINTER_PLUS_EXPR, TREE_TYPE (c->base_expr),
! c->base_expr, c->stride);
!   mem_ref = fold_build2 (MEM_REF, acc_type, add_expr,
!double_int_to_tree (c->cand_type, c->index));
! 
/* Gimplify the base addressing expression for the new MEM_REF tree.  */
gimple_stmt_iterator gsi = gsi_for_stmt (c->cand_stmt);
TREE_OPERAND (mem_ref, 0)
*** /dev/null   Fri Aug  2 16:49:33 2013
--- gcc/testsuite/gcc.dg/torture/pr58041.c  Mon Aug  5 18:56:10 2013
***
*** 0 
--- 1,35 
+ /* { dg-do run } */
+ 
+ typedef long long V
+   __attribute__ ((vector_size (2 * sizeof (long long)), may_alias));
+ 
+ typedef struct S { V v; } P __attribute__((aligned (1)));
+ 
+ struct s
+ {
+   char u;
+   V v[2];
+ } __attribute__((packed,aligned(1)));
+ 
+ __attribute__((noinline, noclone))
+ long long foo(struct s *x, int y, V z)
+ {
+   V a = x->v[y];
+   x->v[y] = z;
+   return a[1];
+ }
+ 
+ struct s a = {0,{0,0}};
+ int main()
+ {
+   V v1 = {0,1};
+   V v2 = {0,2};
+ 
+   if (foo(&a,0,v1) != 0)
+ __builtin_abort();
+   if (foo(&a,0,v2) != 1)
+ __builtin_abort();
+   if (foo(&a,1,v1) != 0)
+ __builtin_abort();
+   return 0;
+ }
*** /dev/null   Fri Aug  2 16:49:33 2013
--- gcc/testsuite/gcc.target/arm/pr58041.c  Mon Aug  5 19:02:04 2013
***
*** 0 
--- 1,30 
+ /* { dg-do compile } */
+ /* { dg-options "-Os -mno-unaligned-access" } */
+ /* { dg-final { scan-assembler "ldrb" } } */
+ /* { dg-final { scan-assembler "strb" } } */
+ 
+ struct s
+ {
+   char u;
+   long long v[2];
+ } __attribute__((packed,aligned(1)));
+ 
+ __attribute__((noinline, noclone))
+ long long foo(struct s *x, int y, long long z)
+ {
+   long long a = x->v[y];
+   x->v[y] = z;
+   return a;
+ }
+ 
+ struct s a = {0,{0,0}};
+ int main()
+ {
+   if (foo(&a,0,1) != 0)
+ __builtin_abort();
+   if (foo(&a,0,2) != 1)
+ __builtin_abort();
+   if (foo(&a,1,1) != 0)
+ __builtin_abort();
+   return 0;
+ }


Re: msp430 port

2013-08-05 Thread DJ Delorie

comma comma comma comma comma comma

got it ;-)


Re: Updated patch (was Re: [PATCH 11/11] Make opt_pass and gcc::pipeline be GC-managed)

2013-08-05 Thread David Malcolm
On Mon, 2013-08-05 at 06:59 -1000, Richard Henderson wrote:
> On 08/05/2013 05:18 AM, David Malcolm wrote:
> > So I *think* the most efficient traversal is to do this first (with a
> > suitable comment):
> > 
> >  for (int i = passes_by_id_size ; i > 0; )
> >::gt_ggc_mx (passes_by_id[--i]);
> > 
> > That ought to visit all of the passes without triggering recursion
> > (unless someone does something weird to the pass structure in a plugin).
> 
> Reasonable.
> 
> > I'm nervous about omitting the traversal for other pointers the class
> > has (though I *think* the passes by id array captures it all) so for
> > completeness I'd prefer to then to also do it for all of:
> > 
> >  ::gt_ggc_mx (all_passes);
> >  ::gt_ggc_mx (all_small_ipa_passes);
> >  ::gt_ggc_mx (all_lowering_passes);
> >  ::gt_ggc_mx (all_regular_ipa_passes);
> >  ::gt_ggc_mx (all_lto_gen_passes);
> >  ::gt_ggc_mx (all_late_ipa_passes);
> > 
> > #define INSERT_PASSES_AFTER(PASS)
> > #define PUSH_INSERT_PASSES_WITHIN(PASS)
> > #define POP_INSERT_PASSES()
> > #define NEXT_PASS(PASS, NUM) ::gt_ggc_mx (PASS ## _ ## NUM);
> > #define TERMINATE_PASS_LIST()
> > 
> > #include "pass-instances.def"
> > 
> > #undef INSERT_PASSES_AFTER
> > #undef PUSH_INSERT_PASSES_WITHIN
> > #undef POP_INSERT_PASSES
> > #undef NEXT_PASS
> > #undef TERMINATE_PASS_LIST
> > 
> > Is it standard in handwritten GC code to omit traversals based on this
> > higher-level knowledge?  Presumably it warrants a source comment?
> > (having spent too much time lately debugging PCH issues I'm nervous
> > about trying to optimize this).
> 
> It's something that we should think about when doing any kind of GC.
> The deep recursion has bitten us before, and lead to the chain_next
> and chain_prev annotations for GTY.

Note to self: the chain_next and chain_prev options are documented at:
http://gcc.gnu.org/onlinedocs/gccint/GTY-Options.html

BTW, this issue seems very reminiscent to me of an implementation detail
within the CPython interpreter called the "trashcan".  CPython uses
reference-counting, and if you have a deep chain of references e.g.:
  # make a very deep chain of list-of-lists:
  foo = []
  for i in range(depth):
 foo = [foo]
  # at this point we have
  # foo =  \
  #    for some depth

  # try to overflow the stack during deallocation:
  del foo

...you would have a deeply nested chain of decrefs, each triggering
deleting of the object, and hence a potential stack overflow.

The way CPython avoids such deep reference chains from killing the
process with a stack overflow is a pair of macros used in the
deallocator for container types, which track the depth of the traversal,
and when it reaches a certain threshold, start appending objects to a
singly-linked list of deferred deallocations (using a spare pointer in
the objects that's safe to use during finalization).  Hence the
recursion becomes an iteration when a certain stack depth is reached,
and diabolical reference graphs can't blow the stack (modulo bugs in
3rd-party container code, of course).

The equivalent for GC would be for mx routines to change from:

  if (ggc_test_and_set_mark (p))
gt_ggc_mx (p);

to:

  if (ggc_test_and_set_mark (p))
add_to_worklist (p, marking_cb);

I suspect that this technique can't be used in GGC since it would
require (a) having a spare pointer per-object and (b) tracking the type
of the marking callback, which would be a jump through a function ptr
where there wasn't one before, and thus unacceptable in the general
case.

> > AIUI we could do similar optimizations in the PCH object-noting hook,
> > but can't do similar optimizations in the PCH *op* hook, since every
> > field needs to reconstructed when reading the data back from disk.
> 
> Correct.

I'll update the patch with the optimizations you suggest (and the
reverse-order marking I mentioned above).

Thanks
Dave



Re: [PATCH] Correctly validate free registers for peep2

2013-08-05 Thread Richard Henderson
On 08/04/2013 11:40 PM, Richard Earnshaw wrote:
>   *recog.c (peep2_find_free_register): Validate all regs in a
>   multi-reg mode.
> 
> Bootstrapped on x86_64.
> 
> Ok for trunk and 4.8?  (4.7 is also affected, but I don't know of any
> back-end relies on this at that point).

Ok.


r~


Re: Updated patch (was Re: [PATCH 11/11] Make opt_pass and gcc::pipeline be GC-managed)

2013-08-05 Thread Richard Henderson
On 08/05/2013 05:18 AM, David Malcolm wrote:
> So I *think* the most efficient traversal is to do this first (with a
> suitable comment):
> 
>  for (int i = passes_by_id_size ; i > 0; )
>::gt_ggc_mx (passes_by_id[--i]);
> 
> That ought to visit all of the passes without triggering recursion
> (unless someone does something weird to the pass structure in a plugin).

Reasonable.

> I'm nervous about omitting the traversal for other pointers the class
> has (though I *think* the passes by id array captures it all) so for
> completeness I'd prefer to then to also do it for all of:
> 
>  ::gt_ggc_mx (all_passes);
>  ::gt_ggc_mx (all_small_ipa_passes);
>  ::gt_ggc_mx (all_lowering_passes);
>  ::gt_ggc_mx (all_regular_ipa_passes);
>  ::gt_ggc_mx (all_lto_gen_passes);
>  ::gt_ggc_mx (all_late_ipa_passes);
> 
> #define INSERT_PASSES_AFTER(PASS)
> #define PUSH_INSERT_PASSES_WITHIN(PASS)
> #define POP_INSERT_PASSES()
> #define NEXT_PASS(PASS, NUM) ::gt_ggc_mx (PASS ## _ ## NUM);
> #define TERMINATE_PASS_LIST()
> 
> #include "pass-instances.def"
> 
> #undef INSERT_PASSES_AFTER
> #undef PUSH_INSERT_PASSES_WITHIN
> #undef POP_INSERT_PASSES
> #undef NEXT_PASS
> #undef TERMINATE_PASS_LIST
> 
> Is it standard in handwritten GC code to omit traversals based on this
> higher-level knowledge?  Presumably it warrants a source comment?
> (having spent too much time lately debugging PCH issues I'm nervous
> about trying to optimize this).

It's something that we should think about when doing any kind of GC.
The deep recursion has bitten us before, and lead to the chain_next
and chain_prev annotations for GTY.

> AIUI we could do similar optimizations in the PCH object-noting hook,
> but can't do similar optimizations in the PCH *op* hook, since every
> field needs to reconstructed when reading the data back from disk.

Correct.


r~



Re: [PATCH, Fortran, PR 57987] Do not call cgraph_finalize_function multiple times on finalizers

2013-08-05 Thread Martin Jambor
Hi,

On Sat, Aug 03, 2013 at 01:12:41PM +0200, Dominique Dhumieres wrote:
> Hi Martin,
> 
> I have applied the patch on top of r201441 and I still get the warning for
> gcc/testsuite/gfortran.dg/class_48.f90 with -m32 -O(2|s):
> 
> /opt/gcc/work/gcc/testsuite/gfortran.dg/class_48.f90: In function 
> '__final_test2_T.2136.constprop.0':
> /opt/gcc/work/gcc/testsuite/gfortran.dg/class_48.f90:39:0: warning: iteration 
> 2147483648 invokes undefined behavior [-Waggressive-loop-optimizations]
>  class(t), allocatable :: a
>  ^
> /opt/gcc/work/gcc/testsuite/gfortran.dg/class_48.f90:39:0: note: containing 
> loop
> 
> The test gcc/testsuite/gfortran.dg/pr57987.f90 passes.

Well, that is because this patch is a fix for PR 57987 (Fortran
finalizers considered extern-inline by middle-end) and not for PR
57904 (the incorrect warning), even though the testcase is the same.

As I pointed out, the warning happens in dead code (that is dead
thanks to IPA-CP but not removed yet when the warning is emitted).  I
have not given much thought to fixing this problem and worked on other
things.  IIRC I dug out of history that Jakub added the warning code
recently and so I CCed him.  I suppose he will have a look at this
once he re-appears.

Thanks for testing,

Martin


Re: New parameters to control stringop expansion libcall strategy

2013-08-05 Thread Xinliang David Li
thanks. Updated patch attached.

David

On Mon, Aug 5, 2013 at 3:57 AM, Michael V. Zolotukhin
 wrote:
> Hi,
> This is a really convenient option, thanks for working on it.
> I can't approve it as I'm not a maintainer, but it looks ok to me,
> except fot a small nitpicking: afair, comments should end with
> dot-space-space.
>
> Michael
>
> On 04 Aug 20:01, Xinliang David Li wrote:
>> The attached is a new patch implementing the stringop inline strategy
>> control using two new -m options:
>>
>> -mmemcpy-strategy=
>> -mmemset-strategy=
>>
>> See changes in doc/invoke.texi for description of the new options. Example:
>>   
>> -mmemcpy-strategy=rep_8byte:64:unaligned,unrolled_loop:2048:unaligned,libcall:-1:unaligned
>>
>> tells compiler to inline memcpy using rep_8byte when the size is no
>> larger than 64 byte, using unrolled_loop when size is no larger than
>> 2048, and for size > 2048, using library call. In all cases,
>> destination alignment adjustment is not done.
>>
>> Tested on x86-64/linux. Ok for trunk?
>>
>> thanks,
>>
>> David
>>
>> 2013-08-02  Xinliang David Li  
>>
>> * config/i386/stringop.def: New file.
>> * config/i386/stringop.opt: New file.
>> * config/i386/i386-opts.h: Include stringopt.def.
>> * config/i386/i386.opt: Include stringopt.opt.
>> * config/i386/i386.c (ix86_option_override_internal):
>> Override default size based stringop inline strategies
>> with options.
>> * config/i386/i386.c (ix86_parse_stringop_strategy_string):
>> New function.
>>
>> 2013-08-04  Xinliang David Li  
>>
>> * testsuite/gcc.target/i386/memcpy-strategy-1.c: New test.
>> * testsuite/gcc.target/i386/memcpy-strategy-2.c: Ditto.
>> * testsuite/gcc.target/i386/memset-strategy-1.c: Ditto.
>> * testsuite/gcc.target/i386/memcpy-strategy-3.c: Ditto.
>>
>>
>>
>>
>> On Fri, Aug 2, 2013 at 9:21 PM, Xinliang David Li  wrote:
>> > On x86_64, when the expected size of memcpy/memset is known (e.g, with
>> > FDO), libcall strategy is used with the size is > 8192. This value is
>> > hard coded, which makes it hard to do performance tuning. This patch
>> > adds two new parameters to do that. Potential usage includes
>> > per-application libcall strategy min-size tuning based on summary data
>> > with FDO (e.g, instruction workset size).
>> >
>> > Bootstrap and tested on x86_64/linux. Ok for trunk?
>> >
>> > thanks,
>> >
>> > David
>> >
>> >
>> > 2013-08-02  Xinliang David Li  
>> >
>> > * params.def: New parameters.
>> > * config/i386/i386.c (ix86_option_override_internal):
>> > Override default libcall size limit with parameters.
>
>> Index: config/i386/stringop.def
>> ===
>> --- config/i386/stringop.def  (revision 0)
>> +++ config/i386/stringop.def  (revision 0)
>> @@ -0,0 +1,42 @@
>> +/* Definitions for option handling for IA-32.
>> +   Copyright (C) 2013 Free Software Foundation, Inc.
>> +
>> +This file is part of GCC.
>> +
>> +GCC is free software; you can redistribute it and/or modify
>> +it under the terms of the GNU General Public License as published by
>> +the Free Software Foundation; either version 3, or (at your option)
>> +any later version.
>> +
>> +GCC is distributed in the hope that it will be useful,
>> +but WITHOUT ANY WARRANTY; without even the implied warranty of
>> +MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
>> +GNU General Public License for more details.
>> +
>> +Under Section 7 of GPL version 3, you are granted additional
>> +permissions described in the GCC Runtime Library Exception, version
>> +3.1, as published by the Free Software Foundation.
>> +
>> +You should have received a copy of the GNU General Public License and
>> +a copy of the GCC Runtime Library Exception along with this program;
>> +see the files COPYING3 and COPYING.RUNTIME respectively.  If not, see
>> +.  */
>> +
>> +DEF_ENUM
>> +DEF_ALG (no_stringop, no_stringop)
>> +DEF_ENUM
>> +DEF_ALG (libcall, libcall)
>> +DEF_ENUM
>> +DEF_ALG (rep_prefix_1_byte, rep_byte)
>> +DEF_ENUM
>> +DEF_ALG (rep_prefix_4_byte, rep_4byte)
>> +DEF_ENUM
>> +DEF_ALG (rep_prefix_8_byte, rep_8byte)
>> +DEF_ENUM
>> +DEF_ALG (loop_1_byte, byte_loop)
>> +DEF_ENUM
>> +DEF_ALG (loop, loop)
>> +DEF_ENUM
>> +DEF_ALG (unrolled_loop, unrolled_loop)
>> +DEF_ENUM
>> +DEF_ALG (vector_loop, vector_loop)
>> Index: config/i386/i386.opt
>> ===
>> --- config/i386/i386.opt  (revision 201458)
>> +++ config/i386/i386.opt  (working copy)
>> @@ -316,6 +316,14 @@ mstack-arg-probe
>>  Target Report Mask(STACK_PROBE) Save
>>  Enable stack probing
>>
>> +mmemcpy-strategy=
>> +Target RejectNegative Joined Var(ix86_tune_memcpy_strategy)
>> +Specify memcpy expansion strategy when expected size is known
>> +
>> +mmemset-strategy=
>> +Target RejectNegative Joined Var(ix86_tune_memset

Re: New option to do fine grain control [on|off] of micro-arch specific features : -mtune-ctrl=....

2013-08-05 Thread Xinliang David Li
On Mon, Aug 5, 2013 at 8:26 AM, H.J. Lu  wrote:
> On Sun, Aug 4, 2013 at 9:23 AM, Xinliang David Li  wrote:
>> On Sun, Aug 4, 2013 at 4:40 AM, Richard Biener
>>  wrote:
>>> Xinliang David Li  wrote:
Hi, GCC/i386 currently has about 73 boolean parameters/knobs (defined
in ix86_tune_features[], indexed by ix86_tune_indices) to perform
micro-arch specific performance tuning. However such settings are hard
coded (fixed with a given -mtune setting) and is very hard to do
performance experiment.

The attached patch fixes the problem. The patch introduces a new
option -mtune-ctrl=. Its parameter is a comma separated list of
feature names to turn on associated features. Feature name can be
prefixed by ^ to do the opposite. For instance,

  -mtune-ctrl=prologue_using_move,epilogue_using_move,^pad_returns

tells the compiler to use move instructions in prologue/epilogue
(instead of push/pop), and *not* pad return instructions.

To facilitate the change, the feature tuning enums defined in i386.h
are moved to a new file x86-tune.def and this file can be used to
generate both the enums and names of the features.


Ok for trunk?

>>>
>>> The patch fails to add documentation. And I am nervous about testing 
>>> coverage - is this considered a development option only or are random 
>>> combinations expected to work in all situations? I expect not, thus this 
>>> looks like a dangerous option?
>>>
>>
>> This option is intended to be used by developers -- otherwise we will
>> have to document all possible feature knobs. I saw a couple of
>> existing options in i386.opt marked as 'Undocumented' -- is this mark
>> used for case like this?   Since this option is for experimental
>> purpose, user certainly can shoot their foot with it :)
>>
>> If there is support of target specific --params which takes strings as
>> args, it might be a better choice to use that.
>>
>
> I have a similar patch to turn on/off each feature and it is very
> useful to fine tune x86 backend.  But mine is not automated.
> Anothing I found useful is a command line switch to turn off all
> features, like -mno-default.


Turn off all features or just toggle the features? What is the use
case for -mno-default?

thanks,

David

 It can be used to turn on a group of
> features without checking which features are on/off by default.
>
>
> --
> H.J.


Re: New option to do fine grain control [on|off] of micro-arch specific features : -mtune-ctrl=....

2013-08-05 Thread H.J. Lu
On Sun, Aug 4, 2013 at 9:23 AM, Xinliang David Li  wrote:
> On Sun, Aug 4, 2013 at 4:40 AM, Richard Biener
>  wrote:
>> Xinliang David Li  wrote:
>>>Hi, GCC/i386 currently has about 73 boolean parameters/knobs (defined
>>>in ix86_tune_features[], indexed by ix86_tune_indices) to perform
>>>micro-arch specific performance tuning. However such settings are hard
>>>coded (fixed with a given -mtune setting) and is very hard to do
>>>performance experiment.
>>>
>>>The attached patch fixes the problem. The patch introduces a new
>>>option -mtune-ctrl=. Its parameter is a comma separated list of
>>>feature names to turn on associated features. Feature name can be
>>>prefixed by ^ to do the opposite. For instance,
>>>
>>>  -mtune-ctrl=prologue_using_move,epilogue_using_move,^pad_returns
>>>
>>>tells the compiler to use move instructions in prologue/epilogue
>>>(instead of push/pop), and *not* pad return instructions.
>>>
>>>To facilitate the change, the feature tuning enums defined in i386.h
>>>are moved to a new file x86-tune.def and this file can be used to
>>>generate both the enums and names of the features.
>>>
>>>
>>>Ok for trunk?
>>>
>>
>> The patch fails to add documentation. And I am nervous about testing 
>> coverage - is this considered a development option only or are random 
>> combinations expected to work in all situations? I expect not, thus this 
>> looks like a dangerous option?
>>
>
> This option is intended to be used by developers -- otherwise we will
> have to document all possible feature knobs. I saw a couple of
> existing options in i386.opt marked as 'Undocumented' -- is this mark
> used for case like this?   Since this option is for experimental
> purpose, user certainly can shoot their foot with it :)
>
> If there is support of target specific --params which takes strings as
> args, it might be a better choice to use that.
>

I have a similar patch to turn on/off each feature and it is very
useful to fine tune x86 backend.  But mine is not automated.
Anothing I found useful is a command line switch to turn off all
features, like -mno-default. It can be used to turn on a group of
features without checking which features are on/off by default.


--
H.J.


Re: Updated patch (was Re: [PATCH 11/11] Make opt_pass and gcc::pipeline be GC-managed)

2013-08-05 Thread David Malcolm
On Sat, 2013-08-03 at 08:39 -1000, Richard Henderson wrote:
> On 08/02/2013 02:48 PM, David Malcolm wrote:
> > +pass_manager::gt_ggc_mx ()
> > +{
> > +  ::gt_ggc_mx (all_passes);
> > +  ::gt_ggc_mx (all_small_ipa_passes);
> > +  ::gt_ggc_mx (all_lowering_passes);
> > +  ::gt_ggc_mx (all_regular_ipa_passes);
> > +  ::gt_ggc_mx (all_lto_gen_passes);
> > +  ::gt_ggc_mx (all_late_ipa_passes);
> > +
> > +  for (int i = 0; i < passes_by_id_size; i++)
> > +::gt_ggc_mx (passes_by_id[i]);
> > +
> > +#define INSERT_PASSES_AFTER(PASS)
> > +#define PUSH_INSERT_PASSES_WITHIN(PASS)
> > +#define POP_INSERT_PASSES()
> > +#define NEXT_PASS(PASS, NUM) ::gt_ggc_mx (PASS ## _ ## NUM);
> > +#define TERMINATE_PASS_LIST()
> > +
> > +#include "pass-instances.def"
> > +
> > +#undef INSERT_PASSES_AFTER
> > +#undef PUSH_INSERT_PASSES_WITHIN
> > +#undef POP_INSERT_PASSES
> > +#undef NEXT_PASS
> > +#undef TERMINATE_PASS_LIST
> > +
> > +}
> 
> You're marking all passes, and also walking through the chain of sub/next
> within the passes themselves?  That seems unnecessary.  Either the passes
> are reachable via sub/next or they aren't.
> 
> Alternately, don't walk the sub/next fields and instead only walk all of
> the passes explicitly, here.  That avoids some of the recursion in the
> opt_pass marking, and keeps the call chain flatter.

Each _mx call is implemented like this:
  if (ggc_test_and_set_mark (p))
p->gt_ggc_mx ();
i.e. each pass only recurses the first time it is seen.

So if the goal is to avoid a deep traversal of the call chain, then
walking the passes_by_id array *backwards* ought to walk the pass tree
from the leaf passes first, eventually reaching the trunk passes, and
hence none of the calls should recurse, given that at each point in the
iteration pass->sub and pass->next will already been marked.

So I *think* the most efficient traversal is to do this first (with a
suitable comment):

 for (int i = passes_by_id_size ; i > 0; )
   ::gt_ggc_mx (passes_by_id[--i]);

That ought to visit all of the passes without triggering recursion
(unless someone does something weird to the pass structure in a plugin).

I'm nervous about omitting the traversal for other pointers the class
has (though I *think* the passes by id array captures it all) so for
completeness I'd prefer to then to also do it for all of:

 ::gt_ggc_mx (all_passes);
 ::gt_ggc_mx (all_small_ipa_passes);
 ::gt_ggc_mx (all_lowering_passes);
 ::gt_ggc_mx (all_regular_ipa_passes);
 ::gt_ggc_mx (all_lto_gen_passes);
 ::gt_ggc_mx (all_late_ipa_passes);

#define INSERT_PASSES_AFTER(PASS)
#define PUSH_INSERT_PASSES_WITHIN(PASS)
#define POP_INSERT_PASSES()
#define NEXT_PASS(PASS, NUM) ::gt_ggc_mx (PASS ## _ ## NUM);
#define TERMINATE_PASS_LIST()

#include "pass-instances.def"

#undef INSERT_PASSES_AFTER
#undef PUSH_INSERT_PASSES_WITHIN
#undef POP_INSERT_PASSES
#undef NEXT_PASS
#undef TERMINATE_PASS_LIST

Is it standard in handwritten GC code to omit traversals based on this
higher-level knowledge?  Presumably it warrants a source comment?
(having spent too much time lately debugging PCH issues I'm nervous
about trying to optimize this).

AIUI we could do similar optimizations in the PCH object-noting hook,
but can't do similar optimizations in the PCH *op* hook, since every
field needs to reconstructed when reading the data back from disk.

Thanks
Dave




[ubsan] Properly create const char type

2013-08-05 Thread Marek Polacek
I was creating the const char type in a wrong way.  I should've used
build_qualified_type, otherwise we'd ICE in the C++ FE later on due to
mismatched TYPE_CANONICALs...

Tested x86_64-pc-linux-gnu, applying to ubsan branch.

2013-08-05  Marek Polacek  

* ubsan.c (ubsan_source_location_type): Properly create
const char type using build_qualified_type.

* c-c++-common/ubsan/const-char-1.c: New test.

--- gcc/ubsan.c.mp  2013-08-05 16:15:03.769489097 +0200
+++ gcc/ubsan.c 2013-08-05 16:15:12.352528297 +0200
@@ -226,8 +226,8 @@ ubsan_source_location_type (void)
   static const char *field_names[3]
 = { "__filename", "__line", "__column" };
   tree fields[3], ret;
-  tree const_char_type = char_type_node;
-  TYPE_READONLY (const_char_type) = 1;
+  tree const_char_type = build_qualified_type (char_type_node,
+  TYPE_QUAL_CONST);
 
   ret = make_node (RECORD_TYPE);
   for (int i = 0; i < 3; i++)
--- gcc/testsuite/c-c++-common/ubsan/const-char-1.c.mp  2013-08-05 
16:15:48.733686613 +0200
+++ gcc/testsuite/c-c++-common/ubsan/const-char-1.c 2013-08-05 
16:18:06.384287418 +0200
@@ -0,0 +1,9 @@
+/* { dg-do compile } */
+/* { dg-options "-fsanitize=shift" } */
+
+void
+foo (void)
+{
+  int y = 1 << 2;
+  __builtin_printf ("%d\n", y);
+}

Marek


Fix wrong code issues on x86 and LTO

2013-08-05 Thread Jan Hubicka
Hi,
our handling of local flag (that define where register passing convention will 
be used)
is somewhat off. First we clear it on all functions that have call across 
partitions
and we also clear it on partial cgraph boundaries.
Fixed both,
Bootstrapped/regtested ppc64-linux, comitted.
PR lto/57602
* cgraph.c (verify_cgraph_node): Accept local flags from other 
partitions.
* ipa.c (symtab_remove_unreachable_nodes): Do not clear local flag.
(function_and_variable_visibility): Likewise.
* trans-mem.c (ipa_tm_create_version): TM versions are not local.
Index: cgraph.c
===
*** cgraph.c(revision 201483)
--- cgraph.c(working copy)
*** verify_cgraph_node (struct cgraph_node *
*** 2363,2369 
error ("inline clone in same comdat group list");
error_found = true;
  }
!   if (!node->symbol.definition && node->local.local)
  {
error ("local symbols must be defined");
error_found = true;
--- 2363,2369 
error ("inline clone in same comdat group list");
error_found = true;
  }
!   if (!node->symbol.definition && !node->symbol.in_other_partition && 
node->local.local)
  {
error ("local symbols must be defined");
error_found = true;
Index: ipa.c
===
*** ipa.c   (revision 201483)
--- ipa.c   (working copy)
*** symtab_remove_unreachable_nodes (bool be
*** 376,382 
{
  if (file)
fprintf (file, " %s", cgraph_node_name (node));
! cgraph_reset_node (node);
  changed = true;
}
}
--- 376,390 
{
  if (file)
fprintf (file, " %s", cgraph_node_name (node));
! node->symbol.analyzed = false;
! node->symbol.definition = false;
! node->symbol.cpp_implicit_alias = false;
! node->symbol.alias = false;
! node->symbol.weakref = false;
! if (!node->symbol.in_other_partition)
!   node->local.local = false;
! cgraph_node_remove_callees (node);
! ipa_remove_all_references (&node->symbol.ref_list);
  changed = true;
}
}
*** function_and_variable_visibility (bool w
*** 888,894 
  }
FOR_EACH_DEFINED_FUNCTION (node)
  {
!   node->local.local = cgraph_local_node_p (node);
  
/* If we know that function can not be overwritten by a different 
semantics
 and moreover its section can not be discarded, replace all direct calls
--- 896,902 
  }
FOR_EACH_DEFINED_FUNCTION (node)
  {
!   node->local.local |= cgraph_local_node_p (node);
  
/* If we know that function can not be overwritten by a different 
semantics
 and moreover its section can not be discarded, replace all direct calls
Index: trans-mem.c
===
*** trans-mem.c (revision 201483)
--- trans-mem.c (working copy)
*** ipa_tm_create_version (struct cgraph_nod
*** 4774,4779 
--- 4774,4780 
  DECL_COMDAT_GROUP (new_decl) = tm_mangle (DECL_COMDAT_GROUP (old_decl));
  
new_node = cgraph_copy_node_for_versioning (old_node, new_decl, vNULL, 
NULL);
+   new_node->local.local = false;
new_node->symbol.externally_visible = old_node->symbol.externally_visible;
new_node->lowered = true;
new_node->tm_clone = 1;


Re: [PATCH] Sanitize block partitioning under -freorder-blocks-and-partition

2013-08-05 Thread Teresa Johnson
On Mon, Aug 5, 2013 at 7:11 AM, Jan Hubicka  wrote:
> The patch looks OK to me in general (I can not approve it).
> Still have one question...
>> +
>> +/* Ensure that no cold bbs dominate hot bbs along the dominance or
>> +   post-dominance DIR, for example as a result of edge weight insanities.
>> +   Returns the updated value of COLD_BB_COUNT and adds newly-hot bbs
>> +   to BBS_IN_HOT_PARTITION.  */
>> +
>> +static unsigned int
>> +sanitize_dominator_hotness (enum cdi_direction dir, unsigned int 
>> cold_bb_count,
>> +vec *bbs_in_hot_partition)
>> +{
>> +  /* Callers check this.  */
>> +  gcc_checking_assert (cold_bb_count);
>> +
>> +  bool dom_calculated_here = !dom_info_available_p (dir);
>> +
>> +  if (dom_calculated_here)
>> +calculate_dominance_info (dir);
>> +
>> +  /* Keep examining hot bbs while we still have some left to check
>> + and there are remaining cold bbs.  */
>> +  vec hot_bbs_to_check = bbs_in_hot_partition->copy ();
>> +  while (! hot_bbs_to_check.is_empty ()
>> + && cold_bb_count)
>> +{
>> +  basic_block bb = hot_bbs_to_check.pop ();
>> +  basic_block dom_bb = get_immediate_dominator (dir, bb);
>> +
>> +  /* If bb's immediate dominator is also hot (or unpartitioned,
>> + e.g. the entry block) then it is ok. If it is cold, it
>> + needs to be adjusted.  */
>> +  if (BB_PARTITION (dom_bb) != BB_COLD_PARTITION)
>> +continue;
>
> What will hapepn on
>
> if (t)
>   something
> else
>   something else
> for (i=0;i<100;i++)
>   something else2
>
> I would expect if/something and something else to be cold by profile feedback.
> Your dominator code will bring the if into hot partition but both paths
> of conditional will be cold, so the number of crossings will actually grow.

You are right, this case will not be handled well.

>
> If we want to have at least some path to hot blocks in the hot region, I 
> suspect
> we could walk back from hot regions to entry and keep those in hot regions 
> rather
> than relying on the dominator tree...
> But I am sure such things can be dealt with incrementally.

I am going to fix this and will resend the patch. Rather than look at
the immediate dominator of each hot block, we need to ensure that at
least one pred bb is hot. In your example, if that was a 50-50 branch,
then IMO both preds should be marked hot.

Thanks,
Teresa

>
> Honza



-- 
Teresa Johnson | Software Engineer | tejohn...@google.com | 408-460-2413


Re: [PATCH] PR32219, weak hidden reference segfault [PING^2]

2013-08-05 Thread Chung-Lin Tang
On 13/8/5 下午10:24, Mike Stump wrote:
> On Aug 5, 2013, at 7:15 AM, Chung-Lin Tang  wrote:
>> On 13/8/5 10:06 PM, Mike Stump wrote:
>>> On Aug 4, 2013, at 8:14 AM, Chung-Lin Tang  wrote:
 On 13/7/15 1:43 AM, Diego Novillo wrote:
> Could you please repost the patch with its description?  This thread
> is sufficiently old and noisy that I'm not even sure what the patch
> does nor why.

 Taking the same example in my first post:

 Under -fPIC, the code in rtlanal.c:nonzero_address_p() does not properly
 recognize the "PIC-reg + " form of load as a weak symbol; it
 returns 'true' immediately after seeing the pic-reg indexing, and does
 not test the wrapped symbol for DECL_WEAK.
>>>
>>> So, I can't help but think that others would say that looking into an 
>>> unspec is by nature, the wrong way to do it, unless that code is in the 
>>> port.
>>>
>>> I think the followup from Bernhard points to a better solution, though the 
>>> wording in the comment was objectionable.  Merely say that the symbol, if 
>>> weak and not defined, is then not local.
>>
>> When I last tested that patch which moves the DECL_WEAK check, the
>> testcases for C++ TLS wrappers fail. I don't remember the fine details,
>> but effectively it filters out the TLS wrappers from being treated
>> locally, causing them to be called through @PLT, and regressing on some
>> tests specifically checking for that…
> 
> Hum…  I wonder if there is a TLS predicate one can mix in to the check that 
> is appropriate.
> 
>> The UNSPEC interpretation here is fairly restricted, FWIW. Earlier talk
>> on this thread also mentioned that maybe specific RTL constructs for
>> reasoning about PIC addresses should be introduced, rather than common
>> idiomatic pattern, though that may be a long shot for now.
> 
> specifying the unspecified, make is specified, and calling it unspec, would 
> seem to be wrong.
> 
> The right approach, long term, is to have address forms all specified and 
> documented and a port merely can use the forms they are interested in.  pic 
> is one of those things that should be bumped up, unspec is kinda silly.

Yes, that's what I meant. e.g. instead of using (const (unspec...)), new
RTL operators for specifying the common forms of relocations (including
those used PIC) should be defined; this will involve changing lots of
backends, so should be aimed in the long term.

Chung-Lin






Re: C++ coding conventions: namespaces, references and getters (was Re: [PATCH 2/2] Introduce beginnings of a pipeline class.)

2013-08-05 Thread Gabriel Dos Reis
On Mon, Aug 5, 2013 at 6:24 AM, Martin Jambor  wrote:
[…]
>> > Note that as per
>> >   http://gcc.gnu.org/ml/gcc-patches/2013-07/msg01365.html
>> > we'll use "pass_manager" rather than "pipeline", so this would look
>> > like:
>> >   pass_manager &get_passes () { gcc_assert (passes_); return *passes_; }
>> >
>> > We were chatting about C++ references on IRC on Friday, and IIRC there
>> > was a strong objection to passing *arguments* that are non-const
>> > references,
>>
>> I hope nobody is objecting to std::swap for example.
>
> I'm not sure how I could object to std::swap or what good would that
> be.  Apart from being the standard, it is also rather exceptional due
> to its nature and its name.

I suspect it is a mistake to think this situation is unique.  I also think
that we can keep going on and a long list of examples, and that might
still be labelled special case.  Reading into a variable is just another
example.

> Despite this special example, I think the
> rule of not passing arguments as non-const references is a good one,
> and it seems I am not alone.  Where I may differ from others is that I
> also think that getter functions should (generally) return pointers or
> const references too, and for similar reasons - the caller may
> unintentionally modify data that the programmers thought were on the
> stack but were in fact elsewhere else and "belonged" to someplace
> else.

But the rule does not solve that problem -- 'const' does not mean this is
on stack, and absence of const does not mean this is on free store.
Being on stack does no necessarily mean don't change, and being on
free store does not mean you can freely change.

As a matter of fact, in the existing code base, we have many objects
on the stack that are passed around by non-const pointer. That is just
fine for the most part.  The notion of const is largely orthogonal to the
notion lifetime (stack of free store.)

These are problems that depend on the specific algorithms being
implemented.  Legislating on const reference or not won't solve
any problem there.

-- Gaby


Re: GDB hooks for debugging GCC

2013-08-05 Thread Tom Tromey
> "David" == David Malcolm  writes:

David> GDB 7.0 onwards supports hooks written in Python to improve the
David> quality-of-life within the debugger.  The best known are the
David> pretty-printing hooks [1], which we already use within libstdc++ for
David> printing better representations of STL containers.

Nice!

A few suggestions.

David>   (note how the rtx_def* is printed inline.  This last one is actually a
David> kludge; all the other pretty-printers work inside gdb, whereas this one
David> injects a call into print-rtl.c into the inferior).

Naughty.

David>   * it hardcoded values in a few places rather than correctly looking up
David> enums

If you have a new-enough gdb (I don't recall the exact version -- I can
look if you want, but recall that gcc changes mean that gcc developers
generally have to use a very recent gdb) you can use
gdb.types.make_enum_dict to get this very easily.

David> You may see a message from gdb of the form:
David>   cc1-gdb.py auto-loading has been declined by your `auto-load safe-path'
David> as a protection against untrustworthy python scripts.  See
David>   
http://sourceware.org/gdb/onlinedocs/gdb/Auto_002dloading-safe-path.html

I think you could set up the safe-path in the gcc .gdbinit.

David> Note that you can suppress pretty-printers using /r (for "raw"):
David>   (gdb) p /r pass
David>   $3 = (opt_pass *) 0x188b600
David> and dereference the pointer in the normal way:
David>   (gdb) p *pass
David>   $4 = {type = RTL_PASS, name = 0x120a312 "expand",
David>   [etc, ...snipped...]

I just wanted to clarify here that you can "p *pass" *without* first
using "p/r".  Pretty-printing applies just to printing -- it does not
affect what is in the value history.  The values there still have the
correct type and such.

David> def pretty_printer_lookup(gdbval):
[...]

David> def register (obj):
David> if obj is None:
David> obj = gdb

David> # Wire up the pretty-printer
David> obj.pretty_printers.append(pretty_printer_lookup)

It's better to use the gdb.printing facilities now.  These let user
disable pretty-printers if they prefer.  The libstdc++ printers go out
of their way to use gdb.printing only if available; but you can probably
just assume it exists.

David> print('Successfully loaded GDB hooks for GCC')

I wonder whether gdb ought to do this.

Tom


Re: [PATCH] PR32219, weak hidden reference segfault [PING^2]

2013-08-05 Thread Mike Stump
On Aug 5, 2013, at 7:15 AM, Chung-Lin Tang  wrote:
> On 13/8/5 10:06 PM, Mike Stump wrote:
>> On Aug 4, 2013, at 8:14 AM, Chung-Lin Tang  wrote:
>>> On 13/7/15 1:43 AM, Diego Novillo wrote:
 Could you please repost the patch with its description?  This thread
 is sufficiently old and noisy that I'm not even sure what the patch
 does nor why.
>>> 
>>> Taking the same example in my first post:
>>> 
>>> Under -fPIC, the code in rtlanal.c:nonzero_address_p() does not properly
>>> recognize the "PIC-reg + " form of load as a weak symbol; it
>>> returns 'true' immediately after seeing the pic-reg indexing, and does
>>> not test the wrapped symbol for DECL_WEAK.
>> 
>> So, I can't help but think that others would say that looking into an unspec 
>> is by nature, the wrong way to do it, unless that code is in the port.
>> 
>> I think the followup from Bernhard points to a better solution, though the 
>> wording in the comment was objectionable.  Merely say that the symbol, if 
>> weak and not defined, is then not local.
> 
> When I last tested that patch which moves the DECL_WEAK check, the
> testcases for C++ TLS wrappers fail. I don't remember the fine details,
> but effectively it filters out the TLS wrappers from being treated
> locally, causing them to be called through @PLT, and regressing on some
> tests specifically checking for that…

Hum…  I wonder if there is a TLS predicate one can mix in to the check that is 
appropriate.

> The UNSPEC interpretation here is fairly restricted, FWIW. Earlier talk
> on this thread also mentioned that maybe specific RTL constructs for
> reasoning about PIC addresses should be introduced, rather than common
> idiomatic pattern, though that may be a long shot for now.

specifying the unspecified, make is specified, and calling it unspec, would 
seem to be wrong.

The right approach, long term, is to have address forms all specified and 
documented and a port merely can use the forms they are interested in.  pic is 
one of those things that should be bumped up, unspec is kinda silly.

RE: [PATCH][4.8 backport] Fix PR57735

2013-08-05 Thread Kyrylo Tkachov
> Ping?
> 

Ping^2

 Thanks,
 Kyrill
> 
> > -Original Message-
> > From: gcc-patches-ow...@gcc.gnu.org [mailto:gcc-patches-
> > ow...@gcc.gnu.org] On Behalf Of Kyrylo Tkachov
> > Sent: 23 July 2013 10:09
> > To: 'Richard Sandiford'
> > Cc: gcc-patches; mi...@it.uu.se; 'Richard Biener'
> > Subject: RE: [PATCH][4.8 backport] Fix PR57735
> >
> > Hi Richard,
> >
> > > Richard Sandiford  writes:
> > > > "Kyrylo Tkachov"  writes:
> > > >> Hi all,
> > > >>
> > > >> The fix for PR57735 is in current trunk (for a different issue I
> > > think), just
> > > >> needs a backport to 4.8.
> > > >> It is r198462 by Richard Sandiford:
> > > >>
> > > >> 2013-04-30 Richard Sandiford 
> > > >>
> > > >>* explow.c (plus_constant): Pass "mode" to
> > immed_double_int_const.
> > > >>Use gen_int_mode rather than GEN_INT.
> > > >>
> > > >> Ok to backport to the 4.8 branch?
> > > >
> > > > Sorry for dropping the ball.  It's already been approved for 4.8,
> > > > and I think I even remembered to test it ready for commit.  I just
> > > > never actually hit commit afterwards :-(
> > >
> > > *sigh*.  Scratch that.  I'd confused it with:
> > >
> > > 2013-05-22  Richard Sandiford  
> > >
> > >   * recog.c (offsettable_address_addr_space_p): Fix calculation of
> > >   address mode.  Move pointer mode initialization to the same place.
> > >
> > > which I _did_ apply to 4.8.  Sorry about the confusion instead...
> >
> > It's ok, didn't have time to act on the confusion anyway :)
> >
> > So is the proposed one ok to backport? I've bootstrapped it on
> > arm-none-linux-gnueabihf and tested on arm-none-eabi with qemu.
> >
> >
> > Thanks,
> > Kyrill
> >
> >
> >
> >
> 
> 
> 
> 






Re: [PATCH] PR32219, weak hidden reference segfault [PING^2]

2013-08-05 Thread Chung-Lin Tang
On 13/8/5 10:06 PM, Mike Stump wrote:
> On Aug 4, 2013, at 8:14 AM, Chung-Lin Tang  wrote:
>> On 13/7/15 1:43 AM, Diego Novillo wrote:
>>> Could you please repost the patch with its description?  This thread
>>> is sufficiently old and noisy that I'm not even sure what the patch
>>> does nor why.
>>
>> Taking the same example in my first post:
>>
>>  Under -fPIC, the code in rtlanal.c:nonzero_address_p() does not properly
>> recognize the "PIC-reg + " form of load as a weak symbol; it
>> returns 'true' immediately after seeing the pic-reg indexing, and does
>> not test the wrapped symbol for DECL_WEAK.
> 
> So, I can't help but think that others would say that looking into an unspec 
> is by nature, the wrong way to do it, unless that code is in the port.
> 
> I think the followup from Bernhard points to a better solution, though the 
> wording in the comment was objectionable.  Merely say that the symbol, if 
> weak and not defined, is then not local.

When I last tested that patch which moves the DECL_WEAK check, the
testcases for C++ TLS wrappers fail. I don't remember the fine details,
but effectively it filters out the TLS wrappers from being treated
locally, causing them to be called through @PLT, and regressing on some
tests specifically checking for that...

The UNSPEC interpretation here is fairly restricted, FWIW. Earlier talk
on this thread also mentioned that maybe specific RTL constructs for
reasoning about PIC addresses should be introduced, rather than common
idiomatic pattern, though that may be a long shot for now.

Chung-Lin



Re: [PATCH] Caller instrumentation with -finstrument-calls [PING^3]

2013-08-05 Thread Woegerer, Paul
...just about time for another ping on GCC caller instrumentation:

http://gcc.gnu.org/ml/gcc-patches/2013-07/msg7.html

Thanks,
Paul

On 07/10/2013 04:43 PM, wrote:
> Ping,
>
> The updated patch that I have sent here:
> http://gcc.gnu.org/ml/gcc-patches/2013-07/msg7.html
> is still pending review/acceptance.
>
> Could someone please have a look.
>
> Many Thanks,
> Paul
>
> On Monday 01 July 2013 10:22:02 Woegerer, Paul wrote:
>> Hi Andrew,
>>
>> On Friday 28 June 2013 09:50:31 Andrew Pinski wrote:
>>> On Fri, Jun 28, 2013 at 5:51 AM,   wrote:
 Hi,

 The patch below provides caller instrumentation for GCC.

 The following new options have been added:
 -finstrument-calls
 -finstrument-calls-exclude-function-list=SYM,SYM,...
 -finstrument-calls-exclude-file-list=FILE,FILE,...

 These new options behave and appear similar to the existing function
 instrumentation options (-finstrument-functions*). I have also added
 attribute no_instrument_calls to specify which functions should be
 excluded from within the source code. Calls to functions that have
 attribute no_instrument_function are also excluded.

 The effect of the instrumentation causes all calls inside a function

 to get instrumented using the following hooks:
 void __cyg_profile_call_before (void *fn);
 void __cyg_profile_call_after  (void *fn);
>>> Can you not use cyg as the prefix rather use gcc or gnu.  cyg prefix
>>> for -finstrument-functions is a historical accident as it stands for
>>> cygnus but that company no longer exists and we should not be using a
>>> company specific prefix inside of GCC anymore.
>> Ha, funny. Originally I had the hooks with __gnu_profile_call_* but
>> then I changed them to be consistent with the existing __cyg_profile_*
>> hooks.
>>
>> Anyway, here is the updated patch:
>>
>>
>> From 43a1c2fb43e406f8f547dbcde19a60a8c56423a4 Mon Sep 17 00:00:00 2001
>> From: Paul Woegerer 
>> Date: Mon, 1 Jul 2013 09:15:21 +0200
>> Subject: [PATCH] Caller instrumentation with -finstrument-calls.
>>
>> 2013-07-01  Paul Woegerer  
>>
>>  Caller instrumentation with -finstrument-calls.
>>  * gcc/builtins.def: Add call-hooks __gnu_profile_call_before and
>>  __gnu_profile_call_after.
>>  * gcc/libfuncs.h (enum libfunc_index): Likewise.
>>  * gcc/optabs.c (init_optabs): Likewise.
>>  * gcc/c-family/c-common.c (no_instrument_calls): Add attribute.
>>  (handle_no_instrument_calls_attribute): New.
>>  * gcc/common.opt (finstrument-calls): New option.
>>  (finstrument-calls-exclude-function-list): Likewise.
>>  (finstrument-calls-exclude-file-list): Likewise.
>>  * gcc/opts.c (common_handle_option): Handle new options.
>>  * gcc/tree.h (tree_function_decl): Add field tree_function_decl.
>>  * gcc/c/c-decl.c (merge_decls): Handle tree_function_decl field.
>>  * gcc/cp/decl.c (duplicate_decls): Likewise.
>>  * gcc/function.c (expand_function_start): Likewise.
>>  * gcc/ipa.c: Likewise.
>>  * gcc/java/jcf-parse.c: Likewise.
>>  * gcc/tree-streamer-in.c: Likewise.
>>  * gcc/tree-streamer-out.c: Likewise.
>>  (finstrument-calls-exclude-function-list): Likewise.
>>  (finstrument-calls-exclude-file-list): Likewise.
>>  * gcc/gimplify.c (flag_instrument_calls_exclude_p): New.
>>  (addr_expr_for_call_instrumentation): New.
>>  (maybe_add_profile_call): New.
>>  (gimplify_call_expr): Add call-hooks insertion.
>>  (gimplify_modify_expr): Likewise.
>>  * gcc/doc/invoke.texi: Added documentation for
>>  -finstrument-calls-exclude-function-list and
>>  -finstrument-calls-exclude-file-list and
>>  -finstrument-calls.
>>  * gcc/testsuite/g++.dg/other/instrument_calls-1.C  Added
>>   regression test for -finstrument-calls.
>>  * gcc/testsuite/g++.dg/other/instrument_calls-2.C: Likewise.
>>  * gcc/testsuite/g++.dg/other/instrument_calls-3.C: Likewise.
>>  * gcc/testsuite/gcc.dg/instrument_calls-1.c: Likewise.
>>  * gcc/testsuite/gcc.dg/instrument_calls-2.c: Likewise.
>>  * gcc/testsuite/gcc.dg/instrument_calls-3.c: Likewise.
>>  * gcc/testsuite/gcc.dg/instrument_calls-4.c: Likewise.
>>  * gcc/testsuite/gcc.dg/instrument_calls-5.c: Likewise.
>>  * gcc/testsuite/gcc.dg/instrument_calls-6.c: Likewise.
>>  * gcc/testsuite/gcc.dg/instrument_calls-7.c: Likewise.
>>  * gcc/testsuite/gcc.dg/instrument_calls-8.c: Likewise.
>>  * gcc/testsuite/gcc.dg/instrument_calls-9.c: Likewise.
>>
>> diff --git a/gcc/builtins.def b/gcc/builtins.def
>> index 9b55b1f..0c2a6b2 100644
>> --- a/gcc/builtins.def
>> +++ b/gcc/builtins.def
>> @@ -795,6 +795,11 @@ DEF_BUILTIN (BUILT_IN_PROFILE_FUNC_ENTER, 
>> "__cyg_profile_func_enter", BUILT_IN_N
>>  DEF_BUILTIN (BUILT_IN_PROFILE_FUNC_EXIT, "__cyg_profile_func_exit", 
>> BUILT_IN_NORMAL, BT_FN_VOID_PTR_PTR, BT_LAST,
>>   false, false, false, ATTR_NULL, true, 

Re: [PATCH] Sanitize block partitioning under -freorder-blocks-and-partition

2013-08-05 Thread Jan Hubicka
The patch looks OK to me in general (I can not approve it).
Still have one question...
> +
> +/* Ensure that no cold bbs dominate hot bbs along the dominance or
> +   post-dominance DIR, for example as a result of edge weight insanities.
> +   Returns the updated value of COLD_BB_COUNT and adds newly-hot bbs
> +   to BBS_IN_HOT_PARTITION.  */
> +
> +static unsigned int
> +sanitize_dominator_hotness (enum cdi_direction dir, unsigned int 
> cold_bb_count,
> +vec *bbs_in_hot_partition)
> +{
> +  /* Callers check this.  */
> +  gcc_checking_assert (cold_bb_count);
> +
> +  bool dom_calculated_here = !dom_info_available_p (dir);
> +
> +  if (dom_calculated_here)
> +calculate_dominance_info (dir);
> +
> +  /* Keep examining hot bbs while we still have some left to check
> + and there are remaining cold bbs.  */
> +  vec hot_bbs_to_check = bbs_in_hot_partition->copy ();
> +  while (! hot_bbs_to_check.is_empty ()
> + && cold_bb_count)
> +{
> +  basic_block bb = hot_bbs_to_check.pop ();
> +  basic_block dom_bb = get_immediate_dominator (dir, bb);
> +
> +  /* If bb's immediate dominator is also hot (or unpartitioned,
> + e.g. the entry block) then it is ok. If it is cold, it
> + needs to be adjusted.  */
> +  if (BB_PARTITION (dom_bb) != BB_COLD_PARTITION)
> +continue;

What will hapepn on

if (t)
  something
else
  something else
for (i=0;i<100;i++)
  something else2

I would expect if/something and something else to be cold by profile feedback.
Your dominator code will bring the if into hot partition but both paths
of conditional will be cold, so the number of crossings will actually grow.

If we want to have at least some path to hot blocks in the hot region, I suspect
we could walk back from hot regions to entry and keep those in hot regions 
rather
than relying on the dominator tree...
But I am sure such things can be dealt with incrementally.

Honza


Re: [PATCH] PR32219, weak hidden reference segfault [PING^2]

2013-08-05 Thread Mike Stump
[ sorry for the dup ]

On Aug 4, 2013, at 8:14 AM, Chung-Lin Tang  wrote:

> On 13/7/15 1:43 AM, Diego Novillo wrote:
>> Could you please repost the patch with its description?  This thread
>> is sufficiently old and noisy that I'm not even sure what the patch
>> does nor why.
> 
> Taking the same example in my first post:

> Under -fPIC, the code in rtlanal.c:nonzero_address_p() does not properly
> recognize the "PIC-reg + " form of load as a weak symbol; it
> returns 'true' immediately after seeing the pic-reg indexing, and does
> not test the wrapped symbol for DECL_WEAK.

So, I can't help but think that others would say that looking into an unspec is 
by nature, the wrong way to do it, unless that code is in the port.

I think the followup from Bernhard points to a better solution, though the 
wording in the comment was objectionable.  Merely say that the symbol, if weak 
and not defined, is then not local.

Re: [PATCH] PR32219, weak hidden reference segfault [PING^2]

2013-08-05 Thread Mike Stump
On Aug 4, 2013, at 8:14 AM, Chung-Lin Tang  wrote:
> On 13/7/15 1:43 AM, Diego Novillo wrote:
>> Could you please repost the patch with its description?  This thread
>> is sufficiently old and noisy that I'm not even sure what the patch
>> does nor why.
> 
> Taking the same example in my first post:
> 
>  Under -fPIC, the code in rtlanal.c:nonzero_address_p() does not properly
> recognize the "PIC-reg + " form of load as a weak symbol; it
> returns 'true' immediately after seeing the pic-reg indexing, and does
> not test the wrapped symbol for DECL_WEAK.

So, I can't help but think that others would say that looking into an unspec is 
by nature, the wrong way to do it, unless that code is in the port.

I think the followup from Bernhard points to a better solution, though the 
wording in the comment was objectionable.  Merely say that the symbol, if weak 
and not defined, is then not local.

Re: [PATCH] Sanitize block partitioning under -freorder-blocks-and-partition

2013-08-05 Thread Teresa Johnson
On Fri, Aug 2, 2013 at 9:48 PM, Teresa Johnson  wrote:
> On Fri, Aug 2, 2013 at 8:05 AM, Jan Hubicka  wrote:
>>>
>>> 2013-08-01  Teresa Johnson  
>>> Steven Bosscher  
>>>
>>> * cfgrtl.c (fixup_bb_partition): New routine.
>>> (commit_edge_insertions): Invoke fixup_partitions.
>>> (find_partition_fixes): New routine.
>>> (fixup_partitions): Ditto.
>>> (verify_hot_cold_block_grouping): Update comments.
>>> (rtl_verify_edges): Invoke find_partition_fixes.
>>> (rtl_verify_bb_pointers): Update comments.
>>> (rtl_verify_bb_layout): Ditto.
>>> * basic-block.h (fixup_partitions): Declare.
>>> * cfgcleanup.c (try_optimize_cfg): Invoke fixup_partitions.
>>> * bb-reorder.c (sanitize_dominator_hotness): New function.
>>> (find_rarely_executed_basic_blocks_and_crossing_edges): Invoke
>>> sanitize_dominator_hotness.
>>>
>>> Index: cfgrtl.c
>>> ===
>>> --- cfgrtl.c(revision 201281)
>>> +++ cfgrtl.c(working copy)
>>> @@ -1341,6 +1341,34 @@ fixup_partition_crossing (edge e)
>>>  }
>>>  }
>>>
>>> +/* Called when block BB has been reassigned to a different partition,
>>> +   to ensure that the region crossing attributes are updated.  */
>>> +
>>> +static void
>>> +fixup_bb_partition (basic_block bb)
>>> +{
>>> +  edge e;
>>> +  edge_iterator ei;
>>> +
>>> +  /* Now need to make bb's pred edges non-region crossing.  */
>>> +  FOR_EACH_EDGE (e, ei, bb->preds)
>>> +{
>>> +  fixup_partition_crossing (e);
>>> +}
>>> +
>>> +  /* Possibly need to make bb's successor edges region crossing,
>>> + or remove stale region crossing.  */
>>> +  FOR_EACH_EDGE (e, ei, bb->succs)
>>> +{
>>> +  if ((e->flags & EDGE_FALLTHRU)
>>> +  && BB_PARTITION (bb) != BB_PARTITION (e->dest)
>>> +  && e->dest != EXIT_BLOCK_PTR)
>>> +force_nonfallthru (e);
>>> +  else
>>> +fixup_partition_crossing (e);
>>> +}
>>> +}
>>
>> Is there particular reason why preds can not be fallhtrus and why
>> force_nonfallthru edge does not need partition crossing fixup?
>> (if so, perhpas it could be mentioned in the description, if not,
>> I think force_nonfallthru path has to check if new BB was introduced
>> and do the right thing on the edge.
>
> I need to clarify the comments in this routine, because without the
> context of how this is called it isn't clear. This routine is only
> called when we detect a hot bb that is now dominated by a cold bb and
> needs to become cold. Therefore, its preds will no longer be region
> crossing (any non-dominating blocks that were previously hot would
> have been marked cold in the caller for the same reason, so we will
> not end up adjusting the region crossing-ness or fallthrough-ness of
> those pred edges). Any that were region crossing before but aren't any
> longer could not have been fall through (as Steven noted, you can't
> have a fall through across a partition boundary). I will add some
> better comments here.
>
> Regarding the call to force_nonfallthru, that routine calls
> fixup_partition_crossing as needed, and I will update the comment to
> reflect that too.

Patch with updated comments below. Ok for trunk?

Thanks,
Teresa

2013-08-05  Teresa Johnson  
Steven Bosscher  

* cfgrtl.c (fixup_bb_partition): New routine.
(commit_edge_insertions): Invoke fixup_partitions.
(find_partition_fixes): New routine.
(fixup_partitions): Ditto.
(verify_hot_cold_block_grouping): Update comments.
(rtl_verify_edges): Invoke find_partition_fixes.
(rtl_verify_bb_pointers): Update comments.
(rtl_verify_bb_layout): Ditto.
* basic-block.h (fixup_partitions): Declare.
* cfgcleanup.c (try_optimize_cfg): Invoke fixup_partitions.
* bb-reorder.c (sanitize_dominator_hotness): New function.
(find_rarely_executed_basic_blocks_and_crossing_edges): Invoke
sanitize_dominator_hotness.

Index: cfgrtl.c
===
--- cfgrtl.c(revision 201461)
+++ cfgrtl.c(working copy)
@@ -1341,6 +1341,43 @@ fixup_partition_crossing (edge e)
 }
 }

+/* Called when block BB has been reassigned to the cold partition,
+   because it is now dominated by another cold block,
+   to ensure that the region crossing attributes are updated.  */
+
+static void
+fixup_new_cold_bb (basic_block bb)
+{
+  edge e;
+  edge_iterator ei;
+
+  /* This is called when a hot bb is found to now be dominated
+ by a cold bb and therefore needs to become cold. Therefore,
+ its preds will no longer be region crossing. Any non-dominating
+ preds that were previously hot would also have become cold
+ in the caller for the same region. Any preds that were previously
+ region-crossing will be adjusted in fixup_partition_crossing.  */
+  FOR_EACH_EDGE (e,

Re: Request to merge Undefined Behavior Sanitizer in (take 2)

2013-08-05 Thread Marek Polacek
On Sat, Aug 03, 2013 at 12:24:32PM -0400, Jason Merrill wrote:
> Where are the SAVE_EXPRs coming from?  It doesn't seem to me that x
> needs to be wrapped in a SAVE_EXPR at all in this case.  For cases
> where the SAVE_EXPR is needed and not used in the test, you could
> add the SAVE_EXPR before the condition with a COMPOUND_EXPR.

Those SAVE_EXPRs are coming from c-typeck.c in this case:

  if (flag_sanitize & SANITIZE_UNDEFINED
  && current_function_decl != 0
  && (doing_div_or_mod || doing_shift))
{
  /* OP0 and/or OP1 might have side-effects.  */
  op0 = c_save_expr (op0);
  op1 = c_save_expr (op1);

I've resorted to creating (SAVE_EXPR , ...)  in the condition in case
we're in C89, i.e. creating COMPOUND_EXPR in the condition itself,
it seems to work pretty well.  Firstly, I added explicit (void) cast via
NOP_EXPR (as 'if ((void) x, y)' doesn't trigger -Wunused-value warning),
but that didn't seem to be necessary...

Does that look up to snuff to you?  Thanks,

2013-08-05  Marek Polacek  

* c-ubsan.c (ubsan_instrument_shift): Properly evaluate
SAVE_EXPR even in the C89 mode.

* gcc.dg/ubsan/save-expr-1.c: New test.

--- gcc/c-family/c-ubsan.c.mp   2013-08-05 13:13:49.245693141 +0200
+++ gcc/c-family/c-ubsan.c  2013-08-05 13:13:53.170709181 +0200
@@ -131,6 +131,12 @@ ubsan_instrument_shift (location_t loc,
   build_int_cst (type0, 0));
   tt = fold_build2 (TRUTH_OR_EXPR, boolean_type_node, x, tt);
 }
+
+  /* In case we have a SAVE_EXPR in a conditional context, we need to
+ make sure it gets evaluated before the condition.  */
+  if (!c_dialect_cxx () && !flag_isoc99)
+t = fold_build2 (COMPOUND_EXPR, TREE_TYPE (t), op0, t);
+
   tree data = ubsan_create_data ("__ubsan_shift_data",
 loc, ubsan_type_descriptor (type0),
 ubsan_type_descriptor (type1), NULL_TREE);
--- gcc/testsuite/gcc.dg/ubsan/save-expr-1.c.mp 2013-08-05 13:14:51.057960067 
+0200
+++ gcc/testsuite/gcc.dg/ubsan/save-expr-1.c2013-08-05 13:17:44.582675494 
+0200
@@ -0,0 +1,11 @@
+/* { dg-do compile } */
+/* { dg-options "-fsanitize=shift -std=c89 -Wall -Werror -O" } */
+
+static int x;
+int
+main (void)
+{
+  int o = 1;
+  int y = x << o;
+  return y;
+}

Marek


Re: C++ coding conventions: namespaces, references and getters (was Re: [PATCH 2/2] Introduce beginnings of a pipeline class.)

2013-08-05 Thread Martin Jambor
Hi,

On Fri, Aug 02, 2013 at 11:30:01PM -0500, Gabriel Dos Reis wrote:
> [ Adding Benjamin, Diego, Lawrence ]
> 
> General remarks first:
> When we designed the coding standards for GCC, an overriding
> philosophy was that we did not want to be prescriptive.  Rather, we
> explicitly wanted to encourage common sense and trust the judgment
> of maintainers to make sound and appropriate judgments.
> 
> More than a year later, I still believe that was the right thing to do
> and I would not want us to start being prescriptive.
> 
> 
> It is a "common wisdom" among C++ programmers that using directives
> in header files generally negate the benefits of namespaces, and can
> lead to surprises.  This is because most of the time, people don't go
> scrutinize header files; they include them (based on documentation
> of declared signatures) in many translation units.

Perhaps it was a mistake of mine but I never intended to discuss using
"using" in header files.  I was basically wondering whether we should
start a C file with "using namespace gcc" instead of sprinkling the
source below with "gcc::" qualifiers.  I suppose it probably depends
on what we intend to move in that particular namespace (or into
another namespace in general but I think that we should probably
discuss this only after a great deal of Andrew's re-architecting
actually happens and we see the clearer interfaces that will hopefully
emerge).

> 
> There is one notable exception to this, which is known as
> "namespace composition":
> 
>   
> http://stackoverflow.com/questions/16871390/why-is-namespace-composition-so-rarely-used
> 
> We should probably amend the coding standards and include that hint,
> but I don't feel strongly about it.
> 
> >
> > These aren't the GCC coding conventions, but the Google conventions:
> > http://google-styleguide.googlecode.com/svn/trunk/cppguide.xml#Namespaces
> > forbid using directives of the form:
> >   using NAMESPACE;
> > but suggest using directives of the form:
> >   using NAMESPACE::NAME;
> > to pick out individual names from a namespace, though *not* in global
> > scope in a header (to avoid polluting the global namespace).
> 
> The using declarations are fine, as indicated.
> 
> >
> > I like this approach - how about using it for frequently used names in
> > a .c/.cc file, keeping the names alphabetizing by "fully qualified
> > path", so e.g.:
> >
> >   #include "foo.h"
> >   ...
> >   #include "bar.h"
> >
> >   using gcc::context;
> >   using gcc::pass_manager;
> >   ...etc, individually grabbing the names we'll be needing
> >
> >   // code follows
> >
> > and thus avoiding grabbing whole namespaces, whilst keeping the code
> > concise.
> 
> Agreed.

If that's the general feeling, I'll comply.  With only one big gcc
namespace, I do not see the point though.  Once we'll divide our
namespace, it will make sense.  (But as I wrote above, I think that
it will take and should take some time before we get there.)

> 
> […]
> 
> > Note that as per
> >   http://gcc.gnu.org/ml/gcc-patches/2013-07/msg01365.html
> > we'll use "pass_manager" rather than "pipeline", so this would look
> > like:
> >   pass_manager &get_passes () { gcc_assert (passes_); return *passes_; }
> >
> > We were chatting about C++ references on IRC on Friday, and IIRC there
> > was a strong objection to passing *arguments* that are non-const
> > references,
> 
> I hope nobody is objecting to std::swap for example.

I'm not sure how I could object to std::swap or what good would that
be.  Apart from being the standard, it is also rather exceptional due
to its nature and its name.  Despite this special example, I think the
rule of not passing arguments as non-const references is a good one,
and it seems I am not alone.  Where I may differ from others is that I
also think that getter functions should (generally) return pointers or
const references too, and for similar reasons - the caller may
unintentionally modify data that the programmers thought were on the
stack but were in fact elsewhere else and "belonged" to someplace
else.

Nevertheless, I do not want spend too much time arguing about this, it
seems I have not persuaded people that it will lead to bugs which will
be hard to debug and so let's drop this discussion and hope that I am
wrong (though I will swear a lot if I ever I stumble across such a
bug).

Martin


Re: [PATCH] Reassociate X == CST1 || X == CST2 if popcount (CST2 - CST1) == 1 into ((X - CST1) & ~(CST2 - CST1)) == 0

2013-08-05 Thread Andrew Pinski
On Mon, Aug 5, 2013 at 1:39 AM, Andrew Pinski  wrote:
> On Mon, Aug 5, 2013 at 1:08 AM, Zhenqiang Chen  wrote:
>> Hi
>>
>> The patch reassociates X == CST1 || X == CST2 if popcount (CST2 - CST1) == 1
>> into ((X - CST1) & ~(CST2 - CST1)) == 0.
>>
>> Bootstrap on x86-64 and ARM chromebook.
>> No make check regression on x86-64 and panda board.
>>
>> For some targets/options, the "(X == CST1 || X == CST2)" might be converted
>> to "if (x == CST1) else if (x == CST2)" at the beginning. In such case, the
>> patch will not be executed. It is hard to write a reliable testcase to
>> detect the transformation. So the testcase does not check the dump logs from
>> reassoc1 pass. It just checks the runtime result.
>>
>> Is it OK for trunk?
>
> Seems like a better place to put this is inside tree-ssa-ifcombine.c
> which handles the case where if(a || b) is split up into if(a) else
> if(b).
> Moving it into tree-ssa-ifcombine.c allows for code to be optimized
> which was written using the if(a) else if(b) form.
>
> Then it would easier to detect this for all targets and easier to
> remove from fold-const.c the removal of the short circuiting.

The other place where it make sense to do this optimization is the
gimple combiner (right now that is really tree-ssa-forwprop.c).
Though it would better sense if reassoc and ifcombine uses the same
functions as the combiner (see my RFC which I don't have much time to
work on right now).

Thanks,
Andrew Pinski


>
> Thanks,
> Andrew Pinski
>
>>
>> Thanks!
>> -Zhenqiang
>>
>> ChangeLog
>> 2013-08-05  Zhenqiang Chen  
>>
>> * tree-ssa-reassoc.c (optimize_range_tests): Reasociate
>> X == CST1 || X == CST2 if popcount (CST2 - CST1) == 1 into
>> ((X - CST1) & ~(CST2 - CST1)) == 0.
>>
>> testsuite/ChangeLog
>> 2013-08-05  Zhenqiang Chen  
>>
>> * gcc.dg/reassoc1.c: New testcase.


Re: New parameters to control stringop expansion libcall strategy

2013-08-05 Thread Michael V. Zolotukhin
Hi,
This is a really convenient option, thanks for working on it.
I can't approve it as I'm not a maintainer, but it looks ok to me,
except fot a small nitpicking: afair, comments should end with
dot-space-space.

Michael

On 04 Aug 20:01, Xinliang David Li wrote:
> The attached is a new patch implementing the stringop inline strategy
> control using two new -m options:
> 
> -mmemcpy-strategy=
> -mmemset-strategy=
> 
> See changes in doc/invoke.texi for description of the new options. Example:
>   
> -mmemcpy-strategy=rep_8byte:64:unaligned,unrolled_loop:2048:unaligned,libcall:-1:unaligned
> 
> tells compiler to inline memcpy using rep_8byte when the size is no
> larger than 64 byte, using unrolled_loop when size is no larger than
> 2048, and for size > 2048, using library call. In all cases,
> destination alignment adjustment is not done.
> 
> Tested on x86-64/linux. Ok for trunk?
> 
> thanks,
> 
> David
> 
> 2013-08-02  Xinliang David Li  
> 
> * config/i386/stringop.def: New file.
> * config/i386/stringop.opt: New file.
> * config/i386/i386-opts.h: Include stringopt.def.
> * config/i386/i386.opt: Include stringopt.opt.
> * config/i386/i386.c (ix86_option_override_internal):
> Override default size based stringop inline strategies
> with options.
> * config/i386/i386.c (ix86_parse_stringop_strategy_string):
> New function.
> 
> 2013-08-04  Xinliang David Li  
> 
> * testsuite/gcc.target/i386/memcpy-strategy-1.c: New test.
> * testsuite/gcc.target/i386/memcpy-strategy-2.c: Ditto.
> * testsuite/gcc.target/i386/memset-strategy-1.c: Ditto.
> * testsuite/gcc.target/i386/memcpy-strategy-3.c: Ditto.
> 
> 
> 
> 
> On Fri, Aug 2, 2013 at 9:21 PM, Xinliang David Li  wrote:
> > On x86_64, when the expected size of memcpy/memset is known (e.g, with
> > FDO), libcall strategy is used with the size is > 8192. This value is
> > hard coded, which makes it hard to do performance tuning. This patch
> > adds two new parameters to do that. Potential usage includes
> > per-application libcall strategy min-size tuning based on summary data
> > with FDO (e.g, instruction workset size).
> >
> > Bootstrap and tested on x86_64/linux. Ok for trunk?
> >
> > thanks,
> >
> > David
> >
> >
> > 2013-08-02  Xinliang David Li  
> >
> > * params.def: New parameters.
> > * config/i386/i386.c (ix86_option_override_internal):
> > Override default libcall size limit with parameters.

> Index: config/i386/stringop.def
> ===
> --- config/i386/stringop.def  (revision 0)
> +++ config/i386/stringop.def  (revision 0)
> @@ -0,0 +1,42 @@
> +/* Definitions for option handling for IA-32.
> +   Copyright (C) 2013 Free Software Foundation, Inc.
> +
> +This file is part of GCC.
> +
> +GCC is free software; you can redistribute it and/or modify
> +it under the terms of the GNU General Public License as published by
> +the Free Software Foundation; either version 3, or (at your option)
> +any later version.
> +
> +GCC is distributed in the hope that it will be useful,
> +but WITHOUT ANY WARRANTY; without even the implied warranty of
> +MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
> +GNU General Public License for more details.
> +
> +Under Section 7 of GPL version 3, you are granted additional
> +permissions described in the GCC Runtime Library Exception, version
> +3.1, as published by the Free Software Foundation.
> +
> +You should have received a copy of the GNU General Public License and
> +a copy of the GCC Runtime Library Exception along with this program;
> +see the files COPYING3 and COPYING.RUNTIME respectively.  If not, see
> +.  */
> +
> +DEF_ENUM
> +DEF_ALG (no_stringop, no_stringop)
> +DEF_ENUM
> +DEF_ALG (libcall, libcall)
> +DEF_ENUM
> +DEF_ALG (rep_prefix_1_byte, rep_byte)
> +DEF_ENUM
> +DEF_ALG (rep_prefix_4_byte, rep_4byte)
> +DEF_ENUM
> +DEF_ALG (rep_prefix_8_byte, rep_8byte)
> +DEF_ENUM
> +DEF_ALG (loop_1_byte, byte_loop)
> +DEF_ENUM
> +DEF_ALG (loop, loop)
> +DEF_ENUM
> +DEF_ALG (unrolled_loop, unrolled_loop)
> +DEF_ENUM
> +DEF_ALG (vector_loop, vector_loop)
> Index: config/i386/i386.opt
> ===
> --- config/i386/i386.opt  (revision 201458)
> +++ config/i386/i386.opt  (working copy)
> @@ -316,6 +316,14 @@ mstack-arg-probe
>  Target Report Mask(STACK_PROBE) Save
>  Enable stack probing
>  
> +mmemcpy-strategy=
> +Target RejectNegative Joined Var(ix86_tune_memcpy_strategy)
> +Specify memcpy expansion strategy when expected size is known
> +
> +mmemset-strategy=
> +Target RejectNegative Joined Var(ix86_tune_memset_strategy)
> +Specify memset expansion strategy when expected size is known
> +
>  mstringop-strategy=
>  Target RejectNegative Joined Enum(stringop_alg) Var(ix86_stringop_alg) 
> Init(no_stringop)
>  Chose strategy to generate

[testsuite, Android] Add to pr56407.c

2013-08-05 Thread Alexander Ivchenko
Hi,

The following test case fails to compile on Android: gcc.dg/torture/pr56407.c

/tmp/ccA08Isw.o:pr56407.c:function test: error: undefined reference to 'rand'
collect2: error: ld returned 1 exit status

Which is not surprising at all, since the testcase has only the
declarations of abort() and rand()
and doesn't have any headers included.

The following patch adds  to the test.
Tested on x86_64-unknown-linux-gnu and on Android. Also I checked that
the original regression appears on the fixed test.

Is it OK for trunk?

diff --git a/gcc/testsuite/ChangeLog b/gcc/testsuite/ChangeLog
index f356d55..5c23650 100644
--- a/gcc/testsuite/ChangeLog
+++ b/gcc/testsuite/ChangeLog
@@ -1,3 +1,8 @@
+2013-08-05  Alexander Ivchenko  
+
+   * gcc.dg/torture/pr56407.c: Add include of stdlib.h. Remove
+   declaration of abort() and rand().
+
 2013-08-04  Ed Smith-Rowland  <3dw...@verizon.net>

PR c++/58072
diff --git a/gcc/testsuite/gcc.dg/torture/pr56407.c
b/gcc/testsuite/gcc.dg/torture/pr56407.c
index f26fd23..5d35874 100644
--- a/gcc/testsuite/gcc.dg/torture/pr56407.c
+++ b/gcc/testsuite/gcc.dg/torture/pr56407.c
@@ -1,7 +1,6 @@
 /* { dg-do run } */

-extern void abort(void);
-extern int rand(void);
+#include 

 static void copy(int *r,int *a,int na)
 {


--Alexander


[PATCH] Correctly validate free registers for peep2

2013-08-05 Thread Richard Earnshaw
PR 57708 is a bug where peep2_find_free_register is incorrectly
returning a register that clobbers an unsaved callee saved register.
The problem is due to the way it validates register liveness: only the
first register in the list is fully validated.  In this particular case
the problem is that the second register has not been saved in the
prologue, but all of the tests except the suitability for the mode need
to be performed for each register.

*recog.c (peep2_find_free_register): Validate all regs in a
multi-reg mode.

Bootstrapped on x86_64.

Ok for trunk and 4.8?  (4.7 is also affected, but I don't know of any
back-end relies on this at that point).

R.--- recog.c (revision 201440)
+++ recog.c (local)
@@ -3124,32 +3125,53 @@ peep2_find_free_register (int from, int 
   regno = raw_regno;
 #endif
 
-  /* Don't allocate fixed registers.  */
-  if (fixed_regs[regno])
-   continue;
-  /* Don't allocate global registers.  */
-  if (global_regs[regno])
-   continue;
-  /* Make sure the register is of the right class.  */
-  if (! TEST_HARD_REG_BIT (reg_class_contents[cl], regno))
-   continue;
-  /* And can support the mode we need.  */
+  /* Can it support the mode we need?  */
   if (! HARD_REGNO_MODE_OK (regno, mode))
continue;
-  /* And that we don't create an extra save/restore.  */
-  if (! call_used_regs[regno] && ! df_regs_ever_live_p (regno))
-   continue;
-  if (! targetm.hard_regno_scratch_ok (regno))
-   continue;
-
-  /* And we don't clobber traceback for noreturn functions.  */
-  if ((regno == FRAME_POINTER_REGNUM || regno == HARD_FRAME_POINTER_REGNUM)
- && (! reload_completed || frame_pointer_needed))
-   continue;
 
   success = 1;
-  for (j = hard_regno_nregs[regno][mode] - 1; j >= 0; j--)
+  for (j = 0; success && j < hard_regno_nregs[regno][mode]; j++)
{
+ /* Don't allocate fixed registers.  */
+ if (fixed_regs[regno + j])
+   {
+ success = 0;
+ break;
+   }
+ /* Don't allocate global registers.  */
+ if (global_regs[regno + j])
+   {
+ success = 0;
+ break;
+   }
+ /* Make sure the register is of the right class.  */
+ if (! TEST_HARD_REG_BIT (reg_class_contents[cl], regno + j))
+   {
+ success = 0;
+ break;
+   }
+ /* And that we don't create an extra save/restore.  */
+ if (! call_used_regs[regno + j] && ! df_regs_ever_live_p (regno + j))
+   {
+ success = 0;
+ break;
+   }
+
+ if (! targetm.hard_regno_scratch_ok (regno + j))
+   {
+ success = 0;
+ break;
+   }
+
+ /* And we don't clobber traceback for noreturn functions.  */
+ if ((regno + j == FRAME_POINTER_REGNUM
+  || regno + j == HARD_FRAME_POINTER_REGNUM)
+ && (! reload_completed || frame_pointer_needed))
+   {
+ success = 0;
+ break;
+   }
+
  if (TEST_HARD_REG_BIT (*reg_set, regno + j)
  || TEST_HARD_REG_BIT (live, regno + j))
{
@@ -3157,6 +3179,7 @@ peep2_find_free_register (int from, int 
  break;
}
}
+
   if (success)
{
  add_to_hard_reg_set (reg_set, mode, regno);

Re: [Patch, Fortran, OOP] PR 57306: ICE on valid with class pointer initialization

2013-08-05 Thread Janus Weil
>> Looking at gfc_class_initializer, I have the impression that it does not
>> handle initialization of unlimited polymorphic variables/components. I don't
>> know whether initialization is permitted, but my feeling is that the
>> following should work:
>>
>> type t
>>   class(*), pointer :: x
>> end type t
>> type(t), target :: y
>> integer,target :: z
>> type(t) :: x = t(y)
>> type(t) :: x = t(z)
>> class(*), pointer :: a => y
>>
>> And I have the feeling that it is not correctly treated - but I might be
>> wrong. (Note to the example above: I believe "t(y)" is a valid structure
>> constructor, which is not yet supported. But again, I might be wrong about
>> either.)
>
> Your example yields (with and without the current patch):
>
> tobias2.f90:7.17:
>
> type(t) :: x = t(y)
>  1
> Error: Can't convert TYPE(t) to CLASS(*) at (1)
> tobias2.f90:8.17:
>
> type(t) :: x = t(z)
>  1
> Error: Can't convert INTEGER(4) to CLASS(*) at (1)

In fact this problem is more similar to PR 49213 than the one under
discussion here. I have added the above test case to PR 49213 as
comment 12.

Cheers,
Janus


Clean up pretty printers [6/n]

2013-08-05 Thread Gabriel Dos Reis

Same topics as from previous patch; this time for the graphiz outputter.

-- Gaby

2013-08-05  Gabriel Dos Reis  
 
* graph.c (init_graph_slim_pretty_print): Remove.
(print_graph_cfg): Do not call it.  Use local pretty printer.
(start_graph_dump): Likewise.

Index: gcc/graph.c
===
--- gcc/graph.c (revision 201481)
+++ gcc/graph.c (working copy)
@@ -56,26 +56,6 @@
   return fp;
 }
 
-/* Return a pretty-print buffer for output to file FP.  */
-
-static pretty_printer *
-init_graph_slim_pretty_print (FILE *fp)
-{
-  static bool initialized = false;
-  static pretty_printer graph_slim_pp;
-
-  if (! initialized)
-{
-  pp_construct (&graph_slim_pp, /*prefix=*/NULL, /*linewidth=*/0);
-  initialized = true;
-}
-  else
-gcc_assert (! pp_last_position_in_text (&graph_slim_pp));
-
-  graph_slim_pp.buffer->stream = fp;
-  return &graph_slim_pp;
-}
-
 /* Draw a basic block BB belonging to the function with FUNCDEF_NO
as its unique number.  */
 static void
@@ -297,7 +277,10 @@
 {
   const char *funcname = function_name (fun);
   FILE *fp = open_graph_file (base, "a");
-  pretty_printer *pp = init_graph_slim_pretty_print (fp);
+  pretty_printer graph_slim_pp;
+  pp_construct (&graph_slim_pp, /*prefix=*/NULL, /*linewidth=*/0);
+  graph_slim_pp.buffer->stream = fp;
+  pretty_printer *const pp = &graph_slim_pp;
   pp_printf (pp, "subgraph \"%s\" {\n"
 "\tcolor=\"black\";\n"
 "\tlabel=\"%s\";\n",
@@ -313,7 +296,10 @@
 static void
 start_graph_dump (FILE *fp, const char *base)
 {
-  pretty_printer *pp = init_graph_slim_pretty_print (fp);
+  pretty_printer graph_slim_pp;
+  pp_construct (&graph_slim_pp, /*prefix=*/NULL, /*linewidth=*/0);
+  graph_slim_pp.buffer->stream = fp;
+  pretty_printer *const pp = &graph_slim_pp;
   pp_string (pp, "digraph \"");
   pp_write_text_to_stream (pp);
   pp_string (pp, base);


Re: [PATCH] Reassociate X == CST1 || X == CST2 if popcount (CST2 - CST1) == 1 into ((X - CST1) & ~(CST2 - CST1)) == 0

2013-08-05 Thread Andrew Pinski
On Mon, Aug 5, 2013 at 1:08 AM, Zhenqiang Chen  wrote:
> Hi
>
> The patch reassociates X == CST1 || X == CST2 if popcount (CST2 - CST1) == 1
> into ((X - CST1) & ~(CST2 - CST1)) == 0.
>
> Bootstrap on x86-64 and ARM chromebook.
> No make check regression on x86-64 and panda board.
>
> For some targets/options, the "(X == CST1 || X == CST2)" might be converted
> to "if (x == CST1) else if (x == CST2)" at the beginning. In such case, the
> patch will not be executed. It is hard to write a reliable testcase to
> detect the transformation. So the testcase does not check the dump logs from
> reassoc1 pass. It just checks the runtime result.
>
> Is it OK for trunk?

Seems like a better place to put this is inside tree-ssa-ifcombine.c
which handles the case where if(a || b) is split up into if(a) else
if(b).
Moving it into tree-ssa-ifcombine.c allows for code to be optimized
which was written using the if(a) else if(b) form.

Then it would easier to detect this for all targets and easier to
remove from fold-const.c the removal of the short circuiting.

Thanks,
Andrew Pinski

>
> Thanks!
> -Zhenqiang
>
> ChangeLog
> 2013-08-05  Zhenqiang Chen  
>
> * tree-ssa-reassoc.c (optimize_range_tests): Reasociate
> X == CST1 || X == CST2 if popcount (CST2 - CST1) == 1 into
> ((X - CST1) & ~(CST2 - CST1)) == 0.
>
> testsuite/ChangeLog
> 2013-08-05  Zhenqiang Chen  
>
> * gcc.dg/reassoc1.c: New testcase.


[PATCH] Reassociate X == CST1 || X == CST2 if popcount (CST2 - CST1) == 1 into ((X - CST1) & ~(CST2 - CST1)) == 0

2013-08-05 Thread Zhenqiang Chen
Hi

The patch reassociates X == CST1 || X == CST2 if popcount (CST2 - CST1) == 1
into ((X - CST1) & ~(CST2 - CST1)) == 0.

Bootstrap on x86-64 and ARM chromebook.
No make check regression on x86-64 and panda board.

For some targets/options, the "(X == CST1 || X == CST2)" might be converted
to "if (x == CST1) else if (x == CST2)" at the beginning. In such case, the
patch will not be executed. It is hard to write a reliable testcase to
detect the transformation. So the testcase does not check the dump logs from
reassoc1 pass. It just checks the runtime result.

Is it OK for trunk?

Thanks!
-Zhenqiang

ChangeLog
2013-08-05  Zhenqiang Chen  

* tree-ssa-reassoc.c (optimize_range_tests): Reasociate
X == CST1 || X == CST2 if popcount (CST2 - CST1) == 1 into
((X - CST1) & ~(CST2 - CST1)) == 0.

testsuite/ChangeLog
2013-08-05  Zhenqiang Chen  

* gcc.dg/reassoc1.c: New testcase.

reassociate.patch
Description: Binary data


Clean up printer printers [5/n]

2013-08-05 Thread Gabriel Dos Reis

This patch stops the gimple printer from using global pretty printers.  

Applied to trunk.

-- Gaby

2013-08-05  Gabriel Dos Reis  
 
* gimple-pretty-print.c (buffer): Remove.
(initialized): Likewise.
(maybe_init_pretty_print): Likewise.
(print_gimple_stmt): Do not call it.  Use non-static local
pretty_printer variable.
(print_gimple_expr): Likewise.
(print_gimple_seq): Likewise.
(gimple_dump_bb): Likewise.

Index: gimple-pretty-print.c
===
--- gimple-pretty-print.c   (revision 201481)
+++ gimple-pretty-print.c   (working copy)
@@ -36,9 +36,6 @@
 #define INDENT(SPACE)  \
   do { int i; for (i = 0; i < SPACE; i++) pp_space (buffer); } while (0)
 
-static pretty_printer buffer;
-static bool initialized = false;
-
 #define GIMPLE_NIY do_niy (buffer,gs)
 
 /* Try to print on BUFFER a default message for the unrecognized
@@ -52,22 +49,6 @@
 }
 
 
-/* Initialize the pretty printer on FILE if needed.  */
-
-static void
-maybe_init_pretty_print (FILE *file)
-{
-  if (!initialized)
-{
-  pp_construct (&buffer, NULL, 0);
-  pp_needs_newline (&buffer) = true;
-  initialized = true;
-}
-
-  buffer.buffer->stream = file;
-}
-
-
 /* Emit a newline and SPC indentation spaces to BUFFER.  */
 
 static void
@@ -93,7 +74,10 @@
 void
 print_gimple_stmt (FILE *file, gimple g, int spc, int flags)
 {
-  maybe_init_pretty_print (file);
+  pretty_printer buffer;
+  pp_construct (&buffer, NULL, 0);
+  pp_needs_newline (&buffer) = true;
+  buffer.buffer->stream = file;
   pp_gimple_stmt_1 (&buffer, g, spc, flags);
   pp_newline_and_flush (&buffer);
 }
@@ -122,7 +106,10 @@
 print_gimple_expr (FILE *file, gimple g, int spc, int flags)
 {
   flags |= TDF_RHS_ONLY;
-  maybe_init_pretty_print (file);
+  pretty_printer buffer;
+  pp_construct (&buffer, NULL, 0);
+  pp_needs_newline (&buffer) = true;
+  buffer.buffer->stream = file;
   pp_gimple_stmt_1 (&buffer, g, spc, flags);
   pp_flush (&buffer);
 }
@@ -155,7 +142,10 @@
 void
 print_gimple_seq (FILE *file, gimple_seq seq, int spc, int flags)
 {
-  maybe_init_pretty_print (file);
+  pretty_printer buffer;
+  pp_construct (&buffer, NULL, 0);
+  pp_needs_newline (&buffer) = true;
+  buffer.buffer->stream = file;
   dump_gimple_seq (&buffer, seq, spc, flags);
   pp_newline_and_flush (&buffer);
 }
@@ -2279,7 +2269,10 @@
   dump_gimple_bb_header (file, bb, indent, flags);
   if (bb->index >= NUM_FIXED_BLOCKS)
 {
-  maybe_init_pretty_print (file);
+  pretty_printer buffer;
+  pp_construct (&buffer, NULL, 0);
+  pp_needs_newline (&buffer) = true;
+  buffer.buffer->stream = file;
   gimple_dump_bb_buff (&buffer, bb, indent, flags);
 }
   dump_gimple_bb_footer (file, bb, indent, flags);


Re: New branch: ubsan

2013-08-05 Thread Gerald Pfeifer
On Mon, 5 Aug 2013, Marek Polacek wrote:
> Sure, does this patch look ok?

Looks good, thanks!

Gerald


Re: [SPARC] Remove superfluous memory barrier for atomics with TSO

2013-08-05 Thread David Miller
From: Richard Henderson 
Date: Tue, 30 Jul 2013 12:33:30 -1000

> On 07/30/2013 03:31 AM, Eric Botcazou wrote:
>> 2013-07-30  Eric Botcazou  
>> 
>>  * config/sparc/sparc.c (sparc_emit_membar_for_model) : Add
>>  the implied StoreLoad barrier for atomic operations if before.
> 
> Looks good.

This looks fine to me too.