from:"Steven Bosscher"

Re: [00/32] Support multiple ABIs in the same translation unit

2019-09-12 Thread Steven Bosscher

On Wednesday, September 11, 2019, Richard Sandiford <
richard.sandif...@arm.com> wrote:.
>
>
> Sorry for the long write-up.
>
> Richard
>

*thanks* for the long write-up!

Ciao!
Steven

Re: [PATCH] rs6000: Make CSE'ing __tls_get_addr calls possible

2019-03-24 Thread Steven Bosscher

On Sun, Mar 24, 2019 at 12:46 AM Segher Boessenkool wrote:
>
> CSE does not consider calls, not even const calls.  This patch puts a
> REG_EQUAL note on the pseudo we assign the __tls_get_addr result to,
> so that those pseudos can be CSE'd and the extra calls deleted as dead
> code.

Hi Segher,

There were REG_EQUAL notes on these tls calls in the past, but I
recall removing them for one reason or another. So watch out for
fall-out from this patch! ;-)

Ciao!
Steven

Re: [PATCH] Fix PR89150, GC of tree-form bitmaps

2019-02-04 Thread Steven Bosscher

On Mon, Feb 4, 2019 at 2:16 PM Richard Biener  wrote:
> When I introduced tree-form bitmaps I forgot to think about GC.
> The following drops the chain_prev annotation to make the marker
> work for trees.

I don't understand this patch. How are the nodes in a bitmap tree now
to be found for marking?

This patch should cause an ICE on test cases with bitmaps as trees
with ENABLE_GC_ALWAYS_COLLECT.

Ciao!
Steven

Re: [PATCH] Remove a barrier when EDGE_CROSSING is remoed (PR lto/88858).

2019-02-04 Thread Steven Bosscher

On Mon, Feb 4, 2019 at 9:10 AM Martin Liška wrote:
>
> @Honza: PING^1
> >>> + else
> >>> +   {
> >>> + if (PREV_INSN (insn))
> >>> +   SET_NEXT_INSN (PREV_INSN (insn)) = NEXT_INSN (insn);
> >>> + if (NEXT_INSN (insn))
> >>> +   SET_PREV_INSN (NEXT_INSN (insn)) = PREV_INSN (insn);
> >>> +   }
> >>> +   }

This looks like remove_insn().

Ciao!
steven

Re: [PATCH] sched-ebb.c: avoid moving table jumps (PR rtl-optimization/88423)

2019-01-22 Thread Steven Bosscher

On Fri, Jan 18, 2019 at 5:43 PM David Malcolm wrote:
> (1) What does GCC mean by "ebb" in this context?

In this context, the EBB is what Muchnick would call a trace.

Ciao!
Steven

Re: [PATCH] Add splay-tree "view" for bitmap

2018-10-19 Thread Steven Bosscher

On Fri, Oct 19, 2018 at 8:46 AM Richard Biener <> wrote:
> Yeah. I also noticed some 'obvious' shortcomings in the heuristics...
> I guess in the end well predicted branches in the out of line code are 
> important...

What also would help is to put bitmaps on their own obstack to improve
cache locality.

As for the patch, I never hacked it with "production code" in mind, it
was just a proof of concept. Not all of it is optimal or even safe
as-is. For example you probably should add
"gcc_checking_assert(!(BITMAP)->tree-form)" tests in the
bmp_iter_*_init functions. And perhaps semi-splaying trees work better
for the use cases of GCC (x.f. "Rehabilitation of an unloved child:
semi-splaying"). I implemented classic splay trees because I could not
find a semi-splay tree implementation in any of the usual text books
while classic splay tree implementations were given in all of those
books ;-)

Ciao!
Steven

Re: [PATCH][RFC] Fix PR63155 (some more)

2018-09-19 Thread Steven Bosscher

On Wed, Sep 19, 2018 at 3:06 PM Richard Biener wrote:
> If we'd only had a O(log n) search sparse bitmap implementation ...
> (Steven posted patches to switch bitmap from/to such one but IIRC
> that at least lacked bitmap_first_set_bit).

But bitmap_first_set_bit would be easy to implement. Just take the
left-most node of the splay tree.

Actually all bit-tests would be easy to implement. It's only
enumeration and set operations on the tree-views that would be
complicated fluff (easier to "switch views" than re-implement).

Impressive that you remember >5yr old patches like that ;-)

Ciao!
Steven

Re: [PATCH] Remove verify_no_unreachable_blocks

2018-08-23 Thread Steven Bosscher

On Thu, Aug 23, 2018 at 1:18 PM Richard Biener <> wrote:
> -/* Verify that there are no unreachable blocks in the current function.  */
> -
> -void
> -verify_no_unreachable_blocks (void)
> -{
> -  find_unreachable_blocks ();
> -
> -  basic_block bb;
> -  FOR_EACH_BB_FN (bb, cfun)
> -gcc_assert ((bb->flags & BB_REACHABLE) != 0);

Alternatively, just clear BB_REACHABLE here?

  FOR_EACH_BB_FN (bb, cfun)
{
  gcc_assert ((bb->flags & BB_REACHABLE) != 0);
  bb->flags &= ~BB_REACHABLE;
}

The function is quite useful for debugging, I wouldn't remove it.

Ciao!
Steven

Re: [PATCH 4/4] rs6000: Delete old add+cmp patterns

2018-08-17 Thread Steven Bosscher

On Thu, Aug 16, 2018 at 7:14 PM, Segher Boessenkool <> wrote:
> There are some patterns that recognise the parallel of an add and a
> compare, and split it back to the same two insns.  This apparently
> helped RIOS machines before RTL scheduling existed?  Either way, it
> isn't helpful anymore, and even hurts a tiny bit.  So, delete it.

It's been there since r251 (initial revision).

On the topic of archaeology: Perhaps the time also finally has come to
revisit this comment in rs6000.md:

13065; FIXME: This would probably be somewhat simpler if the Cygnus sibcall
13066; stuff was in GCC. <...>

:-)

Ciao!
Steven

Re: [PATCH 2/4] Switch other switch expansion methods into classes.

2018-06-20 Thread Steven Bosscher

On Tue, Jun 12, 2018 at 10:44 PM, Jeff Law wrote:
> On 06/05/2018 01:15 AM, marxin wrote:
>>
>> + The definition of "much bigger" depends on whether we are
>> + optimizing for size or for speed.  If the former, the maximum
>> + ratio range/count = 3, because this was found to be the optimal
>> + ratio for size on i686-pc-linux-gnu, see PR11823.  The ratio
>> + 10 is much older, and was probably selected after an extensive
>> + benchmarking investigation on numerous platforms.  Or maybe it
>> + just made sense to someone at some point in the history of GCC,
>> + who knows...  */
> "much older" is an understatement.  I believe the magic "10" pre-dates
> my involvement in GCC.  You can find evidence of it as far back as
> gcc-0.9.  I doubt it was extensively benchmarked, and even if it was,
> the targets on which it was benchmarked don't reflect modern target
> reality in terms of importance.

When I added this comment
(https://gcc.gnu.org/viewcvs/gcc/trunk/gcc/stmt.c?r1=189284=189285;)
it as an attempt at humor. I should have turned the number into a
PARAM at the time. Maybe that's something Martin could still do now?

Ciao!
Steven

Re: [PATCH][RFC] Make expansion of balanced binary trees of switches on tree level.

2017-08-03 Thread Steven Bosscher

On Thu, Aug 3, 2017 at 2:56 PM, Richard Biener wrote:
> I think the main reason for not doing it early is the benefit is small
> (unless it is GIMPLE optimizations triggering)

Agree.

> and we can't get rid of
> switches completely because we eventually have to support casei RTL expansion.
> (and no, computed goto with an array of label addresses at GIMPLE is really
> too ugly ;))

What I would have done, is lower all gswitch statements that are to be
lowered to something other than a tablejump.
So by the time you get to RTL expansion, all remaining gswitch
statements would be tablejump or casesi.

Ciao!
Steven

Re: [PATCH][RFC] Make expansion of balanced binary trees of switches on tree level.

2017-08-03 Thread Steven Bosscher

On Wed, Aug 2, 2017 at 1:20 PM, Martin Liška wrote:
> Hello.
>
> After some discussions with Honza, I've decided to convert current code in 
> stmt.c that
> is responsible for switch expansion. More precisely, I would like to convert 
> the code
> to expand gswitch statements on tree level. Currently the newly created pass 
> is executed
> at the end of tree optimizations.

Hah, something I promissed myself (and others) to do years ago! Thanks
thanks thanks! :-)

The end of the gimple optimizations is seems late for the lowering.
Before there were gimple optimizations, switch lowering was part of
the first compiler pass (to generate RTL from the front end) *before*
all code transformation passes ("optimizations" ;-). Because the
lowering of switch statements was rewritten as a gimple lowering pass,
it now runs *after* optimizations. But to be honest, I can't think of
any optimization opportunities exposed by lowering earlier than at the
end of gimple optimizations. Years ago there was some RTL jump
threading still done after lowering of small switch statements, but I
can't find the related PRs anymore.

> My plan for future is to inspire in [1] and come up with some more 
> sophisticated switch
> expansions. For that I've been working on a paper where I'll summarize 
> statistics based
> on what I've collected in openSUSE distribution with specially instrumented 
> GCC. If I'll be
> happy I can also fit in to schedule of this year's Cauldron with a talk.

Sayle's paper is a good starting point. Also interesting:

http://llvm.org/devmtg/2017-02-04/Efficient-clustering-of-case-statements-for-indirect-branch-prediction.pdf
About adjusting the size of jump tables to the capabilities of the CPU
branch predictor. Mixed results but something to consider.

Kannan & Proebsting, "CORRECTION TO ‘PRODUCING GOOD CODE FOR THE CASE
STATEMENT’"
About finding "clusers" of given density within the target values of
the switch statement. The clustering algorithm as presented is N^2 in
the number of case labels, but the idea is interesting and a
simplified, cheaper approach may be possible. This is already used in
LLVM, it seems.

The real challenge will be to figure out how to pick from all these
different approaches the right ones to lower a single switch
statement.

Ciao!
Steven

Re: [PATCH][RFC] Make expansion of balanced binary trees of switches on tree level.

2017-08-03 Thread Steven Bosscher

On Wed, Aug 2, 2017 at 1:51 PM, Richard Biener wrote:
> On Wed, Aug 2, 2017 at 1:20 PM, Martin Liška wrote:
>> Hello.
>>
>> After some discussions with Honza, I've decided to convert current code in 
>> stmt.c that
>> is responsible for switch expansion. More precisely, I would like to convert 
>> the code
>> to expand gswitch statements on tree level. Currently the newly created pass 
>> is executed
>> at the end of tree optimizations.

Hah, something I promissed myself (and others) to do years ago! Thanks
thanks thanks! :-)


>> My plan for future is to inspire in [1] and come up with some more 
>> sophisticated switch
>> expansions. For that I've been working on a paper where I'll summarize 
>> statistics based
>> on what I've collected in openSUSE distribution with specially instrumented 
>> GCC. If I'll be
>> happy I can also fit in to schedule of this year's Cauldron with a talk.

Sayle's paper is a good starting point. Also interesting:



>> Patch can bootstrap on ppc64le-redhat-linux and survives regression tests.
>>
>> Thoughts?
>
> First of all thanks.
>
> I think part of switch expansion moved to switch-conversion some time ago
> (emit_case_bit_tests).  So maybe the full lowering should be in at least
> the same source file and it should maybe applied earlier for a subset of
> cases (very low number of cases for example).
>
> Did you base the code on the RTL expansion code or did you re-write it from
> scratch?
>
> No further comments sofar.
>
> Richard.
>
>> Martin
>>
>> [1] https://www.nextmovesoftware.com/technology/SwitchOptimization.pdf
>>
>> gcc/ChangeLog:
>>
>> 2017-07-31  Martin Liska  
>>
>> * Makefile.in: Add gimple-switch-low.o.
>> * passes.def: Include pass_lower_switch.
>> * stmt.c (dump_case_nodes): Remove and move to
>> gimple-switch-low.c.
>> (case_values_threshold): Likewise.
>> (expand_switch_as_decision_tree_p): Likewise.
>> (emit_case_decision_tree): Likewise.
>> (expand_case): Likewise.
>> (balance_case_nodes): Likewise.
>> (node_has_low_bound): Likewise.
>> (node_has_high_bound): Likewise.
>> (node_is_bounded): Likewise.
>> (emit_case_nodes): Likewise.
>> * timevar.def: Add TV_TREE_SWITCH_LOWERING.
>> * tree-pass.h: Add make_pass_lower_switch.
>> * gimple-switch-low.c: New file.
>>
>> gcc/testsuite/ChangeLog:
>>
>> 2017-07-31  Martin Liska  
>>
>> * gcc.dg/tree-prof/update-loopch.c: Scan patterns in
>> switchlower.
>> * gcc.dg/tree-ssa/vrp104.c: Likewise.
>> ---
>>  gcc/Makefile.in|1 +
>>  gcc/gimple-switch-low.c| 1226 
>> 
>>  gcc/passes.def |1 +
>>  gcc/stmt.c |  861 -
>>  gcc/testsuite/gcc.dg/tree-prof/update-loopch.c |   10 +-
>>  gcc/testsuite/gcc.dg/tree-ssa/vrp104.c |2 +-
>>  gcc/timevar.def|1 +
>>  gcc/tree-pass.h|1 +
>>  8 files changed, 1236 insertions(+), 867 deletions(-)
>>  create mode 100644 gcc/gimple-switch-low.c
>>
>>

Re: [PATCH] Fix PR middle-end/81564: ICE in group_case_labels_stmt()

2017-07-27 Thread Steven Bosscher

On Wed, Jul 26, 2017 at 9:35 PM, Peter Bergner wrote:
> The test case for PR81564 exposes an issue where the case labels for a
> switch statement point to blocks that have already been removed by an
> earlier call to cleanup_tree_cfg().  In that case, the code in
> group_case_labels_stmt() that does:

How can a basic block be removed (apparently as unreachable) if there
are still case labels leading to it?

Apparently there is enough information to make CASE_LABEL be set to
NULL. Why is the case label not just removed (or redirected to the
default, or ...)?

The patch feels like it's papering over another issue.
group_case_labels is an optional thing to do, basically just a
simplification. The compiler should run even if you never group the
case labels...

Ciao!
Steven

Re: [PATCH][PR 59521] Respect probabilities when expanding switch statement

2017-07-20 Thread Steven Bosscher

On Tue, Jul 18, 2017 at 9:04 AM, Yuri Gribov wrote:
> Hi all,
>
> Currently all cases in switch statement are treated as having equal
> probabilities which causes suboptimal code as demonstrated in
> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=59521 . This patch
> modifies expander to select pivot point for decision tree so that
> probabilities of cases on the left are roughly equal to probabilities
> on the right.
>
> Patch survives bootstrap and regtesting on x64 but has some issues:
> * tests are fragile but I'm not sure how to make them better
> * I haven't done any performance measurements - would these be needed?
> I don't have access to SPEC these days, any other suggestions?
>
> Patch is jointly authored with Martin.

Hi Yuri,

Can you come up with test cases that don't scan the assembly output?
Ideally the test case should check a dump file that is as close as
possible to the code transformation, in this case the dump from
pass_expand.

Ciao!
Steven

Re: [PATCH] Improve RTL ifcvt for empty else_bb (PR rtl-optimization/80491)

2017-04-29 Thread Steven Bosscher

On Wed, Apr 26, 2017 at 1:25 PM, Jakub Jelinek wrote:
> Or shall we just:
> --- gcc/alias.c 2017-04-25 15:51:31.072923325 +0200
> +++ gcc/alias.c 2017-04-26 13:23:55.595048464 +0200
> @@ -3221,6 +3221,8 @@ memory_modified_in_insn_p (const_rtx mem
>  {
>if (!INSN_P (insn))
>  return false;
> +  if (CALL_P (insn))
> +return true;

+"&& ! RTL_CONST_OR_PURE_CALL_P (insn)" ?

Ciao!
Steven




>memory_modified = false;
>note_stores (PATTERN (insn), memory_modified_1, CONST_CAST_RTX(mem));
>return memory_modified;
>
> instead of the call_crossed hacks?  Then modified_between_p and
> modified_in_p would return true for !MEM_READONLY_P MEMs crossing a call.
>
> Jakub

Re: [PATCH] Reenable RTL sharing verification

2016-11-30 Thread Steven Bosscher

On Wed, Nov 30, 2016 at 1:08 PM, Jakub Jelinek wrote:
> Hi!
>
> The http://gcc.gnu.org/ml/gcc-patches/2013-04/msg01055.html
> change broke all RTL sharing verification, even with --enable-checking=rtl
> we don't verify anything for the last 3.5 years.

Eh, I guess "oops!" doesn't quite cover that error. Sorry!

Ciao!
Steven

Re: [PATCH] bb-reorder: Improve compgotos pass (PR71785)

2016-11-02 Thread Steven Bosscher

 On Wed, Nov 2, 2016 at 10:02 AM, Richard Biener
<richard.guent...@gmail.com> wrote:
> On Mon, Oct 31, 2016 at 4:35 PM, Segher Boessenkool wrote:
>> On Mon, Oct 31, 2016 at 04:09:48PM +0100, Steven Bosscher wrote:
>>> On Sun, Oct 30, 2016 at 8:10 PM, Segher Boessenkool wrote:
>>> > +  cfg_layout_finalize ();
>>>
>>> I don't think you have to go into/out-of cfglayout mode for each iteration.
>>
>> Yeah probably.  I was too lazy :-)  It needs the cleanup_cfg every
>> iteration though, I was afraid that interacts.
>
> Ick.  Why would it need a cfg-cleanup every iteration?

I don't believe it needs a cleanup on every iteration. One cleanup at
the end should work fine.

Ciao!
Steven

Re: [PATCH] bb-reorder: Improve compgotos pass (PR71785)

2016-10-31 Thread Steven Bosscher

On Sun, Oct 30, 2016 at 8:10 PM, Segher Boessenkool wrote:
> This patch solves this problem by simply running the duplicate_computed_gotos
> pass again, as long as it does any work.  The patch looks much bigger than
> it is, because I factored out two routines to simplify the control flow.

It's made the patch a bit difficult to read. Condensing it a bit...

> +  for (;;)
>  {
> +  if (n_basic_blocks_for_fn (fun) <= NUM_FIXED_BLOCKS + 1)
> +   return 0;

This test should not be needed in the loop. This pass can never
collapse the function to a single basic block.


> +  clear_bb_flags ();
> +  cfg_layout_initialize (0);

See comment below...


> +  basic_block bb;
> +  FOR_EACH_BB_FN (bb, fun)
> +   {
> + /* Build the reorder chain for the original order of blocks.  */
> + if (bb->next_bb != EXIT_BLOCK_PTR_FOR_FN (fun))
> +   bb->aux = bb->next_bb;
> +   }
>
> +  duplicate_computed_gotos_find_candidates (fun, candidates, max_size);
>
> +  bool changed = false;
> +  if (!bitmap_empty_p (candidates))
> +   changed = duplicate_computed_gotos_do_duplicate (fun, candidates);
>
> +  if (changed)
> +   fixup_partitions ();
> +
> +  cfg_layout_finalize ();

I don't think you have to go into/out-of cfglayout mode for each iteration.


>/* Merge the duplicated blocks into predecessors, when possible.  */
> +  if (changed)
> +   cleanup_cfg (0);
> +  else
> +   break;
>  }

Maybe a gcc_assert that the loop doesn't iterate more often than num_edges?

Ciao!
Steven

Re: [PATCH, libfortran] PR 48587 Newunit allocator

2016-10-18 Thread Steven Bosscher

On Thu, Oct 13, 2016 at 5:16 PM, Janne Blomqvist wrote:
> +static bool *newunits;

You could make this a bitmap (like sbitmap). A bit more code but makes
a potentially quadratic search (when opening many units) less time
consuming.

Ciao!
Steven

Re: [PATCH] df: Keep return address register undefined until epilogue_completed

2016-08-29 Thread Steven Bosscher

On Mon, Aug 29, 2016 at 6:50 PM, Segher Boessenkool wrote:
> This patch changes that so that that def is only added after
> epilogue_completed.
...
> Does this work on all other targets?

Guessing: not for targets where INCOMING_RETURN_ADDR_RTX is the stack register?
That'd be at least h8300, rl78, and rx.

Ciao!
Steven

Re: Init df for split pass since some target use REG_NOTE in split pattern

2016-08-09 Thread Steven Bosscher

On Mon, Aug 8, 2016 at 9:45 PM, Jeff Law wrote:

>
> I'm pretty sure we do _not_ want it for split_all_insns_noflow since that's
> used when the CFG is not necessarily correct and thus I don't see how we can
> reliably have death/unused notes.

Actually, trying to initializing DF without a valid CFG will most
likely just crash the compiler. DF doesn't work without a valid CFG.

I expect the patch to fail for at least MIPS, SPARC, and ARM.

Ciao!
Steven

Re: [PATCH] Add code-hoisting to GIMPLE

2016-07-04 Thread Steven Bosscher

On Mon, Jul 4, 2016 at 1:26 PM, Richard Biener wrote:
>
> The following patch is Stevens code-hoisting based on PRE forward-ported
> and fixed for bootstrap plus the case of hoisting code across loops
> which we generally do not want (expressions in the loop exit target block
> are antic-in throughout the whole loop unless they are killed and thus
> get inserted into the exit block and then PREd before the loop).
>
> Bootstrapped on x86_64-unknown-linux-gnu, testing in progress.
>
> I'm going to try making the bitmap_set ops in do_hoist_insert a bit
> faster - Steven, do you remember any issues with the approach from the
> time you worked on it?

Hi Richi,

It's been almost 8 years since I worked on this, so I really don't
recall much about this at all. Sorry :-)

But thank you for picking this old work up!

Ciao!
Steven

Re: [PATCH 1/3] cfgcleanup: Bugfix in try_simplify_condjump

2016-05-03 Thread Steven Bosscher

On Tue, May 3, 2016 at 8:59 AM, Segher Boessenkool wrote:
> If the jump_block here contains just a return, we will crash later
> in invert_jump.  Don't allow that case.

But if jump_block contains a return, then it isn't the EXIT_BLOCK for
the function.
Is the conditional jump a conditional return insn?

Ciao!
Steven

[patch] cleanups for vtable-verify

2016-05-01 Thread Steven Bosscher

Hello,

This patch is random cleanups on the vtable-verify code.
OK for trunk?

Ciao!
Steven

gcc/
  * vtable-verify.h (verify_vtbl_ptr_fndecl): Add GTY markers.
  (num_vtable_map_nodes): Remove extern declaration.
  (vtbl_mangled_name_types, vtbl_mangled_name_ids): Likewise.
  * vtable-verify.c (num_vtable_map_nodes): Make static.
  (vtbl_mangled_name_types, vtbl_mangled_name_ids): Likewise.
  (verify_vtbl_ptr_fndecl): Remove redundant extern declaration.

cp/
  * vtable-class-hierarchy.c (vtable_find_or_create_map_decl):
  Make static.
  (vtv_compute_class_hierarchy_transitive_closure): Eliminate uses of
  num_vtable_map_nodes in lieu of vtbl_map_nodes_vec.length() and of
  vtbl_map_nodes_vec.iterate().
  (guess_num_vtable_pointers, register_all_pairs,
  write_out_vtv_count_data, vtv_register_class_hierarchy_information,
  vtv_generate_init_routine): Likewise.
gcc/
* vtable-verify.h (verify_vtbl_ptr_fndecl): Add GTY markers.
(num_vtable_map_nodes): Remove extern declaration.
(vtbl_mangled_name_types, vtbl_mangled_name_ids): Likewise.
* vtable-verify.c (num_vtable_map_nodes): Make static.
(vtbl_mangled_name_types, vtbl_mangled_name_ids): Likewise.
(verify_vtbl_ptr_fndecl): Remove redundant extern declaration.

cp/
* vtable-class-hierarchy.c (vtable_find_or_create_map_decl):
Make static.
(vtv_compute_class_hierarchy_transitive_closure): Eliminate uses of
num_vtable_map_nodes in lieu of vtbl_map_nodes_vec.length() and of
vtbl_map_nodes_vec.iterate().
(guess_num_vtable_pointers, register_all_pairs,
write_out_vtv_count_data, vtv_register_class_hierarchy_information,
vtv_generate_init_routine): Likewise.


Index: vtable-verify.h
===
--- vtable-verify.h (revision 235623)
+++ vtable-verify.h (working copy)
@@ -27,12 +27,8 @@ along with GCC; see the file COPYING3.  If not see
be global because it needs to be initialized in the C++ front end, but
used in the middle end (in the vtable verification pass).  */
 
-extern tree verify_vtbl_ptr_fndecl;
+extern GTY(()) tree verify_vtbl_ptr_fndecl;
 
-/* Global variable keeping track of how many vtable map variables we
-   have created. */
-extern unsigned num_vtable_map_nodes;
-
 /* Keep track of how many virtual calls we are actually verifying.  */
 extern int total_num_virtual_calls;
 extern int total_num_verified_vcalls;
@@ -127,10 +123,6 @@ extern bool vtv_debug;
 /* The global vector of vtbl_map_nodes.  */
 extern vec vtbl_map_nodes_vec;
 
-/*  The global vectors for mangled class names for anonymous classes.  */
-extern GTY(()) vec *vtbl_mangled_name_types;
-extern GTY(()) vec *vtbl_mangled_name_ids;
-
 extern void vtbl_register_mangled_name (tree, tree);
 extern struct vtbl_map_node *vtbl_map_get_node (tree);
 extern struct vtbl_map_node *find_or_create_vtbl_map_node (tree);
Index: vtable-verify.c
===
--- vtable-verify.c (revision 235623)
+++ vtable-verify.c (working copy)
@@ -144,11 +144,13 @@ along with GCC; see the file COPYING3.  If not see
 
 #include "vtable-verify.h"
 
-unsigned num_vtable_map_nodes = 0;
+/* Global variable keeping track of how many vtable map variables we
+   have created. */
+static unsigned num_vtable_map_nodes = 0;
+
 int total_num_virtual_calls = 0;
 int total_num_verified_vcalls = 0;
 
-extern GTY(()) tree verify_vtbl_ptr_fndecl;
 tree verify_vtbl_ptr_fndecl = NULL_TREE;
 
 /* Keep track of whether or not any virtual call were verified.  */
@@ -305,10 +307,8 @@ static vtbl_map_table_type *vtbl_map_hash;
 vec vtbl_map_nodes_vec;
 
 /* Vector of mangled names for anonymous classes.  */
-extern GTY(()) vec *vtbl_mangled_name_types;
-extern GTY(()) vec *vtbl_mangled_name_ids;
-vec *vtbl_mangled_name_types;
-vec *vtbl_mangled_name_ids;
+static GTY(()) vec *vtbl_mangled_name_types;
+static GTY(()) vec *vtbl_mangled_name_ids;
 
 /* Look up class_type (a type decl for record types) in the 
vtbl_mangled_names_*
vectors.  This is a linear lookup.  Return the associated mangled name for
Index: cp/vtable-class-hierarchy.c
===
--- cp/vtable-class-hierarchy.c (revision 235623)
+++ cp/vtable-class-hierarchy.c (working copy)
@@ -137,8 +137,6 @@ struct work_node {
   struct work_node *next;
 };
 
-struct vtbl_map_node *vtable_find_or_create_map_decl (tree);
-
 /* As part of vtable verification the compiler generates and inserts
calls to __VLTVerifyVtablePointer, which is in libstdc++.  This
function builds and initializes the function decl that is used
@@ -380,7 +378,7 @@ void
 vtv_compute_class_hierarchy_transitive_closure (void)
 {
   struct work_node

Re: [PATCH] Ensure noce_convert_multiple_sets handles only multiple sets (PR rtl-optimization/69570)

2016-02-01 Thread Steven Bosscher

On Mon, Feb 1, 2016 at 9:32 AM, Jakub Jelinek  wrote:
> Bootstrapped/regtested on 
> {x86_64,i686,ppc64,ppc64le,s390,s390x,aarch64}-linux,
> ok for trunk?

OK.

Browny points for opting out of the loop over all insns in the basic
block when count > limit.

Ciao!
Steven

Re: Try to update dominance info in tree-call-cdce.c

2015-10-31 Thread Steven Bosscher

On Fri, Oct 30, 2015 at 2:11 PM, Richard Sandiford
 wrote:
> Is the split_block change really so bad?

IMHO: Yes.

split_block just splits some basic block B into two blocks B1/B2
somewhere in the middle of B. The dominance relations between B1 and
B2 are obvious and intuitive. The new flag to split_block is not. The
dominance info is not updated but the loops are? Confusing, if you ask
me...

Your situation, where a pass knows the dominance relations will
change, is unusual. The extra work to fix up the ET tree twice should
be negligible anyway.

Ciao!
Steven

Re: [PATCH 1/4] bb-reorder: Split out STC

2015-09-24 Thread Steven Bosscher

On Thu, Sep 24, 2015 at 12:06 AM, Segher Boessenkool wrote:
> This first patch simply factors code a little bit.
>
>
> 2015-09-23   Segher Boessenkool  
>
> * bb-reorder.c (reorder_basic_blocks_software_trace_cache): New
> function, factored out from ...
> (reorder_basic_blocks): ... here.

OK.

Ciao!
Steven

Re: [PATCH 2/4] bb-reorder: Add the "simple" algorithm

2015-09-24 Thread Steven Bosscher

On Thu, Sep 24, 2015 at 12:06 AM, Segher Boessenkool wrote:
> +  /* First, collect all edges that can be optimized by reordering blocks:
> + simple jumps and conditional jumps, as well as the function entry edge. 
>  */
> +
> +  int n = 0;
> +  edges[n++] = EDGE_SUCC (ENTRY_BLOCK_PTR_FOR_FN (cfun), 0);
> +
> +  basic_block bb;
> +  FOR_EACH_BB_FN (bb, cfun)
> +{
> +  rtx_insn *end = BB_END (bb);
> +
> +  if (computed_jump_p (end) || tablejump_p (end, NULL, NULL))
> +   continue;

Should handle ASM jumps.

> +  FOR_ALL_BB_FN (bb, cfun)
> +bb->aux = bb;

Bit tricky for the ENTRY and EXIT blocks, that are not really basic
blocks. After the pass, EXIT should not end up pointing to itself.
Maybe use FOR_EACH_BB_FN and set ENTRY separately?

Other than that, looks good to me.

Ciao!
Steven

Re: [PATCH][RTL-ifcvt] Make non-conditional execution if-conversion more aggressive

2015-08-09 Thread Steven Bosscher

On Fri, Jul 31, 2015 at 7:26 PM, Jeff Law l...@redhat.com wrote:

 So there's a tight relationship between the implementation of
 bbs_ok_for_cmove_arith and insn_valid_noce_process_p.  If there wasn't, then
 we'd probably be looking to use note_stores and note_uses.

Perhaps I'm missing something, but what is wrong with using DF here
instead of note_stores/note_uses? All the info on refs/mods of
registers is available in the DF caches.

Ciao!
Steven

Re: [PATCH] Fix PR66168: ICE due to incorrect invariant register info

2015-05-19 Thread Steven Bosscher

On Tue, May 19, 2015 at 12:17 PM, Thomas Preud'homme wrote:
 2015-05-18  Thomas Preud'homme

 PR rtl-optimization/66168
 * loop-invariant.c (move_invariant_reg): Set inv-reg to destination
 of inv-insn when moving an invariant without introducing a temporary
 register.

Not OK.
This will break in move_invariants() when it looks at REGNO (inv-reg).

Ciao!
Steven

Re: [PATCH, GCC, stage1] Fallback to copy-prop if constant-prop not possible

2015-04-23 Thread Steven Bosscher

On Thu, Apr 16, 2015 at 10:43 AM, Thomas Preud'homme wrote:
 2015-04-15  Thomas Preud'homme  thomas.preudho...@arm.com
 Steven Bosscher stevenb@gmail.com

 * cprop.c (cprop_reg_p): New.
 (hash_scan_set): Use above function to check if register can be
 propagated.
 (find_avail_set): Return up to two sets, one whose source is
 a register and one whose source is a constant.  Sets are returned in
 an array passed as parameter rather than as a return value.
 (cprop_insn): Use a do while loop rather than a goto.  Try each of the
 sets returned by find_avail_set, starting with the one whose source is
 a constant. Use cprop_reg_p to check if register can be propagated.
 (do_local_cprop): Use cprop_reg_p to check if register can be
 propagated.
 (implicit_set_cond_p): Likewise.


I wouldn't usually approve patches I've coded bits in myself. But this
post is now 7 days old and it's Thomas' patch for 99%, so...

OK for trunk.

Can you please put steven at gcc.gnu.org for my e-mail address in the
ChangeLog entry?

Ciao!
Steven

Re: [PR65768] Check rtx_cost when propagating constant

2015-04-15 Thread Steven Bosscher

On Wed, Apr 15, 2015 at 9:53 AM, Kugan wrote:
 2015-04-15  Kugan Vivekanandarajah   
 Zhenqiang Chen  

 PR target/65768
 * cprop.c (try_replace_reg): Check cost of constants before 
 propagating.


 +
 +  /* For CONSTANT_P (to), loop2_invariant pass might hoist it out the loop.
 + And it can be shared by different references.  So skip propagation if
 + it makes INSN's rtx cost higher.  */
 +

So only undo if the insn is inside a loop (i.e.
BLOCK_FOR_INSN(insn)-loop_father != NULL) and this is a
post-pass_loop2 cprop run?

Ciao!
Steven

Re: [PATCH] PR 62173, re-shuffle insns for RTL loop invariant hoisting

2015-04-14 Thread Steven Bosscher

On Tue, Apr 14, 2015 at 5:06 PM, Jiong Wang wrote:
 but, after some investigation I found gcc actually generate difference
 code when debug info enabled because
 DEBUG_INSN will affect data flow analysis.

It should not, so that's a bug.


 So I think this stage2/3 binary difference is acceptable?

No, they should be identical. If there's a difference, then there's a
bug - which, it seems, you've already found, too.

Ciao!
Steven

Re: [PATCH] Fix rtl_split_edge (PR rtl-optimization/65761)

2015-04-14 Thread Steven Bosscher

On Tue, Apr 14, 2015 at 2:58 PM, Jakub Jelinek wrote:
 PR rtl-optimization/65761
 * cfgrtl.c (rtl_split_edge): For EDGE_CROSSING split, use
 get_last_bb_insn (after) instead of NEXT_INSN (BB_END (after)).

 --- gcc/cfgrtl.c.jj 2015-04-12 21:50:18.0 +0200
 +++ gcc/cfgrtl.c2015-04-14 12:45:34.127722958 +0200
 @@ -1928,7 +1928,7 @@ rtl_split_edge (edge edge_in)
 (edge_in-flags  EDGE_CROSSING))
  {
after = last_bb_in_partition (edge_in-src);
 -  before = NEXT_INSN (BB_END (after));
 +  before = get_last_bb_insn (after);
/* The instruction following the last bb in partition should
   be a barrier, since it cannot end in a fall-through.  */
gcc_checking_assert (BARRIER_P (before));

This is OK.

Ciao!
Steven

Re: breakage with [PATCH] combine: Disregard clobbers in another test for two SETs (PR65693)

2015-04-09 Thread Steven Bosscher

On Thu, Apr 9, 2015 at 2:41 PM, Segher Boessenkool wrote:
 It would be nice if there would be some cc0 target in the compile farm,
 since cc0 isn't going away any time soon :-(

In this case it would be enough to replace the #ifndef/#ifdef
HAVE_cc0 code with if (HAVE_cc0).

That's the simplest way to avoid compile breakage. Likewise for so
many other #ifdef code (HAVE_conditional_move, HAVE_trap, etc.).

Perhaps something to work on in the next stage1...

Ciao!
Steven

Re: [PATCH] Disable ppc/spu context sensitive macros for CLK_ASM preprocessing (PR preprocessor/61977)

2015-04-01 Thread Steven Bosscher

On Wed, Apr 1, 2015 at 10:40 AM, Jakub Jelinek wrote:
 Hi!

 The context sensitive macros are inherently C/C++ specific, so trying to
 expand them into anything when preprocessing assembler doesn't make any
 sense to me.

Why are the arch-c.c cpp builtins defined at all when preprocessing
assembly? Or in other words: should these (supposedly)
language-dependent hooks for cpp builtins be called if the
pre-processor is called stand-alone?

Ciao!
Steven

Re: [PR64164] drop copyrename, integrate into expand

2015-03-31 Thread Steven Bosscher

On Sat, Mar 28, 2015 at 8:21 PM, Alexandre Oliva wrote:
 Regstrapped on x86_64-linux-gnu native and on i686-pc-linux-gnu native
 on x86_64, so without lto plugin.  The only regression is in
 gcc.dg/guality/pr54200.c, that explicitly disables VTA.

What about memory footprint? IIRC this pass was in part introduced to
reduce the number of VAR_DECLs.

Ciao!
Steven

Re: [PATCH 3/3] Fix dbr_schedule for -freorder-blocks-and-partition

2015-03-24 Thread Steven Bosscher

On Tue, Jan 27, 2015 at 12:52 AM, Kaz Kojima wrote:
 This patch is to fix 2 issues found in dbr_schedule when trying to
 fix PR target/64761.  The first is relax_delay_slots removes
 the jump insn in the insns like below:

 (jump_insn/j 74 58 59 (set (pc) (label_ref:SI 29)) ...)
 (barrier 59 74 105)
 (note 105 59 29 NOTE_INSN_SWITCH_TEXT_SECTIONS)
 (code_label 29 105 30 31  [5 uses])
 (insn 31 30 32 (set (reg ...

 i.e. relax_delay_slot tries to delete the jump insn pointing to
 the next active insn of that jump insn as a trivial jump even when
 there is a NOTE_INSN_SWITCH_TEXT_SECTIONS note between that jump
 and its next active insn.
 The second issue is that relax_delay_slots does a variant of
 follow jump optimization without checking targetm.can_follow_jump.

 --
 PR target/64761
 * reorg.c (switch_text_sections_between_p): New function.
 (relax_delay_slots): Call it when testing if the jump insn
 is removable.  Use targetm.can_follow_jump when testing if
 the conditional branch can follow an unconditional jump.


This patch merely papers over another issue, probably a missing
CROSSING_JUMP_P test.

Ciao!
Steven





 diff --git a/reorg.c b/reorg.c
 index 326fa53..2387910 100644
 --- a/reorg.c
 +++ b/reorg.c
 @@ -3211,6 +3211,19 @@ label_before_next_insn (rtx x, rtx scan_limit)
return insn;
  }

 +/* Return TRUE if there is a NOTE_INSN_SWITCH_TEXT_SECTIONS note in between
 +   BEG and END.  */
 +
 +static bool
 +switch_text_sections_between_p (const rtx_insn *beg, const rtx_insn *end)
 +{
 +  const rtx_insn *p;
 +  for (p = beg; p != end; p = NEXT_INSN (p))
 +if (NOTE_P (p)  NOTE_KIND (p) == NOTE_INSN_SWITCH_TEXT_SECTIONS)
 +  return true;
 +  return false;
 +}
 +

  /* Once we have tried two ways to fill a delay slot, make a pass over the
 code to try to improve the results and to do such things as more jump
 @@ -3247,7 +3260,8 @@ relax_delay_slots (rtx_insn *first)
 target_label = find_end_label (target_label);

   if (target_label  next_active_insn (target_label) == next
 -  ! condjump_in_parallel_p (insn))
 +  ! condjump_in_parallel_p (insn)
 +  ! (next  switch_text_sections_between_p (insn, next)))
 {
   delete_jump (insn);
   continue;
 @@ -3262,12 +3276,13 @@ relax_delay_slots (rtx_insn *first)

   /* See if this jump conditionally branches around an unconditional
  jump.  If so, invert this jump and point it to the target of the
 -second jump.  */
 +second jump.  Check if it's possible on the target.  */
   if (next  simplejump_or_return_p (next)
any_condjump_p (insn)
target_label
next_active_insn (target_label) == next_active_insn (next)
 -  no_labels_between_p (insn, next))
 +  no_labels_between_p (insn, next)
 +  targetm.can_follow_jump (insn, next))
 {
   rtx label = JUMP_LABEL (next);

Re: [PATCH, GCC, stage1] Fallback to copy-prop if constant-prop not possible

2015-03-23 Thread Steven Bosscher

On Mon, Mar 23, 2015 at 12:01 PM, Thomas Preud'homme wrote:
 What about the cprop_reg_p that needs to be negated? Did I miss something
 that makes it ok?

You didn't miss anything.  I sent the wrong patch. The one I tested on
ppc64 also has the condition reversed:

@@ -1328,9 +1329,8 @@ implicit_set_cond_p (const_rtx cond)
   if (GET_CODE (cond) != EQ  GET_CODE (cond) != NE)
 return false;

-  /* The first operand of COND must be a pseudo-reg.  */
-  if (! REG_P (XEXP (cond, 0))
-  || HARD_REGISTER_P (XEXP (cond, 0)))
+  /* The first operand of COND must be a register we can propagate.  */
+  if (! cprop_reg_p (XEXP (cond, 0)))
 return false;

   /* The second operand of COND must be a suitable constant.  */

Re: [PATCH, GCC, stage1] Fallback to copy-prop if constant-prop not possible

2015-03-20 Thread Steven Bosscher

On Tue, Feb 17, 2015 at 3:51 AM, Thomas Preud'homme wrote:
  -  else if (REG_P (src)
  -   REGNO (src) = FIRST_PSEUDO_REGISTER
  -   REGNO (src) != regno)
  -   {
  - if (try_replace_reg (reg_used, src, insn))
  + else if (src_reg  REG_P (src_reg)
  +   REGNO (src_reg) = FIRST_PSEUDO_REGISTER
  +   REGNO (src_reg) != regno
  +   try_replace_reg (reg_used, src_reg, insn))

 Likewise for the REG_P and = FIRST_PSEUDO_REGISTER tests here
 (with
 the equivalent and IMHO preferable HARD_REGISTER_P test in
 find_avail_set()).

 I'm not sure I follow you here. First, it seems to me that the equivalent
 test is rather REG_P  !HARD_REGISTER_P since here it checks if it's
 a pseudo register.

 Then, do you mean the test can be simply removed because of the
 REG_P  !HARD_REGISTER_P in hash_scan_set () called indirectly by
 compute_hash_table () when called in one_cprop_pass () before any
 cprop_insn ()? Or do you mean I should move the check in
 find_avail_set ()?


What I meant, is that I believe the tests are already done in
hash_scan_set and should be redundant in cprop_insn (i.e. the test can
be replaced with gcc_[checking_]assert).

I've attached a patch with some changes to it: introduce cprop_reg_p()
to get rid of all the REG_P  regno  FIRST_PSEUDO_REGISTER tests.
I still have the cprop_constant_p and cprop_reg_p tests in cprop_insn
but this weekend I'll try with gcc_checking_asserts instead. Please
have a look at the patch and let me know if you like it (given it's
mostly yours I hope you do like it ;-)

Bootstrapped  tested on powerpc64-unknown-linux-gnu. In building all
of cc1, 35 extra copies are propagated with the patch.

Ciao!
Steven
Index: cprop.c
===
--- cprop.c (revision 221523)
+++ cprop.c (working copy)
@@ -285,6 +285,15 @@ cprop_constant_p (const_rtx x)
   return CONSTANT_P (x)  (GET_CODE (x) != CONST || shared_const_p (x));
 }
 
+/* Determine whether the rtx X should be treated as a register that can
+   be propagated.  Any pseudo-register is fine.  */
+
+static bool
+cprop_reg_p (const_rtx x)
+{
+  return REG_P (x)  !HARD_REGISTER_P (x);
+}
+
 /* Scan SET present in INSN and add an entry to the hash TABLE.
IMPLICIT is true if it's an implicit set, false otherwise.  */
 
@@ -295,8 +304,7 @@ hash_scan_set (rtx set, rtx_insn *insn, struct has
   rtx src = SET_SRC (set);
   rtx dest = SET_DEST (set);
 
-  if (REG_P (dest)
-   ! HARD_REGISTER_P (dest)
+  if (cprop_reg_p (dest)
reg_available_p (dest, insn)
can_copy_p (GET_MODE (dest)))
 {
@@ -321,9 +329,8 @@ hash_scan_set (rtx set, rtx_insn *insn, struct has
src = XEXP (note, 0), set = gen_rtx_SET (VOIDmode, dest, src);
 
   /* Record sets for constant/copy propagation.  */
-  if ((REG_P (src)
+  if ((cprop_reg_p (src)
src != dest
-   ! HARD_REGISTER_P (src)
reg_available_p (src, insn))
  || cprop_constant_p (src))
insert_set_in_table (dest, src, insn, table, implicit);
@@ -821,15 +828,15 @@ try_replace_reg (rtx from, rtx to, rtx_insn *insn)
   return success;
 }
 
-/* Find a set of REGNOs that are available on entry to INSN's block.  Return
-   NULL no such set is found.  */
+/* Find a set of REGNOs that are available on entry to INSN's block.  If found,
+   SET_RET[0] will be assigned a set with a register source and SET_RET[1] a
+   set with a constant source.  If not found the corresponding entry is set to
+   NULL.  */
 
-static struct cprop_expr *
-find_avail_set (int regno, rtx_insn *insn)
+static void
+find_avail_set (int regno, rtx_insn *insn, struct cprop_expr *set_ret[2])
 {
-  /* SET1 contains the last set found that can be returned to the caller for
- use in a substitution.  */
-  struct cprop_expr *set1 = 0;
+  set_ret[0] = set_ret[1] = NULL;
 
   /* Loops are not possible here.  To get a loop we would need two sets
  available at the start of the block containing INSN.  i.e. we would
@@ -869,8 +876,10 @@ try_replace_reg (rtx from, rtx to, rtx_insn *insn)
  If the source operand changed, we may still use it for the next
  iteration of this loop, but we may not use it for substitutions.  */
 
-  if (cprop_constant_p (src) || reg_not_set_p (src, insn))
-   set1 = set;
+  if (cprop_constant_p (src))
+   set_ret[1] = set;
+  else if (reg_not_set_p (src, insn))
+   set_ret[0] = set;
 
   /* If the source of the set is anything except a register, then
 we have reached the end of the copy chain.  */
@@ -881,10 +890,6 @@ try_replace_reg (rtx from, rtx to, rtx_insn *insn)
 and see if we have an available copy into SRC.  */
   regno = REGNO (src);
 }
-
-  /* SET1 holds the last set that was available and anticipatable at
- INSN.  */
-  return set1;
 }
 
 /* Subroutine of cprop_insn that tries to propagate

Re: [PATCH, GCC, stage1] Fallback to copy-prop if constant-prop not possible

2015-03-20 Thread Steven Bosscher

On Fri, Mar 20, 2015 at 11:27 AM, Thomas Preud'homme wrote:
 Sorry, I missed the parenthesis. REG_P needs indeed to be kept. I'd be
 tempted to use !HARD_REGISTER_P instead since REG_P is already
 checked but I don't mind either way.

I put the cprop_reg_p check there instead of !HARD_REGISTER_P because
I like to be able to quickly find all places where a similar check is
performed. The check is whether the reg is something that copy
propagation can handle, and that is what I added cprop_reg_p for.
(Note that cprop can _currently_ handle only pseudos but there is no
reason why a limited set of hard regs can't be handled also, e.g. the
flag registers like in targetm.fixed_condition_code_regs).

In this case, the result is that REG_P is checked twice.
But then again, cprop_reg_p will be inlined and the double check optimized away.

Anyway, I guess we've bikeshedded long enough over this patch as it is
:-) Let's post a final form and declare it OK for stage1.

As for PSEUDO_REG_P: If it were up to me, I'd like to have in rtl.h:

static bool
hard_register_p (rtx x)
{
  return (REG_P (x)  HARD_REGISTER_NUM_P (REGNO (x)));
}

static bool
pseudo_register_p (rtx x)
{
  return (REG_P (x)  !HARD_REGISTER_NUM_P (REGNO (x)));
}

and do away with all the FIRST_PSEUDO_REGISTER tests. But I've
proposed this in the past and there was opposition. Perhaps when we
introduce a rtx_reg class...

Ciao!
Steven

Re: [PATCH][gcse] Use emit_move_insn rather than creating SET rtx and emitting that

2015-03-19 Thread Steven Bosscher

On Thu, Mar 19, 2015 at 2:45 PM, Kyrill Tkachov wrote:
 As pointed out by James Greenhalgh offline the correct thing would have been
 to do an
 emit_move_insn to let the backend expanders do the right thing (especially
 in the concerned
 testcase gcc.c-torture/execute/pr65427.c that uses 256-bit vectors that arm
 doesn't support
 natively).

This is supposed to be caught by want_to_gcse_p() via
can_assign_to_reg_without_clobbers_p(). How does your expression get
past that barrier?

The gcc_unreachable() is there because all the code in gcse.c assumes
it is OK to emit a SET-insn without going through emit_move_insn().

Ciao!
Steven

Re: [PATCH] Run DCE after if conversion

2015-03-10 Thread Steven Bosscher

On Tue, Mar 10, 2015 at 8:57 AM, Andreas Krebbel wrote:

 * gcc/ifcvt.c (if_convert):


...yes...?


 diff --git a/gcc/ifcvt.c b/gcc/ifcvt.c
 index a3e3e5c..d2040af 100644
 --- a/gcc/ifcvt.c
 +++ b/gcc/ifcvt.c
 @@ -4626,6 +4626,13 @@ if_convert (bool after_combine)
num_true_changes);
  }

 +  if (num_true_changes  0)
 +{
 +  df_set_flags (DF_LR_RUN_DCE);
 +  df_mark_solutions_dirty ();
 +  df_analyze ();
 +}
 +
   if (optimize == 1)
 df_remove_problem (df_live);

Tiny nail, huge hammer. This triggers a full re-scan of all insns and
a re-calculation of all dataflow problems.

The transformations in ifcvt are all simple enough that it should be
possible to just clean up that redundant insn at the site where the
code transformation is performed.

Ciao!
Steven

Re: [PATCH, stage1] Move insns without introducing new temporaries in loop2_invariant

2015-03-09 Thread Steven Bosscher

On Thu, Mar 5, 2015 at 10:53 AM, Thomas Preud'homme wrote:
 diff --git a/gcc/dominance.c b/gcc/dominance.c
 index 33d4ae4..09c8c90 100644
 --- a/gcc/dominance.c
 +++ b/gcc/dominance.c
 @@ -982,7 +982,7 @@ nearest_common_dominator_for_set (enum cdi_direction dir, 
 bitmap blocks)

 A_Dominated_by_B (node A, node B)
 {
 - return DFS_Number_In(A) = DFS_Number_In(A)
 + return DFS_Number_In(A) = DFS_Number_In(B)
   DFS_Number_Out (A) = DFS_Number_Out(B);
 }  */

This hunk is obvious enough ;-)


 diff --git a/gcc/loop-invariant.c b/gcc/loop-invariant.c
 index f79b497..ab2a45c 100644
 --- a/gcc/loop-invariant.c
 +++ b/gcc/loop-invariant.c
...
 +  /* Check whether the set is always executed.  We could omit this condition 
 if
 + we know that the register is unused outside of the loop, but it does not
 + seem worth finding out.  */
 +  may_exit = BITMAP_ALLOC (NULL);
 +  has_exit = BITMAP_ALLOC (NULL);
 +  always_executed = BITMAP_ALLOC (NULL);
 +  body = get_loop_body_in_dom_order (loop);
 +  find_exits (loop, body, may_exit, has_exit);
 +  compute_always_reached (loop, body, has_exit, always_executed);
 +  /* Find bit position for basic block bb.  */
 +  for (i = 0; i  loop-num_nodes  body[i] != bb; i++);
 +  if (!bitmap_bit_p (always_executed, i))
 +goto cleanup;

It looks like this would run for all candidate loop invariants, right?

If so, you're creating run time of O(n_invariants*n_bbs_in_loop), a
potential compile time hog for large loops.

But why compute this at all? Perhaps I'm missing something, but you
already have inv-always_executed available, no?




 +  /* Check that all uses reached by the def in insn would still be reached
 + it.  */
 +  dest_regno = REGNO (reg);
 +  for (use = DF_REG_USE_CHAIN (dest_regno); use; use = DF_REF_NEXT_REG (use))
 +{
 +  rtx ref;

Would be nice if we can start using rtx_insn for new code. You do so
further up, I suggest you continue that good citizenship here :-)


 +  basic_block use_bb;
 +
 +  ref = DF_REF_INSN (use);
 +  use_bb = BLOCK_FOR_INSN (ref);

You can use DF_REF_BB.



 +  /* Ignore instruction considered for moving.  */
 +  if (ref == insn)
 +   continue;
 +
 +  /* Don't consider uses outside loop.  */
 +  if (!flow_bb_inside_loop_p (loop, use_bb))
 +   continue;
 +
 +  /* Don't move if a use is not dominated by def in insn.  */
 +  if (use_bb == bb  DF_INSN_LUID (insn)  DF_INSN_LUID (ref))
 +   goto cleanup;

You're safer with = even if you've already checked ref==insn.


Ciao!
Steven

Re: [PATCH] Improve PR44563

2015-03-09 Thread Steven Bosscher

On Mon, Mar 9, 2015 at 4:12 PM, Richard Biener wrote:
 !   /* This is a really poor hash function, but it is what the current code 
 uses,
 !  so I am reusing it to avoid an additional axis in testing.  */

This is a bit silly as a comment, because after your patch the
current code is the patched code. Better to reference the
htab_hash_pointer function.

But can't we add an inline version in hashtab.h instead?

Ciao!
Steven

Re: [PATCH] [RTL] Relax CSE check to set REG_EQUAL notes.

2015-03-09 Thread Steven Bosscher

On Wed, Mar 4, 2015 at 12:09 PM, Alex Velenko wrote:
 For example, in arm testcase pr43920-2.c, CSE previously decided not to put
 an obvious note on insn 9, as set value was the same as note value.
 At the same time, other insns set up as -1 were set up through a register
 and did get a note:

...which is the point of the REG_EQUAL notes. In insn 8 there is a
REG_EQUAL note to show that the value of r111 is known. In insn 9 the
known value is, well, known from SET_SRC so there is no need for a
REG_EQUAL note. Adding REG_EQUAL notes in such cases is just wasteful.


 (insn 9 53 34 8 (set (reg:SI 110 [ D.4934 ])
 (const_int -1 [0x])) 
 /work/src/gcc/gcc/testsuite/gcc.target/arm/pr43920-2.c:21 613 
 {*thumb2_movsi_vfp}
  (nil))

 (insn 8 45 50 6 (set (reg:SI 110 [ D.4934 ])
 (reg/v:SI 111 [ startD.4917 ])) 
 /work/src/gcc/gcc/testsuite/gcc.target/arm/pr43920-2.c:21 613 
 {*thumb2_movsi_vfp}
  (expr_list:REG_EQUAL (const_int -1 [0x])
 (nil)))

 (insn 6 49 54 7 (set (reg:SI 110 [ D.4934 ])
 (reg/v:SI 112 [ endD.4918 ])) 
 /work/src/gcc/gcc/testsuite/gcc.target/arm/pr43920-2.c:21 613 
 {*thumb2_movsi_vfp}
  (expr_list:REG_EQUAL (const_int -1 [0x])
 (nil)))

 Jump2 pass, optimizing common code, was looking at notes to reason about
 register values and failing to recognize those insns to be equal.

I suppose you are talking about the head/tail merging code? Can you
please provide a test case for problem preferably filed in Bugzilla)?

Ciao!
Steven

Re: [patch, avr] Take 2: Fix PR64331: insn output and insn length computation rely on REG_DEAD notes.

2015-03-02 Thread Steven Bosscher

On Sat, Feb 28, 2015 at 5:38 PM, Georg-Johann Lay wrote:
 Am 02/26/2015 um 11:45 PM schrieb Steven Bosscher:

 On Thu, Feb 26, 2015 at 8:35 PM, Georg-Johann Lay wrote:

 Take #2 introduces a new, avr-specific rtl pass whose sole purpose is to
 rectify notes.  The pass is scheduled right before cfg does down (right
 before .*free_cfg) so that cfg and hence df machinery is available.

 Regression tests look fine and for the test case the patches produce
 correct
 code and correct insn length.


 Sorry for party-pooping, but it seems to me that the real bug is that
 the target is lying to reload.

 IIUC the AVR port selects the insn alternative after register
 allocation (after reload). It bases its selection on REG_DEAD notes.
 In PR64331 an alternative is used that clobbers r20 that has a
 REG_DEAD note, but r20 is not actually dead because hardreg-cprop has
 propagated it forward without adjusting the note.


 It's not actually about constraint alternatives.

 Let me give an example: Testing HI for 0. The usual sequence would be

   cc0 = reg.low == 0
   cc0 = cc0  reg.high == 0

 which costs 2 instructions.  If reg is unused after, then ORing can be used
 and cc0 will be set as a side effect.  This costs 1 insn:

   cc0 = (reg.low |= reg.high)

 Using alternatives would double their number, i.e. 14 instead of 7 for
 *cmphi.

 These constraint alternatives would have to express

 1) reg-alloc, please, use alternative #1 (with clobber of reg)
if the register is unused after

 2) reg-alloc, please, use alternative #2 (not clobbering reg)
if the register is used after

 If we assume for a moment that we have a *testhi insn and the old *tsthi had
 just 1 alternative:


 (define_insn *tsthi
   [(set (cc0)
 (compare
  (match_operand:ALL2 0 register_operand  r)
  (const_int 0)))]
  ...)


 Then the new one would be something like


 (define_insn *tsthi
   [(set (cc0)
 (compare
  (match_operand:ALL2 0 register_operand  r,r)
  (const_int 0)))
(clobber (match_scratch:HI 1   =0,X)]
 ...)

 But how can I express 1) and 2) ?

I don't think you can. Other ports express such transformations using
define_peephole2 and peep2_reg_dead_p.


 Currently, my preferred approach is a new drop-in replacement for the old
 reg_unused_after which uses clobbers to decide whether or not op 0 is still
 needed.  That way, reg-alloc can work like before and there is no need to
 implement dozens of new constraint alternatives across the md files.

The problem with this approach is that it may break at random. There's
just no guarantee that it will work, because you're relying on
information that just isn't valid anymore.

Unfortunately I don't know enough about CC0-targets to suggest an alternative.

Ciao!
Steven

Re: [patch, avr] Take 2: Fix PR64331: insn output and insn length computation rely on REG_DEAD notes.

2015-02-26 Thread Steven Bosscher

On Thu, Feb 26, 2015 at 8:35 PM, Georg-Johann Lay wrote:
 Take #2 introduces a new, avr-specific rtl pass whose sole purpose is to
 rectify notes.  The pass is scheduled right before cfg does down (right
 before .*free_cfg) so that cfg and hence df machinery is available.

 Regression tests look fine and for the test case the patches produce correct
 code and correct insn length.

Sorry for party-pooping, but it seems to me that the real bug is that
the target is lying to reload.

IIUC the AVR port selects the insn alternative after register
allocation (after reload). It bases its selection on REG_DEAD notes.
In PR64331 an alternative is used that clobbers r20 that has a
REG_DEAD note, but r20 is not actually dead because hardreg-cprop has
propagated it forward without adjusting the note.

The normal way of things is that the insn alternative is selected in
reload (or in LRA) and that the clobbers are added as necessary. In
PR64331, an alternative for insn r17 would be selected that has a
CLOBBER for r20, prevent hardreg-cprop from propagating r20.

Selecting insns based on REG-notes is dangerous business. Lying to
reload and to post-RA passes is a mortal sin, the compiler will punish
you. There is no guarantee that nothing will change between your new
pass to recompute notes, and the final pass that emits the insns.

It's not my port, for sure, but I would look for a real fix instead:
Don't select insns to output based on unreliable information like
REG-notes.

Ciao!
Steven

Re: [PATCH] gcc/reload.c: Initialize several arrays before use them in find_reloads()

2015-02-23 Thread Steven Bosscher

On Mon, Feb 23, 2015 at 11:26 PM, Jeff Law wrote:
 On 02/22/15 02:02, Chen Gang S wrote:

 It is for Bug65117, after this fix, .i file can be passed compiling.

- 'this_alternative_win' is not initialized before use it: for the
  first looping 0, it initializes 'this_alternative_win[0]', but
  'did_match' may use 'this_alternative_win[2]'.

- 'this_alternative' may be not initialized before using: it
  initializes 'this_alternative[i]', but may use 'this_alternative[m]'
  (m  i).

- After reading through the code, arrays 'this_alternative_match_win',
  'this_alternative_offmemok', and 'this_alternative_earlyclobber' may
  be not initialized either, so initialize them too.

 This issue is found by cross compiling xtensa Linux kernel with the
 latest gcc5. And after this patch, it can cross compile xtensa Linux
 kernel with allmodconfig, successfully.


 2015-02-22  Chen Gang  gang.chen.5...@gmail.com

 * reload.c (find_reloads): Initialize several arrays before use
 them.


 From the documentation for matching constraints:

   Moreover, the digit must be a
   smaller number than the number of the operand that uses it in the
   constraint.


 If we look at the zero_cost_loop_{start,end} patterns we have:

 (if_then_else (ne (match_operand:SI 0 register_operand 2)

 and


 (if_then_else (ne (match_operand:SI 0 nonimmediate_operand 2,2)

 Similarly for the loop_end pattern.


 Which violate the rule for matching constraints.

...and should never have worked at all...


 I'm confident that if the xtensa's patterns were fixed to abide by the rules
 for matching constraints the problem in reload would not occur.


Chen, perhaps a warming can be implemented for this in genrecog?

Ciao!
Steven

Re: [PATCH, GCC, stage1] Fallback to copy-prop if constant-prop not possible

2015-02-16 Thread Steven Bosscher

On Mon, Feb 16, 2015 at 11:26 AM, Thomas Preud'homme wrote:

  /* Subroutine of cprop_insn that tries to propagate constants into
 @@ -1044,40 +1042,41 @@ cprop_insn (rtx_insn *insn)

 -  /* Constant propagation.  */
 -  if (cprop_constant_p (src))
 -   {
 -  if (constprop_register (reg_used, src, insn))
 + /* Constant propagation.  */
 + if (src_cst  cprop_constant_p (src_cst)
 +  constprop_register (reg_used, src_cst, insn))
 {
   changed_this_round = changed = 1;
   global_const_prop_count++;

The cprop_constant_p test is redundant, you only have non-NULL src_cst
if it is a cprop_constant_p (as you test for it in find_avail_set()).


 @@ -1087,18 +1086,16 @@ retry:
GLOBAL CONST-PROP: Replacing reg %d in , regno);
   fprintf (dump_file, insn %d with constant ,
INSN_UID (insn));
 - print_rtl (dump_file, src);
 + print_rtl (dump_file, src_cst);
   fprintf (dump_file, \n);
 }
   if (insn-deleted ())
 return 1;
 }
 -   }
 -  else if (REG_P (src)
 -   REGNO (src) = FIRST_PSEUDO_REGISTER
 -   REGNO (src) != regno)
 -   {
 - if (try_replace_reg (reg_used, src, insn))
 + else if (src_reg  REG_P (src_reg)
 +   REGNO (src_reg) = FIRST_PSEUDO_REGISTER
 +   REGNO (src_reg) != regno
 +   try_replace_reg (reg_used, src_reg, insn))

Likewise for the REG_P and = FIRST_PSEUDO_REGISTER tests here (with
the equivalent and IMHO preferable HARD_REGISTER_P test in
find_avail_set()).


Looks good to me otherwise.

Ciao!
Steven

Re: [PATCH, GCC, stage1] Fallback to copy-prop if constant-prop not possible

2015-02-16 Thread Steven Bosscher

On Mon, Feb 16, 2015 at 11:26 AM, Thomas Preud'homme
thomas.preudho...@arm.com wrote:
 Hi,

 The RTL cprop pass in GCC operates by doing a local constant/copy propagation 
 first and then a global one. In the local one, if a constant cannot be 
 propagated (eg. due to constraints of the destination instruction) a copy 
 propagation is done instead. However, at the global level copy propagation is 
 only tried if no constant can be propagated, ie. if a constant can be 
 propagated but the constraints of the destination instruction forbids it, no 
 copy propagation will be tried. This patch fixes this issue. This solves the 
 redundant ldr problem in GCC32RM-439.


This would address https://gcc.gnu.org/bugzilla/show_bug.cgi?id=34503#c4

I'll have a look at the patch tonight.

Ciao!
Seven

Re: [PATCH] Update BBs in cleanup_barriers pass (PR rtl-optimization/61058)

2015-01-26 Thread Steven Bosscher

On Mon, Jan 26, 2015 at 10:11 AM, Jakub Jelinek wrote:
 On Mon, Jan 26, 2015 at 10:06:05AM +0100, Eric Botcazou wrote:
  While the cleanup_barriers runs after cleaning up BLOCK_FOR_INSNs,
  some targets like i?86/x86_64 choose to populate it again during machine
  reorg and some target don't free it at the end of machine reorg.
  This patch updates cleanup_barrier pass, so that it adjusts basic block
  boundaries and BLOCK_FOR_INSNs in that case, so that we don't crash during
  final pass.

 This isn't a recent regression so what about fixing it more properly?  For
 example, by calling free_bb_for_insn at the end of the machinre reorg passes
 which called compute_bb_for_insn at the beginning?  Or do the affected ports
 need the BB info all the way down to final?

 Yes, they do, that is why it crashed during final.

And they should not do so. It cannot be relied upon, in general. Even
now, recomputing BLOCK_FOR_INSN only works because machine reorg is
the first pass after free_cfg (so nothing has changed yet to the insns
stream).

Some ports (including x86_64/i386, ia64, and most non-sched_dbr ports)
could simply do away with free_cfg and cleanup_barriers. But some
ports put things into the insns stream after free_cfg that
verify_flow_info chokes on, e.g. the fake insns for const pools for
ARM (that causes bugs elsewhere also, see e.g.
https://gcc.gnu.org/ml/gcc-patches/2011-07/msg02101.html). The current
situation is confusing and bound to give bugs sooner or later...

I had patches to have an early and late free_cfg pass -- I think I
even posted them and I still believe that's the right short-term
solution -- to make sure nobody can put something between free_cfg and
a target with a machine_reorg that needs the CFG, or at least
BLOCK_FOR_INSN.

Oh well...

Ciao!
Steven

Re: Housekeeping work in backends.html

2015-01-20 Thread Steven Bosscher

On Wed, Jan 7, 2015 at 9:42 AM, Eric Botcazou wrote:
 the attached patch removes obsolete ports (c4x, m68hc11 and ms1), toggles
 the 'p' letter and adjust accordingly (only avr, fr30, m68k, mcore, rs6000
 and sh still use define_peephole) and removes trailing spaces.

 Same treatment for the 'd' letter, the ports that do not use DFA scheduler
 descriptions are a clear minority (avr, cr16, cris, fr30, h8300, m32c, mmix,
 msp430, pdp11, stormy16, vax).  Applied.

Perhaps 'd' should just go away completely. It was intended to
distinguish between ports using the old scheduler description and
ports using the DFA model. But support for the old scheduler
description was removed some 10 years ago, and AFAIR the targets that
don't use the DFA scheduler don't use the scheduler at all.

Ciao!
Steven

Re: [PATCH 1/2] PR debug/63239 Add DWARF representation for C++11 deleted member function.

2014-10-06 Thread Steven Bosscher

On Mon, Oct 6, 2014 at 9:54 AM, Mark Wielaard wrote:
 Just removing the # prefix (but keeping the space) from
 scan-assembler-times should work for both our cases. I just don't know
 why our .s outputs look different.

Wild guess: different comment marker in the target's assembly language?

Ciao!
Steven

Re: [PATCH] PR63404, gcc 5 miscompiles linux block layer

2014-09-30 Thread Steven Bosscher

On Tue, Sep 30, 2014 at 6:51 PM, Richard Earnshaw wrote:
 It's not just clobbers; it ignores patterns like

 (parallel
  [(set (a) (...)
   (set (b) (...)])
 [(reg_note (reg_unused(b))]

 Which is probably fine before register allocation but definitely
 something you have to think about afterwards.

Even before RA this isn't always fine. We have checks for
!multiple_sets for this.

Ciao!
Steven

Re: [PATCH] Fix signed integer overflow in gcc/data-streamer.c

2014-09-28 Thread Steven Bosscher

On Sun, Sep 28, 2014 at 2:22 PM, Markus Trippelsdorf wrote:
 Running the testsuite with a -fsanitize=undefined instrumented compiler
 shows:
  % gcc -O2 -flto -fno-use-linker-plugin -flto-partition=none 
 testsuite/gcc.dg/torture/pr28045.c
 gcc/data-streamer.c:113:45: runtime error: negation of -9223372036854775808 
 cannot be represented in type 'long int'; cast to an unsigned type to negate 
 this value to itself

 The fix is obvious.

 Boostrapped and tested on x86_64-unknown-linux-gnu.
 OK for trunk?
 Thanks.

 2014-09-28  Markus Trippelsdorf  mar...@trippelsdorf.de

 * data-streamer.c (bp_unpack_var_len_int): Avoid signed
 integer overflow.

 diff --git a/gcc/data-streamer.c b/gcc/data-streamer.c
 index 0e19c72162aa..0760ed590c22 100644
 --- a/gcc/data-streamer.c
 +++ b/gcc/data-streamer.c
 @@ -110,7 +110,7 @@ bp_unpack_var_len_int (struct bitpack_d *bp)
if ((half_byte  0x8) == 0)
 {
   if ((shift  HOST_BITS_PER_WIDE_INT)  (half_byte  0x4))
 -   result |= - ((HOST_WIDE_INT)1  shift);
 +   result |= - ((unsigned HOST_WIDE_INT)1  shift);

   return result;
 }


Can you use HOST_WIDE_INT_1U for this?

Ciao!
Steven

Re: [PATCH] Put all MAINTAINERS email addresses into ...

2014-09-26 Thread Steven Bosscher

On Fri, Sep 26, 2014 at 3:09 PM, Segher Boessenkool wrote:
 On Thu, Sep 25, 2014 at 11:10:00PM +0200, Jan-Benedict Glaw wrote:
 Resending this email. Seems some spam filter ate it due to the many
 email addresses...

 Will now all/most/many further patches to MAINTAINERS hit spam filters
 as well?


Let's hope not. But at least for me, all mail to people @ arm.com now bounce...

Ciao!
Steven

Re: [Patch 1/4] Hookize MOVE_BY_PIECES_P, remove most uses of MOVE_RATIO

2014-09-25 Thread Steven Bosscher

On Thu, Sep 25, 2014 at 4:57 PM, James Greenhalgh wrote:
 * doc/tm.texi.in (MOVE_BY_PIECES_P): Reduce documentation to a stub
 describing that this macro is deprecated.

Remove it entirely and poison it in system.h?
It takes changes to only a few targets: mips, arc, s390, and sh.

Thanks for hookizing this!

Ciao!
Steven

Re: [PATCH] Do not remove labels with LABEL_PRESERVE_P

2014-09-24 Thread Steven Bosscher

On Wed, Sep 24, 2014 at 8:41 AM, Ilya Enkovich wrote:
 2014-09-23 20:06 GMT+04:00 Jeff Law:
 On 09/23/14 10:01, Steven Bosscher wrote:
 Are you sure this patch is necessary, and is not just papering over
 another problem? In the past, all cases I've seen where labels were
 removed inadvertently were caused by incorrect reference counting or
 missing REG_LABEL_* notes.

 Description of LABEL_PRESERVE_P says label that should always be
 considered to be needed.

It's more specific than that, really:

@item LABEL_PRESERVE_P (@var{x})
In a @code{code_label} or @code{note}, indicates that the label is referenced by
code or data not visible to the RTL of a given function.


The not visible part is important. If there are visible references
to a label, then they should never be removed (obviously) and that
should work through LABEL_NUSES. Unfortunately we are not very good at
keeping LABEL_NUSES up-to-date (this is why all the
rebuild_jump_labels() are still required).

What appears to be the case here, is that you have a label between two
basic blocks B1 and B2, and the label acts as a control flow barrier:
B1 and B2 cannot be merged. Then this should be expressed in the CFG.
Otherwise: What else prevents the merge_blocks CFG hooks from deleting
the label?



 That means even if we do not have any usages
 we shouldn't remove it.

Sorry, no.
Even a LABEL_PRESERVE_P label can be deleted: It will be replaced by a
NOTE_INSN_DELETED_LABEL. See cfgrtl.c:delete_insn().

If you really want to prevent a label from being deleted, then
LABEL_PRESERVE_P is not a sufficient condition.


  Why can't we add some additional usages
 later?

If you add the usages later, then you're lying to the compiler ;-)



 Did the label use count drop to zero? Is there a REG_LABEL_TARGET note
 for the label operand?

 In the current code of ix86_expand_prologue I don't see any notes
 generation for set_rip_rex64 instruction which actually uses label.
 But IMO this is another potential issue and we still shouldn't remove
 labels with LABEL_PRESERVE_P.

Notes are generated in jump.c:rebuild_jump_labels. They are
automatically added when a label is not

Ciao!
Steven

Re: [PATCH] Do not remove labels with LABEL_PRESERVE_P

2014-09-24 Thread Steven Bosscher

On Wed, Sep 24, 2014 at 11:57 AM, Ilya Enkovich wrote:
 2014-09-24 13:30 GMT+04:00 Steven Bosscher :
 Description of LABEL_PRESERVE_P says label that should always be
 considered to be needed.

 It's more specific than that, really:

 @item LABEL_PRESERVE_P (@var{x})
 In a @code{code_label} or @code{note}, indicates that the label is 
 referenced by
 code or data not visible to the RTL of a given function.

 I read another description:
 /* 1 if RTX is a code_label that should always be considered to be needed.  */
 #define LABEL_PRESERVE_P(RTX)   \
   (RTL_FLAG_CHECK2 (LABEL_PRESERVE_P, (RTX), CODE_LABEL, NOTE)-in_struct)

Yes, from rtl.h.

I'd recommend to always read the descriptions in doc/ (in this case
doc/rtl.texi). The documentation in the header files is often not
very comprehensive.


 The not visible part is important. If there are visible references
 to a label, then they should never be removed (obviously) and that
 should work through LABEL_NUSES. Unfortunately we are not very good at
 keeping LABEL_NUSES up-to-date (this is why all the
 rebuild_jump_labels() are still required).

 Does rebuild handle all kinds of instructions including those which use 
 UNSPEC?

Yes. Patterns are walked (deep) and REG_LABEL notes are added for all
labels encountered that are not already the JUMP_LABEL of INSN. If the
label is reachable from XEXP(UNSPEC, 0) -- the 'E' operand -- then
that label is visible.


 What appears to be the case here, is that you have a label between two
 basic blocks B1 and B2, and the label acts as a control flow barrier:
 B1 and B2 cannot be merged. Then this should be expressed in the CFG.
 Otherwise: What else prevents the merge_blocks CFG hooks from deleting
 the label?

 Label acts as a barrier here but it is a side effect.  I don't care
 about block merging.  I just don't want label with usages to be
 removed.

Understood. Only, LABEL_PRESERVE_P is not the right means to achieve that.

So let's get back to basics and see what the usages look like. AFAIU
now, you emit the code label early, and add the references much later
(in machine reorg?). Does your UNSPEC have the code_label as an
operand? If so, what breaks if cfgcleanup removes the label? Is the
insn no longer recognized? Or does the label not end up in the
assembly output? Or ...? I can try to help figure out what breaks if
you have a test case.

FWIW, the LABEL_PRESERVE_P uses in config/i386/i386.c look suspect. It
probably only works because those labels are added late, and the code
paths that use (x86_64 large PIC code model) are not tested all that
well...


 That means even if we do not have any usages
 we shouldn't remove it.

 Sorry, no.
 Even a LABEL_PRESERVE_P label can be deleted: It will be replaced by a
 NOTE_INSN_DELETED_LABEL. See cfgrtl.c:delete_insn().

 According to description you quoted label marked by LABEL_PRESERVE_P
 is used by some code or data.  Let this use be not visible to the RTL
 of a given function.  It is still used, right? How can you remove it?

The code_label rtx is removed, but the label itself is still output to
the object file. The label number is retained in the CODE_LABEL_NUMBER
of the NOTE_INSN_DELETED_LABEL. Look for how NOTE_INSN_DELETED_LABEL
is handled in final.c. It's a hack IMHO, but that's how it has been
since day 0 (see https://gcc.gnu.org/r104).

Ciao!
Steven

Re: [PATCH] Do not remove labels with LABEL_PRESERVE_P

2014-09-24 Thread Steven Bosscher

On Wed, Sep 24, 2014 at 2:30 PM, Ilya Enkovich wrote:
 I didn't generate references separately from label.  Now I found an
 old patch and a test where this problem appeared.  In this patch I
 moved set_rip generation currently performed in ix86_expand_prologue
 into expand pass.  And I got following code in expand dump for
 testsuite/gcc.target/i386/pr55154.c test:

 (note 7 1 2 2 [bb 2] NOTE_INSN_BASIC_BLOCK)
 (note/s 2 7 3 2  NOTE_INSN_DELETED_LABEL 2)
 (insn 3 2 4 2 (set (reg:DI 85)
 (unspec:DI [
 (label_ref [2 deleted])
 ] UNSPEC_SET_RIP))
 /export/users/ienkovic/issues/4161/gcc/gcc/testsuite/gcc.target/i386/pr55154.c:9
 -1
  (insn_list:REG_LABEL_OPERAND 2 (nil)))

 There is a REG_LABEL_OPERAND generated but label is still removed.

Because it should be a REG_LABEL_TARGET?

AFAUI this is a contol flow insn so I'd expect it to be a jump_insn
(and the note will be a TARGET note). But it's not a PC-set insn and a
jump target the compiler will interpret as an infinite loop (if the
insns are really in the order as above) which is clearly not what you
want. So if you emit it as a jump_insn I'm not sure what will
happen...

Is it necessary to emit the label into a basic block?

Ciao!
Steven

Re: [PATCH] Do not remove labels with LABEL_PRESERVE_P

2014-09-24 Thread Steven Bosscher

On Wed, Sep 24, 2014 at 2:51 PM, Ilya Enkovich wrote:
 2014-09-24 16:47 GMT+04:00 Steven Bosscher :
 It is not a control flow instruction. It copies value of instruction
 pointer into a general purpose register.  Therefore REG_LABEL_OPERAND
 seems to be correct.

OK - sorry for being a bit slow on the up-take, I got confused by the
asm syntax :-)

So I'm going to speculate a bit more... What you want to have is:

foo:
insns...
L2:
leaq L2(%rip), rXX


What happens is that L2 is deleted, which is to say converted to a
NOTE_INSN_DELETED_LABEL. Then the notes are re-ordered
(NOTE_INSN_DELETED_LABEL notes are not tied to anything in the insns
stream and can end up anywhere) so you end up with something like,

foo:
L2: # (was deleted)
insns...
leaq L2(%rip),rXX

I bet you'd find that in the failing test case the label is output to
the assembly file but it's simply in the wrong place.  For the large
code model, we get away with it because the prologue is output late
and the order of the insns is not adjusted (a few passes later, the
CFG doesn't even exist anymore so you don't go through cfgcleanup).
But if you emit the label early and let it go through the entire RTL
pipeline then anything can happen.

If the above makes sense, then you'll want to emit the label late, or
not at all, to the insns stream.

If you emit the label late into the insns stream, you'd rewrite the
set_rip as a define_insn_and_split that emits the label as part of the
last splitting pass. But there is no splitting pass late enough to
guarantee that the label and insns won't get separated.

If you don't emit the label to the insns stream, you would write
ix86_output_set_rip() and call that from the define_insns for set_rip.
You'd not emit the label in the expander. You'd create it and make it
an operand, but not emit it. Your ix86_output_set_rip() would write
the label and the set_rip instruction. This is probably the only way
to make 100% sure that the label is always exactly at the set_rip
instruction.

Something like below (completely untested, etc...).

Hope this helps,

Ciao!
Steven


Index: config/i386/i386-protos.h
===
--- config/i386/i386-protos.h   (revision 215483)
+++ config/i386/i386-protos.h   (working copy)
@@ -303,6 +303,7 @@ extern enum attr_cpu ix86_schedule;
 #endif

 extern const char * ix86_output_call_insn (rtx_insn *insn, rtx call_op);
+extern const char * ix86_output_set_rip_insn (rtx *operands);

 #ifdef RTX_CODE
 /* Target data for multipass lookahead scheduling.
Index: config/i386/i386.c
===
--- config/i386/i386.c  (revision 215483)
+++ config/i386/i386.c  (working copy)
@@ -11225,8 +11225,6 @@ ix86_expand_prologue (void)

  gcc_assert (Pmode == DImode);
  label = gen_label_rtx ();
- emit_label (label);
- LABEL_PRESERVE_P (label) = 1;
  tmp_reg = gen_rtx_REG (Pmode, R11_REG);
  gcc_assert (REGNO (pic_offset_table_rtx) != REGNO (tmp_reg));
  insn = emit_insn (gen_set_rip_rex64 (pic_offset_table_rtx,
@@ -12034,8 +12032,6 @@ ix86_expand_split_stack_prologue (void)
  rtx x;

  label = gen_label_rtx ();
- emit_label (label);
- LABEL_PRESERVE_P (label) = 1;
  emit_insn (gen_set_rip_rex64 (reg10, label));
  emit_insn (gen_set_got_offset_rex64 (reg11, label));
  emit_insn (ix86_gen_add3 (reg10, reg10, reg11));
@@ -25016,6 +25012,17 @@ ix86_output_call_insn (rtx_insn *insn, rtx call_op

   return ;
 }
+
+/* Output the assembly for a SET_RIP instruction.  We do so with this output
+   function to ensure that the label and %rip load instruction are together. */
+
+const char *
+ix86_output_set_rip_insn (rtx *operands)
+{
+  output_asm_label (operands[1]);
+  output_asm_insn (lea{q}\t{%l1(%%rip), %0|%0, %l1[rip]}, operands);
+  return ;
+}


 /* Clear stack slot assignments remembered from previous functions.
This is called from INIT_EXPANDERS once before RTL is emitted for each
Index: config/i386/i386.md
===
--- config/i386/i386.md (revision 215483)
+++ config/i386/i386.md (working copy)
@@ -12010,7 +12010,7 @@
   [(set (match_operand:DI 0 register_operand =r)
(unspec:DI [(label_ref (match_operand 1))] UNSPEC_SET_RIP))]
   TARGET_64BIT
-  lea{q}\t{%l1(%%rip), %0|%0, %l1[rip]}
+  * return ix86_output_set_rip_insn (operands);
   [(set_attr type lea)
(set_attr length_address 4)
(set_attr mode DI)])

Re: [PATCH] Do not remove labels with LABEL_PRESERVE_P

2014-09-23 Thread Steven Bosscher

On Fri, Sep 19, 2014 at 10:03 PM, Jeff Law l...@redhat.com wrote:
 On 09/19/14 13:36, Ilya Enkovich wrote:

 Hi,

 During my work on enabling pseudo PIC register I've found that cfg cleaunp
 may remove lables with LABEL_PRESERVE_P set to 1.  In my case I generated
 SET_RIP during expand pass and cfg cleanup removed label it used as an
 operand.  Below is a patch that fixes it.  It is not actually required for
 our latest PIC related patch but still seems to make sense.

 Bootstrapped and tested on linux-x86_64.

 Thanks,
 Ilya
 --
 2014-09-19  Ilya Enkovich  ilya.enkov...@intel.com

 * cfgcleanup.c (try_optimize_cfg): Do not remove label
 with LABEL_PRESERVE_P flag set.

 OK.  Please install.

 Note for those not following the x86 32 bit PIC register discussion, I asked
 Ilya to submit this separately.  It was something an earlier version of his
 patch triggered and it stood out as something that ought to be fixed
 regardless of the final form of the PIC register changes that are in
 progress.

Jeff,

Are you sure this patch is necessary, and is not just papering over
another problem? In the past, all cases I've seen where labels were
removed inadvertently were caused by incorrect reference counting or
missing REG_LABEL_* notes.

Did the label use count drop to zero? Is there a REG_LABEL_TARGET note
for the label operand?

Ciao!
Steven

Re: ptx preliminary rtl patches [2/4]

2014-09-11 Thread Steven Bosscher

On Thu, Sep 11, 2014 at 3:25 PM, Bernd Schmidt wrote:
 Bootstrapped and tested on x86_64-linux, together with the other patches.
 Ok?

This is OK.

Ciao!
Steven

Re: ptx preliminary rtl patches [3/4]

2014-09-11 Thread Steven Bosscher

On Thu, Sep 11, 2014 at 3:26 PM, Bernd Schmidt wrote:
 nvptx will be the first port to use BImode and have STORE_FLAG_VALUE==-1.
 That has exposed a bug in combine where we can end up calling
 num_sign_bit_copies for a BImode value. However, the return value is always
 1 in that case, so it doesn't tell us anything and is going to be
 misinterpreted by the caller.

 Bootstrapped and tested on x86_64-linux, together with the other patches.
 Ok?

This should be handled in num_sign_bit_copies itself, i.e. handle BImode there.

Ciao!
Steven

Re: ptx preliminary rtl patches [4/4]

2014-09-11 Thread Steven Bosscher

On Thu, Sep 11, 2014 at 3:27 PM, Bernd Schmidt wrote:
 It turns out that we're calling eliminate_regs for global variables which
 can't possibly have eliminable regs in their decl. At that point,
 reg_eliminate can be NULL. This patch avoids unnecessary work, and allows us
 to add an assert to eliminate_regs later.

 Bootstrapped and tested on x86_64-linux, together with the other patches.
 Ok?

Why not use is_global_var()?

Ciao!
Steven

Re: ptx preliminary rtl patches [3/4]

2014-09-11 Thread Steven Bosscher

On Thu, Sep 11, 2014 at 6:06 PM, Bernd Schmidt wrote:
 On 09/11/2014 05:55 PM, Steven Bosscher wrote:

 On Thu, Sep 11, 2014 at 3:26 PM, Bernd Schmidt wrote:

 nvptx will be the first port to use BImode and have STORE_FLAG_VALUE==-1.
 That has exposed a bug in combine where we can end up calling
 num_sign_bit_copies for a BImode value. However, the return value is
 always
 1 in that case, so it doesn't tell us anything and is going to be
 misinterpreted by the caller.

 Bootstrapped and tested on x86_64-linux, together with the other patches.
 Ok?


 This should be handled in num_sign_bit_copies itself, i.e. handle BImode
 there.


 What do you expect that function to do different? It returns the correct
 value.


No different. Just that if you want to check whether DECL is a global
variable then we have a predicate for it. So why use TREE_STATIC
instead?

In other words: Just trying to make/keep certain checks consistent. (A
hopeless cause, but a noble one... ;-)

Ciao!
Steven

Re: ptx preliminary rtl patches [3/4]

2014-09-11 Thread Steven Bosscher

On Thu, Sep 11, 2014 at 6:19 PM, Bernd Schmidt wrote:
 What do you expect that function to do different? It returns the correct
 value.


 No different. Just that if you want to check whether DECL is a global
 variable then we have a predicate for it. So why use TREE_STATIC
 instead?

 In other words: Just trying to make/keep certain checks consistent. (A
 hopeless cause, but a noble one... ;-)


 You're talking about a different patch here. This one is about
 num_sign_bit_copies.


Ah. *sigh* can't even keep two patches in my mind at any one time.

The point about num_sign_bit_copies is that it doesn't really return
the correct value IMHO, if there isn't really a correct value to speak
of: What is the sign of TRUE or FALSE, the only two values a BImode
value can take?

A 1-bit precision integer can have value 0 or -1 and in that case
num_sign_bit_copies should be 0. But for a BImode value, it seems to
me that asking for the sign bit or sign bit copies is just wrong.

Ciao!
Steven

Re: [PATCH] Force rtl templates to be inlined

2014-09-02 Thread Steven Bosscher

On Tue, Sep 2, 2014 at 9:22 AM, Andrew Pinski wrote:
 On Tue, Sep 2, 2014 at 12:20 AM, Andi Kleen wrote:

 there have been bugs in the past in the area of always_inline too.

 You're arguing for my patch. It would find those bugs.


 No I am arguing against it since the older versions of GCC we cannot change.

Should such bugs turn up, we can account for them in ansidecl.h.

I think Andi's patch should go in.

Ciao!
Steven

Re: [FORTRAN PATCH] Quash two -Wlogical-not-parentheses warnings

2014-09-01 Thread Steven Bosscher

On Mon, Sep 1, 2014 at 3:23 PM, Marek Polacek wrote:

 diff --git gcc/fortran/interface.c gcc/fortran/interface.c
 index b210d18..68d8545 100644
 --- gcc/fortran/interface.c
 +++ gcc/fortran/interface.c
 @@ -2014,7 +2014,7 @@ compare_parameter (gfc_symbol *formal, gfc_expr *actual,
if (formal-ts.type == BT_CLASS  formal-attr.class_ok
 actual-expr_type != EXPR_NULL
 ((CLASS_DATA (formal)-attr.class_pointer
 -   !formal-attr.intent == INTENT_IN)
 +   (!formal-attr.intent) == INTENT_IN)
|| CLASS_DATA (formal)-attr.allocatable))
  {
if (actual-ts.type != BT_CLASS)

This is certainly not OK, intent is a tri-state.


 diff --git gcc/fortran/trans-expr.c gcc/fortran/trans-expr.c
 index f2ed474..6592c7e 100644
 --- gcc/fortran/trans-expr.c
 +++ gcc/fortran/trans-expr.c
 @@ -4589,7 +4589,7 @@ gfc_conv_procedure_call (gfc_se * se, gfc_symbol * sym,
e-expr_type == EXPR_VARIABLE
(!e-ref
   || (e-ref-type == REF_ARRAY
 -  !e-ref-u.ar.type != AR_FULL))
 +  (!e-ref-u.ar.type) != AR_FULL))
e-symtree-n.sym-attr.optional)
 {
   tmp = fold_build3_loc (input_location, COND_EXPR,

Also not OK.

You probably want to wrap the (in)equality tests in parenthesis.

Ciao!
Steven

[patch] Why xstrdup cgraph node names for dumpfiles?

2014-08-26 Thread Steven Bosscher

Hello Martin, Honza,

I noticed most of the cgraph and IPA files use xstrdup for cgraph node
names when printing to dump_file. Very leaky...

What is the reason for all those xstrdups? I couldn't think of any.

Thoughts?

Ciao!
Steven

* cgraph.c (cgraph_node::get_create): Don't xstrdup cgraph node names.
(cgraph_edge::make_speculative): Likewise.
(cgraph_edge::resolve_speculation): Likewise.
(cgraph_edge::redirect_call_stmt_to_callee): Likewise.
(cgraph_node::dump): Likewise.
* cgraphclones.c (symbol_table::materialize_all_clones): Likewise.
* ipa-cp.c (perhaps_add_new_callers): Likewise.
* ipa-inline.c (report_inline_failed_reason): Likewise.
(want_early_inline_function_p): Likewise.
(edge_badness): Likewise.
(update_edge_key): Likewise.
(flatten_function): Likewise.
(inline_always_inline_functions): Likewise.
(early_inline_small_functions): Likewise.
* ipa-profile.c (ipa_profile): Likewise.
* ipa-prop.c (ipa_print_node_jump_functions): Likewise.
(ipa_make_edge_direct_to_target): Likewise.
(remove_described_reference): Likewise.
(propagate_controlled_uses): Likewise.
* ipa-utils.c (ipa_merge_profiles): Likewise.
* tree-sra.c (convert_callers_for_node): Likewise.

Index: cgraph.c
===
--- cgraph.c(revision 214545)
+++ cgraph.c(working copy)
@@ -489,11 +489,11 @@ cgraph_node::get_create (tree decl)
   if (dump_file)
fprintf (dump_file, Introduced new external node 
 (%s/%i) and turned into root of the clone tree.\n,
-xstrdup (node-name ()), node-order);
+node-name (), node-order);
 }
   else if (dump_file)
 fprintf (dump_file, Introduced new external node 
-(%s/%i).\n, xstrdup (node-name ()),
+(%s/%i).\n, node-name (),
 node-order);
   return node;
 }
@@ -1036,8 +1036,8 @@ cgraph_edge::make_speculative (cgraph_node *n2, gc
 {
   fprintf (dump_file, Indirect call - speculative call
%s/%i = %s/%i\n,
-  xstrdup (n-name ()), n-order,
-  xstrdup (n2-name ()), n2-order);
+  n-name (), n-order,
+  n2-name (), n2-order);
 }
   speculative = true;
   e2 = n-create_edge (n2, call_stmt, direct_count, direct_frequency);
@@ -1155,16 +1155,16 @@ cgraph_edge::resolve_speculation (tree callee_decl
{
  fprintf (dump_file, Speculative indirect call %s/%i = %s/%i has 

   turned out to have contradicting known target ,
-  xstrdup (edge-caller-name ()), edge-caller-order,
-  xstrdup (e2-callee-name ()), e2-callee-order);
+  edge-caller-name (), edge-caller-order,
+  e2-callee-name (), e2-callee-order);
  print_generic_expr (dump_file, callee_decl, 0);
  fprintf (dump_file, \n);
}
  else
{
  fprintf (dump_file, Removing speculative call %s/%i = %s/%i\n,
-  xstrdup (edge-caller-name ()), edge-caller-order,
-  xstrdup (e2-callee-name ()), e2-callee-order);
+  edge-caller-name (), edge-caller-order,
+  e2-callee-name (), e2-callee-order);
}
}
 }
@@ -1284,9 +1284,9 @@ cgraph_edge::redirect_call_stmt_to_callee (void)
  if (dump_file)
fprintf (dump_file, Not expanding speculative call of %s/%i - 
%s/%i\n
 Type mismatch.\n,
-xstrdup (e-caller-name ()),
+e-caller-name (),
 e-caller-order,
-xstrdup (e-callee-name ()),
+e-callee-name (),
 e-callee-order);
  e = e-resolve_speculation ();
  /* We are producing the final function body and will throw away the
@@ -1303,9 +1303,9 @@ cgraph_edge::redirect_call_stmt_to_callee (void)
fprintf (dump_file,
 Expanding speculative call of %s/%i - %s/%i count:
 %PRId64\n,
-xstrdup (e-caller-name ()),
+e-caller-name (),
 e-caller-order,
-xstrdup (e-callee-name ()),
+e-callee-name (),
 e-callee-order,
 (int64_t)e-count);
  gcc_assert (e2-speculative);
@@ -1353,8 +1353,8 @@ cgraph_edge::redirect_call_stmt_to_callee (void)
   if (symtab-dump_file)
 {
   fprintf (symtab-dump_file, updating call of %s/%i - %s/%i: ,
-  xstrdup (e-caller-name ()), e-caller-order,
-  xstrdup (e-callee-name ()), e-callee-order);
+  e-caller-name (), e-caller-order,
+  e-callee-name (), e-callee-order);

Re: [patch] Why xstrdup cgraph node names for dumpfiles?

2014-08-26 Thread Steven Bosscher

On Tue, Aug 26, 2014 at 10:52 PM, Uros Bizjak wrote:
 Hello!

 I noticed most of the cgraph and IPA files use xstrdup for cgraph node
 names when printing to dump_file. Very leaky...

 What is the reason for all those xstrdups? I couldn't think of any.

 Please see [1] and [2].

 [1] https://gcc.gnu.org/ml/gcc-patches/2012-04/msg01904.html
 [2] https://gcc.gnu.org/bugzilla/show_bug.cgi?id=53136


Thanks. Well, that is surprising. The cxx_printable_name_internal uses
a ring buffer for caching the names. Apparently that didn't work in
your PR.

The leaks get quite large over time, especially with a large code base
and many small functions. I noticed this when compiling a large code
base with dumping enabled and ran out of memory.

I assume you also tried increasing the ring buffer size? Oh well,
probably not very robust, and we do seem to leak a lot of strings in
general...

Ciao!
Steven

Re: [BUILDROBOT][PATCH] frv-linux fallout (was: [PATCH 009/236] Replace BB_HEAD et al macros with functions)

2014-08-25 Thread Steven Bosscher

On Mon, Aug 25, 2014 at 9:29 PM, Mike Stump wrote:
 On Aug 25, 2014, at 7:08 AM, David Malcolm wrote:
 It's too late now to switch to this approach, so in the meantime I've
 been working on ways to make my bootstraps as fast as possible.

 -j64 works wonders.  :-)  Though, it is annoying watching the build run at 
 98% idle.

And genautomata just sits there... waiting... waiting... wai..

Ciao!
Steven

[patch] PR fortran/61669

2014-08-23 Thread Steven Bosscher

Hello,

This bug is an error recovery issue. A data declaration is parsed and
accepted, and added to gfc_current_ns-data, but the statement is
rejected. The rejected data decl is not rolled back, causing memory
corruption later on.

Proposed fix is to roll back DATA for rejected statements.

Bootstrappedtested on powerpc64-unknown-linux-gnu. OK for trunk?

Ciao!
Steven
fortran/
PR fortran/61669
* gfortran.h (struct gfc_namespace): Add OLD_DATA field.
* decl.c (gfc_reject_data): New function.
* parse.c *use_modules): Record roll-back point.
(next_statement): Likewise.
(reject_statement): Roll back to last accepted DATA.

testsuite/
PR fortran/61669
* gfortran.dg/pr61669.f90: New test.

Index: fortran/gfortran.h
===
--- fortran/gfortran.h  (revision 214350)
+++ fortran/gfortran.h  (working copy)
@@ -1625,7 +1625,7 @@ typedef struct gfc_namespace
   gfc_st_label *st_labels;
   /* This list holds information about all the data initializers in
  this namespace.  */
-  struct gfc_data *data;
+  struct gfc_data *data, *old_data;
 
   gfc_charlen *cl_list, *old_cl_list;
 
@@ -2941,6 +2941,7 @@ void gfc_free_omp_namelist (gfc_omp_namelist *);
 void gfc_free_equiv (gfc_equiv *);
 void gfc_free_equiv_until (gfc_equiv *, gfc_equiv *);
 void gfc_free_data (gfc_data *);
+void gfc_reject_data (gfc_namespace *);
 void gfc_free_case_list (gfc_case *);
 
 /* matchexp.c -- FIXME too?  */
Index: fortran/decl.c
===
--- fortran/decl.c  (revision 214350)
+++ fortran/decl.c  (working copy)
@@ -178,7 +178,21 @@ gfc_free_data_all (gfc_namespace *ns)
 }
 }
 
+/* Reject data parsed since the last restore point was marked.  */
 
+void
+gfc_reject_data (gfc_namespace *ns)
+{
+  gfc_data *d;
+
+  while (ns-data  ns-data != ns-old_data)
+{
+  d = ns-data-next;
+  free (ns-data);
+  ns-data = d;
+}
+}
+
 static match var_element (gfc_data_variable *);
 
 /* Match a list of variables terminated by an iterator and a right
Index: fortran/parse.c
===
--- fortran/parse.c (revision 214350)
+++ fortran/parse.c (working copy)
@@ -118,6 +118,7 @@ use_modules (void)
   gfc_warning_check ();
   gfc_current_ns-old_cl_list = gfc_current_ns-cl_list;
   gfc_current_ns-old_equiv = gfc_current_ns-equiv;
+  gfc_current_ns-old_data = gfc_current_ns-data;
   last_was_use_stmt = false;
 }
 
@@ -1097,6 +1098,7 @@ next_statement (void)
 
   gfc_current_ns-old_cl_list = gfc_current_ns-cl_list;
   gfc_current_ns-old_equiv = gfc_current_ns-equiv;
+  gfc_current_ns-old_data = gfc_current_ns-data;
   for (;;)
 {
   gfc_statement_label = NULL;
@@ -2045,6 +2047,8 @@ reject_statement (void)
   gfc_free_equiv_until (gfc_current_ns-equiv, gfc_current_ns-old_equiv);
   gfc_current_ns-equiv = gfc_current_ns-old_equiv;
 
+  gfc_reject_data (gfc_current_ns);
+
   gfc_new_block = NULL;
   gfc_undo_symbols ();
   gfc_clear_warning ();
Index: testsuite/gfortran.dg/pr61669.f90
===
--- testsuite/gfortran.dg/pr61669.f90   (revision 0)
+++ testsuite/gfortran.dg/pr61669.f90   (working copy)
@@ -0,0 +1,8 @@
+! { dg-do compile }
+  write (*,(a)) char(12)
+  CHARACTER*80 A /A/  ! { dg-error Unexpected data declaration 
statement }
+  REAL*4 B  ! { dg-error Unexpected data declaration 
statement }
+  write (*,(a)) char(12)
+  DATA B / 0.02 /
+  END
+

[patch] PR fortran/62135

2014-08-21 Thread Steven Bosscher

Hello,

Low-hanging fruit, almost embarassing to fix. But then again I caused
this bug -- in 1999 or so :-)

Will commit after testing.

Ciao!
Steven


fortran/
* resolve.c (resolve_select): Fix list traversal in case the
last element of the CASE list was dropped as unreachable code.

testsuite/
* gfortran.dg/pr62135.f90: New test.

Index: fortran/resolve.c
===
--- fortran/resolve.c   (revision 214292)
+++ fortran/resolve.c   (working copy)
@@ -7761,7 +7761,7 @@ resolve_select (gfc_code *code, bool select_type)
/* Strip all other unreachable cases.  */
if (body-ext.block.case_list)
  {
-   for (cp = body-ext.block.case_list; cp-next; cp = cp-next)
+   for (cp = body-ext.block.case_list; cp  cp-next; cp = cp-next)
  {
if (cp-next-unreachable)
  {
Index: testsuite/gfortran.dg/pr62135.f90
===
--- testsuite/gfortran.dg/pr62135.f90   (revision 0)
+++ testsuite/gfortran.dg/pr62135.f90   (working copy)
@@ -0,0 +1,17 @@
+! { dg-do compile }
+! { dg-options -Wsurprising }
+
+   PROGRAM PR62135
+  IMPLICIT NONE
+  CHARACTER*1 :: choice
+  choice = 'x'
+  SELECT CASE (choice)
+ ! This triggered an ICE: an unreachable case clause
+ ! as the last of a list.
+ CASE ('2':'7','9':'0') ! { dg-warning can never be matched }
+WRITE(*,*) barf
+ CASE DEFAULT
+CONTINUE
+  END SELECT
+   END PROGRAM PR62135
+

Re: [RFC, PATCH][LRA, MIPS] ICE: in decompose_normal_address, at rtlanal.c:5817

2014-08-15 Thread Steven Bosscher

On Fri, Aug 15, 2014 at 5:45 PM, Robert Suchanek wrote:
 gcc/
 * rtlanal.c (get_base_term): Accept HIGH as the base term.


 diff --git gcc/rtlanal.c gcc/rtlanal.c
 index 82cfc1bf..2bea2ca 100644
 --- gcc/rtlanal.c
 +++ gcc/rtlanal.c
 @@ -5624,6 +5624,7 @@ get_base_term (rtx *inner)
  inner = strip_address_mutations (XEXP (*inner, 0));
if (REG_P (*inner)
|| MEM_P (*inner)
 +  || GET_CODE (*inner) == HIGH
|| GET_CODE (*inner) == SUBREG)
  return inner;
return 0;

This is not correct, BASE is a *variable* expression, HIGH is a
*constant* expression.

It's hard to say what the correct fix should be, but it sounds like
the address you get after the substitutions should be simplified
(folded).

B.R.,
Steven

Re: [PATCH] Add target macro to override DWARF2 frame register size

2014-08-01 Thread Steven Bosscher

On Fri, Aug 1, 2014 at 3:31 PM, Matthew Fortune wrote:
 This patch adds a target macro

Please don't add target macros. Add a hook if you must, but we're
supposed to remove target macros, not add new ones :-)

Ciao!
Steven

Re: [PATCH] Fix INSN_TICK heuristic for SCHED_PRESSURE

2014-07-14 Thread Steven Bosscher

On Mon, Jul 14, 2014 at 6:12 AM, Maxim Kuvyrkov wrote:
 Hi,

 This patch fixes 2 of the [several] problems in rank_for_schedule heuristics 
 for SCHED_PRESSURE_MODEL.  SCHED_PRESSURE_MODEL is used for first scheduling 
 pass in ARM backend by default.

 The first one is when INSN_TICK of both instructions are equal, and 
 rank_for_schedule returns a tie result, even though there are more 
 heuristics down the path to break the tie.

 The second one is to account for the fact that model_index() of two 
 instructions is meaningful only when both instructions are in the current 
 basic block.

 Bootstrapped and tested on {x86_64,arm,aarch64}-linux.

 OK to apply?

s/INSN_BB/BLOCK_FOR_INSN/

OK with that change.

Ciao!
Steven

Re: [PATCH] Fix INSN_TICK heuristic for SCHED_PRESSURE

2014-07-14 Thread Steven Bosscher

On Mon, Jul 14, 2014 at 10:09 AM, Maxim Kuvyrkov wrote:
 On Jul 14, 2014, at 8:05 PM, Steven Bosscher wrote:

 On Mon, Jul 14, 2014 at 6:12 AM, Maxim Kuvyrkov wrote:
 Hi,

 This patch fixes 2 of the [several] problems in rank_for_schedule 
 heuristics for SCHED_PRESSURE_MODEL.  SCHED_PRESSURE_MODEL is used for 
 first scheduling pass in ARM backend by default.

 The first one is when INSN_TICK of both instructions are equal, and 
 rank_for_schedule returns a tie result, even though there are more 
 heuristics down the path to break the tie.

 The second one is to account for the fact that model_index() of two 
 instructions is meaningful only when both instructions are in the current 
 basic block.

 Bootstrapped and tested on {x86_64,arm,aarch64}-linux.

 OK to apply?

 s/INSN_BB/BLOCK_FOR_INSN/

 That would be a mistake, see definition of INSN_BB in sched-int.h.  Scheduler 
 uses its own basic block numbering.

Eh, right. I seem to be confused here with older sched CFG oddities
but this isn't one of those...



 OK with that change.

 I assume OK without the change?

Yup.

Ciao!
Steven

Re: Fix PR61772: ifcvt removing asm volatile gotos

2014-07-11 Thread Steven Bosscher

On Fri, Jul 11, 2014 at 2:34 PM, Michael Matz wrote:
 PR rtl-optimization/61772
 * ifcvt.c (dead_or_predicable): Check jump to be free of side
 effects.

This is OK.

Ciao!
Steven

Re: Instructions vs Expressions in the backend (was Re: RFA: Rework FOR_BB_INSNS iterators)

2014-06-25 Thread Steven Bosscher

On Wed, Jun 25, 2014 at 10:46 PM, Jeff Law wrote:
 On 06/25/14 02:54, Richard Sandiford wrote:

 SEQUENCE is just weird though :-)  It would be good to have an alternative
 representation, but that'd be a lot of work on reorg.

 Yea.  And I don't think anyone is keen on rewriting reorg.

Rewriting/revamping reorg is not really the problem, IMHO. Last year I
hacked a bit on a new delayed-branch scheduler, and I got results that
were not too bad (especially considering my GCC time is only a few
hours per week).

The the real problem is designing that alternative representation of
delay slots, and teaching the back ends about that. Just communicating
a delayed-branch sequence to the back ends is pretty awful (a global
variable in final.c) and a lot of back-end code has non-obvious
assumptions about jumping across insns contained in SEQUENCEs. There's
also one back end (mep?) that uses SEQUENCE for bundles (RTL abuse is
not considered bad practice by all maintainers ;-).

(I actually found SEQUENCE to be quite nice to work with when I
allowed them to be the last insn in a basic block. One of my goals was
to retain the CFG across dbr_sched, but that turned out to be blocked
by other things than dbr_sched, like fake insns that some back ends
emit between basic blocks, e.g. for constant pools).

Having some kind of insns container like SEQUENCE would IMHO not be
a bad thing, perhaps a necessity, and perhaps even an improvement
(like for representing bundles), as long as we can assign sane
semantics to it w.r.t. next/prev insn. SEQUENCE wasn't designed with
its current application in mind...

Ciao!
Steven

Re: breakage with [PATCH 1/6] Add FOR_EACH_INSN{_INFO}_{DEFS,USES,EQ_USES}

2014-06-16 Thread Steven Bosscher

On Mon, Jun 16, 2014 at 12:36 AM, Hans-Peter Nilsson wrote:
 On Sun, 15 Jun 2014, Steven Bosscher wrote:
 On Sun, Jun 15, 2014 at 1:27 PM, Hans-Peter Nilsson wrote:
  /tmp/hpautotest-gcc0/gcc/gcc/auto-inc-dec.c: In function 'void
  merge_in_block(int, basic_block_def*)':
  /tmp/hpautotest-gcc0/gcc/gcc/auto-inc-dec.c:1442: error: 'uid'
  was not declared in this scope
  make[2]: *** [auto-inc-dec.o] Error 1
 
  brgds, H-P


 Bah, this is why I just *hate* all the gcc code that's only compiled
 if certain #defines are set...

 I couldn't agree more.  Might not have been obvious when writing
 the mosly-mechanical patch, still the auto-inc-dec.c name should
 have been a red flag that a representative target should have
 been tested (i.e. not x86_64 and i686).

I agree, but I think you'd agree with me if I say that Richard S. is
one of the few people who almost always goes beyond the normal amount
of testing required for a patch. Breakage like this will just happen
to us all, every once in a while, until we compile all middle-end code
at least, regardless of #defines and whatnot (conditionally compiled
code, from the top of my head: CC0, scheduler, dbrsched, auto-inc-dec,
HAVE_conditional_move, etc...).

Ciao!
Steven

Re: breakage with [PATCH 1/6] Add FOR_EACH_INSN{_INFO}_{DEFS,USES,EQ_USES}

2014-06-15 Thread Steven Bosscher

On Sun, Jun 15, 2014 at 1:27 PM, Hans-Peter Nilsson wrote:
 On Sat, 14 Jun 2014, Richard Sandiford wrote:

 To make the final representation change easier, this patch introduces
 macros for iterating over lists of defs, uses and eq_uses.  At the
 moment there are three possible keys when accessing df_ref lists:
 the insn rtx (DF_INSN_*), the insn uid (DF_INSN_UID_*) and the
 df_insn_info (DF_INSN_INFO_*).  I don't think it's worth adding
 iterators for uids though.  Any code that's going to the trouble of
 caching the uid might as well go the whole hog and cache the underlying
 df_insn_info.

 After the feedback to the BB iterator patch:

   https://gcc.gnu.org/ml/gcc-patches/2014-06/msg00676.html

 I've kept the iterator variable definitions outside the FOR_* macros
 rather than make the FOR_* macros define the variables themselves.

 Richard


 gcc/
   * df.h (DF_INSN_INFO_MWS, FOR_EACH_INSN_INFO_DEF): New macros.
   (FOR_EACH_INSN_INFO_USE, FOR_EACH_INSN_INFO_EQ_USE): Likewise.
   (FOR_EACH_INSN_DEF, FOR_EACH_INSN_USE, FOR_EACH_INSN_EQ_USE): Likewise.
   * auto-inc-dec.c (find_inc, merge_in_block): Use them.


 One of these patches (in 211677:211684) broke cris-elf:

 g++ -c   -g -O2 -DIN_GCC  -DCROSS_DIRECTORY_STRUCTURE
 -fno-exceptions -fno-rtti -fasynchronous-unwind-tables -W -Wall
 \
 -Wwrite-strings -Wcast-qual -Wmissing-format-attribute
 -Woverloaded-virtual -pedantic -Wno-long-long
 -Wno-variadic-macr\
 os -Wno-overlength-strings -fno-common  -DHAVE_CONFIG_H -I. -I.
 -I/tmp/hpautotest-gcc0/gcc/gcc -I/tmp/hpautotest-gcc0/g\
 cc/gcc/. -I/tmp/hpautotest-gcc0/gcc/gcc/../include
 -I/tmp/hpautotest-gcc0/gcc/gcc/../libcpp/include
 -I/tmp/hpautotest-g\
 cc0/cris-elf/gccobj/./gmp -I/tmp/hpautotest-gcc0/gcc/gmp
 -I/tmp/hpautotest-gcc0/cris-elf/gccobj/./mpfr -I/tmp/hpautotes\
 t-gcc0/gcc/mpfr -I/tmp/hpautotest-gcc0/gcc/mpc/src
 -I/tmp/hpautotest-gcc0/gcc/gcc/../libdecnumber
 -I/tmp/hpautotest-gc\
 c0/gcc/gcc/../libdecnumber/dpd -I../libdecnumber
 -I/tmp/hpautotest-gcc0/gcc/gcc/../libbacktrace-o
 auto-inc-dec.o -M\
 T auto-inc-dec.o -MMD -MP -MF ./.deps/auto-inc-dec.TPo
 /tmp/hpautotest-gcc0/gcc/gcc/auto-inc-dec.c
 /tmp/hpautotest-gcc0/gcc/gcc/auto-inc-dec.c: In function 'void
 merge_in_block(int, basic_block_def*)':
 /tmp/hpautotest-gcc0/gcc/gcc/auto-inc-dec.c:1442: error: 'uid'
 was not declared in this scope
 make[2]: *** [auto-inc-dec.o] Error 1

 brgds, H-P


Bah, this is why I just *hate* all the gcc code that's only compiled
if certain #defines are set...

Can you please try:

Index: auto-inc-dec.c
===
--- auto-inc-dec.c  (revision 211685)
+++ auto-inc-dec.c  (working copy)
@@ -1439,7 +1439,8 @@ merge_in_block (int max_reg, basic_block bb)
}
}
   else if (dump_file)
-   fprintf (dump_file, skipping update of deleted insn %d\n, uid);
+   fprintf (dump_file, skipping update of deleted insn %d\n,
+INSN_UID (insn));
 }

   /* If we were successful, try again.  There may have been several

Ciao!
Steven

Re: [PATCH 1/6] Add FOR_EACH_INSN{_INFO}_{DEFS,USES,EQ_USES}

2014-06-14 Thread Steven Bosscher

On Sat, Jun 14, 2014 at 9:43 PM, Richard Sandiford wrote:
 gcc/
 * df.h (DF_INSN_INFO_MWS, FOR_EACH_INSN_INFO_DEF): New macros.
 (FOR_EACH_INSN_INFO_USE, FOR_EACH_INSN_INFO_EQ_USE): Likewise.
 (FOR_EACH_INSN_DEF, FOR_EACH_INSN_USE, FOR_EACH_INSN_EQ_USE): 
 Likewise.
 * auto-inc-dec.c (find_inc, merge_in_block): Use them.
 * combine.c (create_log_links): Likewise.
 * compare-elim.c (find_flags_uses_in_insn): Likewise.
 (try_eliminate_compare): Likewise.
 * cprop.c (make_set_regs_unavailable, mark_oprs_set): Likewise.
 * dce.c (deletable_insn_p, find_call_stack_args): Likewise.
 (remove_reg_equal_equiv_notes_for_defs): Likewise.
 (reset_unmarked_insns_debug_uses, mark_reg_dependencies): Likewise.
 (word_dce_process_block, dce_process_block): Likewise.
 * ddg.c (def_has_ccmode_p): Likewise.
 * df-core.c (df_bb_regno_first_def_find): Likewise.
 (df_bb_regno_last_def_find, df_find_def, df_find_use): Likewise.
 * df-problems.c (df_rd_simulate_one_insn): Likewise.
 (df_lr_bb_local_compute, df_live_bb_local_compute): Likewise.
 (df_chain_remove_problem, df_chain_insn_top_dump): Likewise.
 (df_chain_insn_bottom_dump, df_word_lr_bb_local_compute): Likewise.
 (df_word_lr_simulate_defs, df_word_lr_simulate_uses): Likewise.
 (df_remove_dead_eq_notes, df_note_bb_compute): Likewise.
 (df_simulate_find_defs, df_simulate_find_uses): Likewise.
 (df_simulate_find_noclobber_defs, df_simulate_defs): Likewise.
 (df_simulate_uses, df_md_simulate_one_insn): Likewise.
 * df-scan.c (df_reorganize_refs_by_reg_by_insn): Likewise.
 * fwprop.c (local_ref_killed_between_p): Likewise.
 (all_uses_available_at, free_load_extend): Likewise.
 * gcse.c (update_bb_reg_pressure, calculate_bb_reg_pressure): 
 Likewise.
 * hw-doloop.c (scan_loop): Likewise.
 * ifcvt.c (dead_or_predicable): Likewise.
 * init-regs.c (initialize_uninitialized_regs): Likewise.
 * ira-lives.c (mark_hard_reg_early_clobbers): Likewise.
 (process_bb_node_lives): Likewise.
 * ira.c (compute_regs_asm_clobbered, build_insn_chain): Likewise.
 (find_moveable_pseudos): Likewise.
 * loop-invariant.c (check_dependencies, record_uses): Likewise.
 * recog.c (peep2_find_free_register): Likewise.
 * ree.c (get_defs): Likewise.
 * regstat.c (regstat_bb_compute_ri): Likewise.
 (regstat_bb_compute_calls_crossed): Likewise.
 * sched-deps.c (find_inc, find_mem): Likewise.
 * sel-sched-ir.c (maybe_downgrade_id_to_use): Likewise.
 (maybe_downgrade_id_to_use, setup_id_reg_sets): Likewise.
 * shrink-wrap.c (requires_stack_frame_p): Likewise.
 (prepare_shrink_wrap): Likewise.
 * store-motion.c (compute_store_table, build_store_vectors): Likewise.
 * web.c (union_defs, pass_web::execute): Likewise.
 * config/i386/i386.c (increase_distance, insn_defines_reg): Likewise.
 (insn_uses_reg_mem, ix86_ok_to_clobber_flags): Likewise.

OK. Nice :-)

Ciao!
Steven

Re: [PATCH 2/6] FOR_EACH_ARTIFICIAL_{DEF,USE}

2014-06-14 Thread Steven Bosscher

On Sat, Jun 14, 2014 at 9:44 PM, Richard Sandiford wrote:
 gcc/
 * df.h (FOR_EACH_ARTIFICIAL_USE, FOR_EACH_ARTIFICIAL_DEF): New macros.
 * cse.c (cse_extended_basic_block): Use them.
 * dce.c (mark_artificial_use): Likewise.
 * df-problems.c (df_rd_simulate_artificial_defs_at_top): Likewise.
 (df_lr_bb_local_compute, df_live_bb_local_compute): Likewise.
 (df_chain_remove_problem, df_chain_bb_dump): Likewise.
 (df_word_lr_bb_local_compute, df_note_bb_compute): Likewise.
 (df_simulate_initialize_backwards): Likewise.
 (df_simulate_finalize_backwards): Likewise.
 (df_simulate_initialize_forwards): Likewise.
 (df_md_simulate_artificial_defs_at_top): Likewise.
 * df-scan.c (df_reorganize_refs_by_reg_by_insn): Likewise.
 * regrename.c (init_rename_info): Likewise.
 * regstat.c (regstat_bb_compute_ri): Likewise.
 (regstat_bb_compute_calls_crossed): Likewise.

OK.

Ciao!
Steven

Re: [PATCH 3/6] Add FOR_EACH_INSN_INFO_MW

2014-06-14 Thread Steven Bosscher

On Sat, Jun 14, 2014 at 9:45 PM, Richard Sandiford wrote:
 gcc/
 * df.h (FOR_EACH_INSN_INFO_MW): New macro.
 * df-problems.c (df_note_bb_compute): Use it.
 * regstat.c (regstat_bb_compute_ri): Likewise.

OK.

Ciao!
Steven

Re: [PATCH 4/6] Add df_single_{def,use} helper functions

2014-06-14 Thread Steven Bosscher

On Sat, Jun 14, 2014 at 9:47 PM, Richard Sandiford wrote:
 IRA wants to know whether a particular insn has a single def or use.
 That seems worth putting in a utility function, again to hide the
 underlying representation a bit.

I would have sworn we already had functions for this, but apparently not...

 * ira.c (find_moveable_pseudos): Use them.

OK.

Ciao!
Steven

Re: [PATCH 5/6] Remove dead code

2014-06-14 Thread Steven Bosscher

On Sat, Jun 14, 2014 at 9:48 PM, Richard Sandiford wrote:
 This patch removes some dead code that would otherwise need to be
 changed in the final patch.

These functions were intended to allow passes to update DF info
manually. That never was a very practical idea, apparently.

 gcc/
 * df.h (df_ref_create, df_ref_remove): Delete.
 * df-scan.c (df_ref_create, df_ref_compress_rec): Likewise.
 (df_ref_remove): Likewise.

OK.

Ciao!
Steven

Re: [PATCH 6/6] Use a linked list for insn defs and uses

2014-06-14 Thread Steven Bosscher

On Sat, Jun 14, 2014 at 9:53 PM, Richard Sandiford wrote:
 gcc/
 * df.h (df_mw_hardreg, df_base_ref): Add a link pointer.
 (df_insn_info): Turn defs, uses, eq_uses and mw_hardregs into linked
 lists.
 (df_scan_bb_info): Likewise artificial_defs and artificial_uses.
 (FOR_EACH_INSN_INFO_DEF, FOR_EACH_INSN_INFO_USE)
 (FOR_EACH_INSN_INFO_EQ_USE, FOR_EACH_INSN_INFO_MW)
 (FOR_EACH_ARTIFICIAL_USE, FOR_EACH_ARTIFICIAL_DEF)
 (df_get_artificial_defs, df_get_artificial_uses)
 (df_single_def, df_single_use): Update accordingly.
 (df_refs_chain_dump): Take the first element in a linked list as
 parameter, rather than a pointer to an array of pointers.
 * df-core.c (df_refs_chain_dump, df_mws_dump): Likewise.
 * df-problems.c (df_rd_bb_local_compute_process_def): Likewise.
 (df_chain_create_bb_process_use): Likewise.
 (df_md_bb_local_compute_process_def): Likewise.
 * fwprop.c (process_defs, process_uses): Likewise.
 (register_active_defs, update_uses): Likewise.
 (forward_propagate_asm): Update for new df_ref linking.
 * df-scan.c (df_scan_free_ref_vec, df_scan_free_mws_vec): Delete.
 (df_null_ref_rec, df_null_mw_rec): Likewise.
 (df_scan_free_internal): Don't free df_ref and df_mw_hardreg lists
 explicitly.
 (df_scan_free_bb_info): Remove check for null artificial_defs.
 (df_install_ref_incremental): Adjust for new df_ref linking.
 Use a single-element insertion rather than a full sort.
 (df_ref_chain_delete_du_chain): Take the first element
 in a linked list as parameter, rather than a pointer to an array of
 pointers.
 (df_ref_chain_delete, df_mw_hardreg_chain_delete): Likewise.
 (df_add_refs_to_table, df_refs_verify, df_mws_verify): Likewise.
 (df_insn_info_delete): Remove check for null defs and call to
 df_scan_free_mws_vec.
 (df_insn_rescan): Initialize df_ref and df_mw_hardreg lists to
 null rather than df_null_*_rec.
 (df_insn_rescan_debug_internal): Likewise, and update null
 checks in the same way.  Remove check for null defs.
 (df_ref_change_reg_with_loc_1): Fix choice of list for defs.
 Move a single element rather doing a full sort.
 (df_mw_hardreg_chain_delete_eq_uses): Adjust for new df_mw_hardreg
 linking.
 (df_notes_rescan): Likewise.  Use a merge rather than a full sort.
 Initialize df_ref and df_mw_hardreg lists to null rather than
 df_null_*_rec.
 (df_ref_compare): Take df_refs as parameter, transferring the
 old interface to...
 (df_ref_ptr_compare): ...this new function.
 (df_sort_and_compress_refs): Update accordingly.
 (df_mw_compare): Take df_mw_hardregs as parameter, transferring the
 old interface to...
 (df_mw_ptr_compare): ...this new function.
 (df_sort_and_compress_mws): Update accordingly.
 (df_install_refs, df_install_mws): Return a linked list rather than
 an array of pointers.
 (df_refs_add_to_chains): Assert that old lists are empty rather
 than freeing them.
 (df_insn_refs_verify): Don't handle null defs speciailly.
 * web.c (union_match_dups): Update for new df_ref linking.


I would prefer a macro for base.next_loc (DF_REF_NEXT_LOC?).

Other than that: OK.

Ciao!
Steven

Re: Remove some unnecessary null checks in df.h

2014-06-14 Thread Steven Bosscher

On Sat, Jun 14, 2014 at 10:02 PM, Richard Sandiford wrote:
 DF_REF_REG_USE_P and DF_MWS_REG_USE_P checked for null arguments
 but the def equivalents didn't.  There doesn't seem to be any
 need for this difference.

Right, looking at the users (one user for each) of these macros, I
don't think REF ever can be NULL.

 gcc/
 * df.h (DF_REF_REG_USE_P, DF_MWS_REG_USE_P): Remove null checks.


OK.

Ciao!
Steven

Re: [PATCH 0/6] Make df_ref representation more efficient

2014-06-14 Thread Steven Bosscher

On Sat, Jun 14, 2014 at 9:36 PM, Richard Sandiford wrote:
 Using a linked list gives a consistent 2% compile-time improvement for
 fold-const.ii -O0 and ~1% for various -O2 compiles I tried.  The df
 routines do still show up high on the profile though.

Can you explain a bit more about what shows up high?

Ciao!
Steven

Re: [patch i386]: Combine memory and indirect jump

2014-06-11 Thread Steven Bosscher

On Wed, Jun 11, 2014 at 10:32 AM, Kai Tietz wrote:
 this patch adds simple combining of indirect-jumps on memory-address.
 This patch is pretty similar to sibcall-combing.
 ChangeLog

 2014-06-11  Kai Tietz  ktietz@...

 * config/i386/i386.md (peehole2): To combine
 indirect jump with memory.


Likely fixes part of PR39284, xf. https://gcc.gnu.org/PR39284#c12

Ciao!
Steven

Re: [PATCH, loop2_invariant] Pre-check invariants

2014-06-10 Thread Steven Bosscher

On Tue, Jun 10, 2014 at 11:55 AM, Zhenqiang Chen wrote:

 * loop-invariant.c (find_invariant_insn): Skip invariants, which
 can not make a valid insn during replacement in move_invariant_reg.

 --- a/gcc/loop-invariant.c
 +++ b/gcc/loop-invariant.c
 @@ -881,6 +881,35 @@ find_invariant_insn (rtx insn, bool
 always_reached, bool always_executed)
|| HARD_REGISTER_P (dest))
  simple = false;

 +  /* Pre-check candidate to skip the one which can not make a valid insn
 + during move_invariant_reg.  */
 +  if (flag_ira_loop_pressure  df_live  simple
 +   REG_P (dest)  DF_REG_DEF_COUNT (REGNO (dest))  1)

Why only do this with (flag_ira_loop_pressure  df_live)? If the
invariant can't be moved, we should ignore it regardless of whether
register pressure is taken into account.


 +{
 +  df_ref use;
 +  rtx ref;
 +  unsigned int i = REGNO (dest);
 +  struct df_insn_info *insn_info;
 +  df_ref *def_rec;
 +
 +  for (use = DF_REG_USE_CHAIN (i); use; use = DF_REF_NEXT_REG (use))
 +   {
 + ref = DF_REF_INSN (use);
 + insn_info = DF_INSN_INFO_GET (ref);
 +
 + for (def_rec = DF_INSN_INFO_DEFS (insn_info); *def_rec; def_rec++)
 +   if (DF_REF_REGNO (*def_rec) == i)
 + {
 +   /* Multi definitions at this stage, most likely are due to
 +  instruction constrain, which requires both read and write
 +  on the same register.  Since move_invariant_reg is not
 +  powerful enough to handle such cases, just ignore the INV
 +  and leave the chance to others.  */
 +   return;
 + }
 +   }
 +}
 +
if (!may_assign_reg_p (SET_DEST (set))
|| !check_maybe_invariant (SET_SRC (set)))
  return;


Can you put your new check between may_assign_reg_p (dest) and
check_maybe_invariant? The may_assign_reg_p check is cheap and
triggers quite often.

Looks good to me otherwise.

Ciao!
Steven

Re: [PATCH, loop2_invariant] Skip inv (marked as move) from depends_on

2014-06-10 Thread Steven Bosscher

On Tue, Jun 10, 2014 at 11:32 AM, Zhenqiang Chen wrote:

 * loop-invariant.c (get_inv_cost): Skip invariants, which are marked
 as move, from depends_on.


This is OK.

Ciao!
Steven

Re: [PATCH, loop2_invariant, 1/2] Check only one register class

2014-06-10 Thread Steven Bosscher

On Tue, Jun 10, 2014 at 11:22 AM, Zhenqiang Chen wrote:
 Hi,

 For loop2-invariant pass, when flag_ira_loop_pressure is enabled,
 function gain_for_invariant checks the pressures of all register
 classes. This does not make sense since one invariant might impact
 only one register class.

 The patch enhances functions get_inv_cost and gain_for_invariant to
 check only the register pressure of the invariant if possible.

This patch may work for targets with more-or-less orthogonal reg
classes, but not if there is a lot of overlap between reg classes.

So I don't think this approach is OK.

Ciao!
Steven

Re: [PATCH, loop2_invariant, 2/2] Change heuristics for identical invariants

2014-06-10 Thread Steven Bosscher

On Tue, Jun 10, 2014 at 11:23 AM, Zhenqiang Chen wrote:
 * loop-invariant.c (struct invariant): Add a new member: eqno;
 (find_identical_invariants): Update eqno;
 (create_new_invariant): Init eqno;
 (get_inv_cost): Compute comp_cost wiht eqno;
 (gain_for_invariant): Take spill cost into account.

Look OK except ...

 @@ -1243,7 +1256,13 @@ gain_for_invariant (struct invariant *inv,
 unsigned *regs_needed,
  + IRA_LOOP_RESERVED_REGS
  - ira_class_hard_regs_num[cl];
if (size_cost  0)
 -   return -1;
 +   {
 + int spill_cost = target_spill_cost [speed] * (int) regs_needed[cl];
 + if (comp_cost = spill_cost)
 +   return -1;
 +
 + return 2;
 +   }
else
 size_cost = 0;
  }

... why return 2, instead of just falling through to return
comp_cost - size_cost;?

Ciao!
Steven

Re: [PATCH] Trust TREE_ADDRESSABLE

2014-06-07 Thread Steven Bosscher

On Sat, Jun 7, 2014 at 12:59 PM, Eric Botcazou wrote:
 In Ada we don't mark (external) variables as addressable if we don't
 see their address taken.

 You have to (now).  The testing was of course to detect this...

 Well, you need to define what TREE_ADDRESSABLE means now, because according to

 /* In VAR_DECL, PARM_DECL and RESULT_DECL nodes, nonzero means address
of this is needed.  So it cannot be in a register.
 [...]
 #define TREE_ADDRESSABLE(NODE) ((NODE)-base.addressable_flag)

 your change is clearly wrong and the Ada compiler clearly right.

Clearly?

An external variable is a VAR_DECL that cannot be in a register. It
can be loaded into a register (or stored into), and for that its
address is needed. So I would expect an external variable to be marked
addressable by default.

I was always surprised that this was not the case before Richi's change.


  And auditing
 the various front-ends might also be in order here if they really need to mark
 every single external variable as addressable to be safe wrt aliasing.

Right. And this should have been done (clearly ;-) ) before the patch
was committed...

Ciao!
Steven

Re: RFA: Rework FOR_BB_INSNS iterators

2014-06-07 Thread Steven Bosscher

On Sat, Jun 7, 2014 at 7:54 PM, Richard Sandiford wrote:
 The two parts of the loop condition are really handling two different
 kinds of block: ones like entry and exit that are completely empty
 and normal ones that have at least a block note.  There's no real
 need to check for null INSNs in normal blocks.

Block notes should go away some day, they're just remains of a time
when there was no actual CFG in the compiler.


 Also, refetching NEXT_INSN (BB_END (BB)) for each iteration can be expensive.
 If we're prepared to say that the loop body can't insert instructions
 for another block immediately after BB_END,

This can happen with block_label() if e.g. a new jump is inserted
for one reason or another. Not very likely for passes working in
cfglayout mode, but post-RA there may be places that need this
(splitters, peepholes, machine dependent reorgs, etc.).

So even if we're prepared to say what you suggest, I don't think you
can easily enforce it.


 It's easier to change these macros if they define the INSN variables
 themselves.

If you're going to hack these iterators anyway (much appreciated BTW),
I suggest to make them similar to the gsi, loop, edge, and bitmap
iterators: A new insn_iterator structure to hold the variables and
static inline functions wrapped in the macros. This will also be
helpful if (when) we ever manage to make the type for an insn a
non-rtx...



 +/* For iterating over insns in a basic block.  The iterator allows the loop
 +   body to delete INSN.  It also ignores any instructions that the body
 +   inserts between INSN and the following instruction.  */
 +#define FOR_BB_INSNS(BB, INSN) \
 +  for (rtx INSN = BB_HEAD (BB), INSN##_cond_ = INSN, INSN##_next_, \
 +   INSN##_end_ = INSN ? NEXT_INSN (BB_END (BB)) : NULL_RTX;  
   \
 +   INSN##_cond_  (INSN##_next_ = NEXT_INSN (INSN), true);  
   \
 +   INSN = INSN##_next_,\
 +   INSN##_cond_ = (INSN != INSN##_end_ ? (rtx) 1 : NULL_RTX))

This just makes my eyes hurt...


What about cases where a FOR_BB_INSNS is terminated before reaching
the end of a basic block, and you need to know at what insn you
stopped? Up to now, you could do:

  rtx insn; basic_block bb;
  FOR_BB_INSNS (bb, insn)
{
  ... // do stuff
  if (something) break;
}
  do_something_with (insn);

Looks like this is no longer possible with the implementation of
FOR_BB_INSNS of your patch.

I would not approve this patch, but let's wait what others think of it.

Ciao!
Steven

Re: [PATCH] Updates merged bb count

2014-05-30 Thread Steven Bosscher

On Fri, May 30, 2014 at 11:43 PM, Dehao Chen wrote:
 Index: gcc/testsuite/gcc.dg/tree-prof/merge_block.c
 ===
 --- gcc/testsuite/gcc.dg/tree-prof/merge_block.c (revision 0)
 +++ gcc/testsuite/gcc.dg/tree-prof/merge_block.c (revision 0)
 @@ -0,0 +1,20 @@
 +
 +/* { dg-options -O2 -fno-ipa-pure-const
 -fdump-tree-optimized-blocks-details -fno-early-inlining } */
 +int a[8];
 +int t()
 +{
 + int i;
 + for (i = 0; i  3; i++)
 + if (a[i])
 + break;
 + return i;
 +}
 +main ()
 +{
 +  int i;
 +  for (i = 0; i  1000; i++)
 +t ();
 +  return 0;
 +}
 +/* { dg-final-use { scan-tree-dump-not Invalid sum optimized} } */
 +/* { dg-final-use { cleanup-tree-dump optimized } } */


I suppose you want to avoid having t() inlined into main()? If so,
then I'd suggest adding __attribute__((__noinline__,__noclone__)) to
robustify the test case.

Ciao!
Steven

1 2 3 4 5 6 7 8 9 10 >

1 - 100 of 1008 matches

Mail list logo