On 10/29/13 09:12, Vladimir Makarov wrote:
Tomorrow I'd like to commit the following patch.
The patch removes the regmove pass.
I'm buying next time we get together :-) You have my eternal gratitude.
I've found only one useful transformation in the regmove pass:
  dst = src                                 dst = src (src dies)
  ... no dst or src modification     =>     src changed on dst
  src dies                                  ...
It is a kind of value numbering technique that decreases register
pressure by removing one live range. The transformation is not
triggered frequently (about 30 times in all of combine.c) and its
effect is quite small, but there is no harm in it either.
So I added the code to IRA practically without changes (it would
be nice to do this in a more general way, e.g. not only in BB
scope, but I am not sure that it would make visible
improvements and I have no time to try this currently).
I wouldn't be concerned about generalizing this too much. I can't see
how it's *that* important, and we can always come back to it if we
decide it is important.
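For readers who want the quoted transformation spelled out, here is a minimal sketch in C. It is not the code from the patch: the `struct insn` layout, the `src_dies` flags (a stand-in for death notes) and the function name `shorten_src_live_range` are all hypothetical, and the sketch only works on a toy list of instructions inside one basic block.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define NO_REG (-1)

/* Toy instruction "dst = op (src[0], src[1])"; hypothetical, not GCC RTL.  */
struct insn {
    int dst;           /* Register written, or NO_REG.  */
    int src[2];        /* Registers read, NO_REG if unused.  */
    bool src_dies[2];  /* True if src[k] is dead after this insn.  */
    bool is_copy;      /* True for a plain "dst = src[0]" move.  */
};

/* Return true if INSN reads REG; set *DIES if REG dies in INSN.  */
static bool reads_reg(const struct insn *insn, int reg, bool *dies)
{
    bool found = false;
    *dies = false;
    for (int k = 0; k < 2; k++)
        if (insn->src[k] == reg) {
            found = true;
            *dies = *dies || insn->src_dies[k];
        }
    return found;
}

/* For the copy at BB[COPY_IDX], try to make src die at the copy by
   using dst instead of src in the following insns.  Return true if
   the block was changed.  */
bool shorten_src_live_range(struct insn *bb, size_t n, size_t copy_idx)
{
    assert(copy_idx < n && bb[copy_idx].is_copy);
    int dst = bb[copy_idx].dst, src = bb[copy_idx].src[0];
    if (dst == src || bb[copy_idx].src_dies[0])
        return false;  /* Nothing to gain.  */

    /* Find the insn where src dies, with no write to dst or src
       between the copy and that insn.  */
    size_t death = n;
    for (size_t j = copy_idx + 1; j < n; j++) {
        bool dies;
        if (reads_reg(&bb[j], src, &dies) && dies) {
            death = j;
            break;
        }
        if (bb[j].dst == dst || bb[j].dst == src)
            return false;
    }
    if (death == n)
        return false;  /* src is still live at the end of the block.  */

    /* Replace src by dst up to the old death point.  Death flags in
       the rewritten range are cleared, which is conservative.  */
    for (size_t j = copy_idx + 1; j <= death; j++)
        for (int k = 0; k < 2; k++)
            if (bb[j].src[k] == src || bb[j].src[k] == dst) {
                bb[j].src[k] = dst;
                bb[j].src_dies[k] = false;
            }
    bb[copy_idx].src_dies[0] = true;  /* src now dies at the copy.  */
    return true;
}
```

With a block such as `r2 = r1; r3 = r1 + r4; r5 = r1 + r3 (r1 dies)`, the sketch rewrites the later reads of r1 to r2 and marks r1 as dying at the copy, so the two live ranges no longer overlap.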
Still, to achieve the same code performance without the regmove pass, I
needed to improve the code in IRA that implicitly replaces the removed
regmove transformations:
o improving hard reg preferences. As you know, RTL code can contain
explicit hard regs. Currently, a hard register in RTL code affects
the costs of hard regs only for the pseudos involved in the hard
register occurrence. E.g.
[ ... ]
This sounds good and generally useful.
o improving preference propagation of hard registers occurring in RTL
and assigned to connected pseudos. Let us look at the situation:
p1 - p2 - p3, where '-' means a copy
and we are assigning p1, p2, p3 in that order.
When we've just assigned hr1 to p1, we propagate the hr1 preference
to p2 and p3. When we assign p2, we cannot assign hr1 for
some reason and have to assign hr2. In the current preference
implementation, p3 still has the hr1 preference, which is wrong.
I implemented undoing preference propagation for such situations.
Ouch. Again, sounds good to get this fixed.
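The p1 - p2 - p3 situation above can be illustrated with a small sketch in C. This is only a toy model under stated assumptions, not IRA's implementation: the `struct pseudo` layout, the halving of the bonus at each copy, and the names `propagate_pref`, `undo_pref` and `assign_pseudo` are all hypothetical, and the pseudos are assumed to form a single chain that is assigned in order.

```c
#include <assert.h>

#define NUM_HARD_REGS 2
#define UNASSIGNED (-1)
#define NO_PREF (-1)

/* One pseudo in a copy chain p[0] - p[1] - ... (hypothetical layout,
   not IRA's allocno structure).  */
struct pseudo {
    int cost[NUM_HARD_REGS]; /* Lower cost means more preferred.  */
    int assigned;            /* Hard register number or UNASSIGNED.  */
    int pref_reg;            /* Hard reg propagated to us, or NO_PREF.  */
    int pref_bonus;          /* Total cost reduction applied for it.  */
};

/* Push a preference for HARD_REG from pseudo FROM down the chain,
   halving the bonus at each copy and recording what was applied.  */
static void propagate_pref(struct pseudo *p, int n, int from,
                           int hard_reg, int bonus)
{
    for (int i = from + 1;
         i < n && p[i].assigned == UNASSIGNED && bonus > 0; i++) {
        p[i].cost[hard_reg] -= bonus;
        if (p[i].pref_reg == hard_reg)
            p[i].pref_bonus += bonus;
        else {
            p[i].pref_reg = hard_reg;
            p[i].pref_bonus = bonus;
        }
        bonus /= 2;
    }
}

/* Undo the recorded propagation on the unassigned pseudos after FROM.  */
static void undo_pref(struct pseudo *p, int n, int from)
{
    for (int i = from + 1;
         i < n && p[i].assigned == UNASSIGNED
         && p[i].pref_reg != NO_PREF; i++) {
        p[i].cost[p[i].pref_reg] += p[i].pref_bonus;
        p[i].pref_reg = NO_PREF;
        p[i].pref_bonus = 0;
    }
}

/* Assign HARD_REG to pseudo I.  If a different register had been
   propagated through I, first take that stale preference back from
   the pseudos further down the chain.  */
void assign_pseudo(struct pseudo *p, int n, int i, int hard_reg, int bonus)
{
    assert(i >= 0 && i < n);
    assert(hard_reg >= 0 && hard_reg < NUM_HARD_REGS);
    if (p[i].pref_reg != NO_PREF && p[i].pref_reg != hard_reg)
        undo_pref(p, n, i);
    p[i].assigned = hard_reg;
    propagate_pref(p, n, i, hard_reg, bonus);
}
```

In this model, assigning hr1 to p1 lowers the hr1 cost for p2 and p3; when p2 then ends up in hr2, the hr1 reduction on p3 is taken back before the hr2 preference is propagated, so p3 no longer carries a preference that nothing justifies.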
o Currently IRA generates copies too aggressively for operands that
might be matched, so I rewrote this code to generate copies more
accurately.
Also good :-)
The changes in the testsuites are necessary as IRA/LRA now generate
different code (more accurately, better code, with register
shuffle moves removed in each case).
Excellent.
So this patch removes a lot of code, decreases compilation time
(e.g. valgrind lackey reports about 0.4% fewer executed insns when
compiling GCC combine.i with -O2), generates code with about the same
performance (the best improvement I saw is a 0.5% SPEC2000
improvement on x86_64 in -O3 mode on a Haswell processor) and gives
about the same average code size for SPEC2000 (the differences are in
the hundredths of a percent range).
No concerns here.
It is a big change and I hope there are no serious objections to
this. If somebody has them, please express them or inform me.
Thanks, Vlad.
2013-10-28 Vladimir Makarov <vmaka...@redhat.com>
* regmove.c: Remove.
* tree-pass.h (make_pass_regmove): Remove.
* timevar.def (TV_REGMOVE): Remove.
* passes.def (pass_regmove): Remove.
* opts.c (default_options_table): Remove entry for regmove.
* doc/passes.texi: Remove regmove pass description.
* doc/invoke.texi (-foptimize-register-move, -fregmove): Remove
options.
(-fdump-rtl-regmove): Ditto.
* common.opt (foptimize-register-move, fregmove): Remove.
* Makefile.in (OBJS): Remove regmove.o.
* ira-int.h (struct ira_allocno_pref, ira_pref_t): New structure
and type.
(struct ira_allocno): New member allocno_prefs.
(ALLOCNO_PREFS): New macro.
(ira_prefs, ira_prefs_num): New external vars.
(ira_setup_alts, ira_get_dup_out_num, ira_debug_pref): New
prototypes.
(ira_debug_prefs, ira_debug_allocno_prefs, ira_create_pref):
Ditto.
(ira_add_allocno_pref, ira_remove_pref, ira_remove_allocno_prefs):
Ditto.
(ira_add_allocno_copy_to_list): Remove prototype.
(ira_swap_allocno_copy_ends_if_necessary): Ditto.
(ira_pref_iterator): New type.
(ira_pref_iter_init, ira_pref_iter_cond): New functions.
(FOR_EACH_PREF): New macro.
* ira.c (commutative_constraint_p): Move from ira-conflicts.c.
(ira_get_dup_out_num): Ditto. Rename from get_dup_num. Modify the
code.
(ira_setup_alts): New function.
(decrease_live_ranges_number): New function.
(ira): Call the above function.
* ira-build.c (ira_prefs, ira_prefs_num): New global vars.
(ira_create_allocno): Initialize allocno prefs.
(pref_pool, pref_vec): New static vars.
(initiate_prefs, find_allocno_pref, ira_create_pref): New
functions.
(add_allocno_pref_to_list, ira_add_allocno_pref, print_pref): Ditto.
(ira_debug_pref, print_prefs, ira_debug_prefs): Ditto.
(print_allocno_prefs, ira_debug_allocno_prefs, finish_pref): Ditto.
(ira_remove_pref, ira_remove_allocno_prefs, finish_prefs): Ditto.
(ira_add_allocno_copy_to_list): Make static. Rename to
add_allocno_copy_to_list.
(ira_swap_allocno_copy_ends_if_necessary): Make static. Rename to
swap_allocno_copy_ends_if_necessary.
(remove_unnecessary_allocnos, remove_low_level_allocnos): Call
ira_remove_allocno_prefs.
(ira_flattening): Ditto.
(ira_build): Call initiate_prefs, print_prefs.
(ira_destroy): Call finish_prefs.
* ira-color.c (struct update_cost_record): New.
(struct allocno_color_data): Add new member update_cost_records.
(update_cost_record_pool): New static var.
(init_update_cost_records, get_update_cost_record): New functions.
(free_update_cost_record_list, finish_update_cost_records): Ditto.
(struct update_cost_queue_elem): Add member from.
(initiate_cost_update): Call init_update_cost_records.
(finish_cost_update): Call finish_update_cost_records.
(queue_update_cost, get_next_update_cost): Add new param from.
(update_allocno_cost, update_costs_from_allocno): New functions.
(update_costs_from_prefs): Ditto.
(update_copy_costs): Rename to update_costs_from_copies.
(restore_costs_from_copies): New function.
(update_conflict_hard_regno_costs): Don't go back.
(assign_hard_reg): Call restore_costs_from_copies. Add printing
more debug info.
(pop_allocnos): Add printing more debug info.
(color_allocnos): Remove prefs for conflicting hard regs.
Call update_costs_from_prefs.
* ira-conflicts.c (commutative_constraint_p): Move to ira.c.
(get_dup_num): Rename, modify, and move to ira.c.
(process_regs_for_copy): Add prefs.
(add_insn_allocno_copies): Put src as first arg of
process_regs_for_copy. Remove dead code. Call ira_setup_alts.
* ira-costs.c (record_reg_classes): Modify and move code into
record_operands_costs.
(find_costs_and_classes): Create prefs for the hard reg of small
reg class.
I can't think of any reason not to move forward. There may be some
fallout (can anyone say SH?), but getting the bits in sooner rather than
later means we get a head start on addressing those issues.
Jeff