Hi,

Basic block reordering (bbro) has been disabled for -Os since GCC 4.7 because
the pass causes a large code size regression. But benchmark logs also show
many regressions caused by poor code layout compared with 4.6.

This patch enables bbro for -Os. When optimizing for size, it
* avoids duplicating blocks.
* keeps the original order if there is no chance to fall through.
* ignores edge frequency and probability.
* handles a predecessor first if its index is smaller, to break long traces.
* only connects trace n with trace n + 1, to reduce long jumps.
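
To illustrate two of the rules above (prefer the smaller block index while
ignoring probability, and only connect trace n with trace n + 1 on a fall
through), here is a minimal standalone sketch. It is not code from the patch
and does not use GCC's data structures: struct toy_edge, struct toy_trace and
both functions are hypothetical names, and comparing the destination block's
index is an assumption made for the example.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified stand-in for a CFG edge.  This is not GCC's
   edge structure; the field names are assumptions for illustration.  */
struct toy_edge
{
  int dest_index;   /* Index of the destination block.  */
  int probability;  /* Present only to show that it is ignored.  */
};

/* Return the position in EDGES of the successor edge preferred when
   optimizing for size: the one whose destination has the smallest
   block index.  Probability is deliberately not consulted.
   Return -1 if N is zero.  */
static int
toy_pick_successor_for_size (const struct toy_edge *edges, size_t n)
{
  int best = -1;

  for (size_t i = 0; i < n; i++)
    if (best < 0 || edges[i].dest_index < edges[best].dest_index)
      best = (int) i;

  return best;
}

/* Hypothetical trace: a run of blocks laid out consecutively.  */
struct toy_trace
{
  int first_index;      /* Index of the first block in the trace.  */
  int last_index;       /* Index of the last block in the trace.  */
  int fallthru_target;  /* Block the last block falls through to,
                           or -1 if it has no fall-through edge.  */
};

/* Return true if trace N may be connected to trace N + 1, that is, if
   the last block of trace N falls through to the first block of
   trace N + 1.  No other pairing is considered, so the original order
   is kept when there is no fall through.  */
static bool
toy_connect_next_trace_p (const struct toy_trace *traces, size_t count,
                          size_t n)
{
  if (n + 1 >= count)
    return false;

  return traces[n].fallthru_target == traces[n + 1].first_index;
}
```

In this sketch, an edge to block 2 with probability 10 is preferred over an
edge to block 5 with probability 90, since only the index is compared.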

Here are the CSiBE code size benchmark results:
* For ARM, code size is reduced by 0.21%.
* For MIPS, code size is reduced by 0.25%.
* For PPC, code size is reduced by 0.33%.
* For X86, code size is reduced by 0.22%.

The patch does not affect bbro when optimizing for speed. To verify this, I
ran "objdump -d" on all object files from CSiBE (compiled with -O2) for
ARM/MIPS/PPC/X86. The disassembly with the patch is identical to the
disassembly without it.

No "make check" regressions on ARM.

Is it OK for trunk?

Thanks!
-Zhenqiang

ChangeLog
2012-08-14  Zhenqiang Chen <zhenqiang.c...@arm.com>

        * bb-reorder.c (connect_better_edge_p): New function.
        (find_traces_1_round): When optimizing for size, ignore edge
        frequency and probability, and handle all in one round.
        (bb_to_key): Use bb->index as key for size.
        (better_edge_p): The smaller bb index is better for size.
        (connect_traces): Connect block n with block n + 1;
        connect trace m with trace m + 1 if falling through.
        (copy_bb_p): Avoid duplicating blocks.
        (gate_handle_reorder_blocks): Enable bbro when optimizing for -Os.

Attachment: Enable-bbro-for-size.patch