On Tue, Nov 19, 2013 at 7:44 AM, Jan Hubicka <hubi...@ucw.cz> wrote:
> Martin,
> can you, please, generate the updated systemtap with
> -freorder-blocks-and-partition enabled?
>
> I am in favour of enabling this - it is usefull pass and it is pointless ot
> have passes that are not enabled by default.
> Is there reason why this would not work on other ELF target? Is it working
> with Darwin and Windows?

I don't know how to test these (I don't see any machines listed in the
gcc compile farm of those types). For Windows, I assume you mean
MinGW, which should be enabled as it is under i386. Should I disable
it there and for Darwin?

>
>> This patch enables -freorder-blocks-and-partition by default for x86
>> at -O2 and up. It is showing some modest gains in cpu2006 performance
>> with profile feedback and -O2 on an Intel Westmere system. Specifically,
>> I am seeing consistent improvements in 401.bzip2 (1.5-3%), 483.xalancbmk
>> (1.5-3%), and 453.povray (2.5-3%), and no apparent regressions.
>
> This actually sounds very good ;)
>
> Lets see how the systemtap graphs goes.  If we will end up with problem
> of too many accesses to cold section, I would suggest making cold section
> subdivided into .unlikely and .unlikely.part (we could have better name)
> with the second consisting only of unlikely parts of hot&normal functions.
>
> This should reduce the problems we are seeing with mistakely identifying
> code to be cold because of roundoff errors (and it probably makes sense
> in general, too).
> We will however need to update gold and ld for that.

Note that I don't think this would help much unless the linker is
changed to move the cold split section close to the hot section. There
is probably some fine-tuning we could do eventually in the linker
under -ffunction-sections without putting the split portions in a
separate section. I.e. clump the split parts together within unlikely.
But hopefully this can all be done later on as follow-on work to boost
the performance further.

>>
>> Bootstrapped and tested on x86-64-unknown-linux-gnu with a normal
>> bootstrap, a profiledbootstrap and an LTO profiledbootstrap. All were
>> configured with --enable-languages=all,obj-c++ and tested for both
>> 32 and 64-bit with RUNTESTFLAGS="--target_board=unix\{-m32,-m64\}".
>>
>> It would be good to enable this for additional targets as a follow on,
>> but it needs more testing for both correctness and performance on those
>> other targets (i.e for correctness because I see a number of places
>> in other config/*/*.c files that do some special handling under this
>> option for different targets or simply disable it, so I am not sure
>> how well-tested it is under different architectural constraints).
>>
>> Ok for trunk?
>>
>> Thanks,
>> Teresa
>>
>> 2013-11-19  Teresa Johnson  <tejohn...@google.com>
>>
>>         * common/config/i386/i386-common.c: Enable
>>         -freorder-blocks-and-partition at -O2 and up for x86.
>>         * opts.c (finish_options): Only warn if -freorder-blocks-and-
>>         partition was set on command line.
>
> You probably mis doc/invoke.texi update.
> Thank you for working on this!

Yes, thanks. Here is the patch with the invoke.texi update.

Teresa


2013-11-19  Teresa Johnson  <tejohn...@google.com>

        * common/config/i386/i386-common.c: Enable
        -freorder-blocks-and-partition at -O2 and up for x86.
        * doc/invoke.texi: Update -freorder-blocks-and-partition default.
        * opts.c (finish_options): Only warn if -freorder-blocks-and-
        partition was set on command line.

Index: common/config/i386/i386-common.c
===================================================================
--- common/config/i386/i386-common.c    (revision 205001)
+++ common/config/i386/i386-common.c    (working copy)
@@ -789,6 +789,8 @@ static const struct default_options ix86_option_op
   {
     /* Enable redundant extension instructions removal at -O2 and higher.  */
     { OPT_LEVELS_2_PLUS, OPT_free, NULL, 1 },
+    /* Enable function splitting at -O2 and higher.  */
+    { OPT_LEVELS_2_PLUS, OPT_freorder_blocks_and_partition, NULL, 1 },
     /* Turn off -fschedule-insns by default.  It tends to make the
        problem with not enough registers even worse.  */
     { OPT_LEVELS_ALL, OPT_fschedule_insns, NULL, 0 },
Index: doc/invoke.texi
===================================================================
--- doc/invoke.texi     (revision 205001)
+++ doc/invoke.texi     (working copy)
@@ -8172,6 +8172,8 @@ exception handling, for linkonce sections, for fun
 section attribute and on any architecture that does not support named
 sections.

+Enabled for x86 at levels @option{-O2}, @option{-O3}.
+
 @item -freorder-functions
 @opindex freorder-functions
 Reorder functions in the object file in order to
Index: opts.c
===================================================================
--- opts.c      (revision 205001)
+++ opts.c      (working copy)
@@ -737,9 +737,10 @@ finish_options (struct gcc_options *opts, struct g
       && opts->x_flag_reorder_blocks_and_partition
       && (ui_except == UI_SJLJ || ui_except >= UI_TARGET))
     {
-      inform (loc,
-             "-freorder-blocks-and-partition does not work "
-             "with exceptions on this architecture");
+      if (opts_set->x_flag_reorder_blocks_and_partition)
+        inform (loc,
+                "-freorder-blocks-and-partition does not work "
+                "with exceptions on this architecture");
       opts->x_flag_reorder_blocks_and_partition = 0;
       opts->x_flag_reorder_blocks = 1;
     }
@@ -752,9 +753,10 @@ finish_options (struct gcc_options *opts, struct g
       && opts->x_flag_reorder_blocks_and_partition
       && (ui_except == UI_SJLJ || ui_except >= UI_TARGET))
     {
-      inform (loc,
-             "-freorder-blocks-and-partition does not support "
-             "unwind info on this architecture");
+      if (opts_set->x_flag_reorder_blocks_and_partition)
+        inform (loc,
+                "-freorder-blocks-and-partition does not support "
+                "unwind info on this architecture");
       opts->x_flag_reorder_blocks_and_partition = 0;
       opts->x_flag_reorder_blocks = 1;
     }
@@ -769,9 +771,10 @@ finish_options (struct gcc_options *opts, struct g
              && targetm_common.unwind_tables_default
              && (ui_except == UI_SJLJ || ui_except >= UI_TARGET))))
     {
-      inform (loc,
-             "-freorder-blocks-and-partition does not work "
-             "on this architecture");
+      if (opts_set->x_flag_reorder_blocks_and_partition)
+        inform (loc,
+                "-freorder-blocks-and-partition does not work "
+                "on this architecture");
       opts->x_flag_reorder_blocks_and_partition = 0;
       opts->x_flag_reorder_blocks = 1;
     }


-- 
Teresa Johnson | Software Engineer | tejohn...@google.com | 408-460-2413

Reply via email to