On Thu, Oct 3, 2013 at 6:41 AM, Teresa Johnson wrote:
> On Wed, Oct 2, 2013 at 9:19 AM, Jan Hubicka wrote:
>>> 2013-09-29 Teresa Johnson
>>>
>>> * bb-reorder.c
>>> (find_rarely_executed_basic_blocks_and_crossing_edges):
>>> Treat profile insanities conservatively.
>>>
On Wed, Oct 2, 2013 at 9:19 AM, Jan Hubicka wrote:
>> 2013-09-29 Teresa Johnson
>>
>> * bb-reorder.c
>> (find_rarely_executed_basic_blocks_and_crossing_edges):
>> Treat profile insanities conservatively.
>> * predict.c (probably_never_executed): New function. Treat prof
> But why do we want to consider blocks as "probably never executed"
> when the frequency suggests they are sometimes executed?
Well, probably never executed is mean to reffer to one run. If you have
something like code handling fatal errors, you probably still want to have it
in cold secion even
On Wed, Oct 2, 2013 at 9:19 AM, Jan Hubicka wrote:
>> 2013-09-29 Teresa Johnson
>>
>> * bb-reorder.c
>> (find_rarely_executed_basic_blocks_and_crossing_edges):
>> Treat profile insanities conservatively.
>> * predict.c (probably_never_executed): New function. Treat prof
> 2013-09-29 Teresa Johnson
>
> * bb-reorder.c (find_rarely_executed_basic_blocks_and_crossing_edges):
> Treat profile insanities conservatively.
> * predict.c (probably_never_executed): New function. Treat profile
> insanities conservatively.
> (probably
On Tue, Sep 24, 2013 at 11:25 AM, Teresa Johnson wrote:
> On Tue, Sep 24, 2013 at 10:57 AM, Jan Hubicka wrote:
>>>
>>> I looked at one that failed after 100 as well (20031204-1.c). In this
>>> case, it was due to expansion which was creating multiple branches/bbs
>>> from a logical OR and guessin
On Fri, Sep 27, 2013 at 7:15 AM, Teresa Johnson wrote:
> On Thu, Sep 26, 2013 at 3:02 PM, Jan Hubicka wrote:
>>>
>>> Why not just have probably_never_executed_bb_p return simply return
>>> false bb->frequency is non-zero (right now it does the opposite -
>>
>> We want to have frequencies guessed
On Thu, Sep 26, 2013 at 3:02 PM, Jan Hubicka wrote:
>>
>> Why not just have probably_never_executed_bb_p return simply return
>> false bb->frequency is non-zero (right now it does the opposite -
>
> We want to have frequencies guessed for functions that was not trained
> in the profiling run (that
On Thu, Sep 26, 2013 at 2:54 PM, Jan Hubicka wrote:
>> > As for COMDAT merging, i would like to see the patch. I am experimenting
>> > now with a patch to also privatize COMDATs during -fprofile-generate to
>> > avoid problems with lost profiles mentioned above.
>> >
>>
>> Do you mean you privati
>
> Why not just have probably_never_executed_bb_p return simply return
> false bb->frequency is non-zero (right now it does the opposite -
We want to have frequencies guessed for functions that was not trained
in the profiling run (that was patch I posted earlier that I think did not
go in, yet)
> > As for COMDAT merging, i would like to see the patch. I am experimenting
> > now with a patch to also privatize COMDATs during -fprofile-generate to
> > avoid problems with lost profiles mentioned above.
> >
>
> Do you mean you privatize every COMDAT function in the profile-generate?
> We dis
On Tue, Sep 24, 2013 at 5:31 AM, Jan Hubicka wrote:
>> Hi Honza,
>>
>> I am finally getting back to working on this after a few weeks of
>> working on some other priorities.
>
> I am also trying to return to this, so good timming ;)
> Martin has got smaller C++ programs (Inkscape) to not touch col
On Wed, Sep 25, 2013 at 2:33 PM, Teresa Johnson wrote:
> On Tue, Sep 24, 2013 at 11:25 AM, Teresa Johnson wrote:
>> On Tue, Sep 24, 2013 at 10:57 AM, Jan Hubicka wrote:
I looked at one that failed after 100 as well (20031204-1.c). In this
case, it was due to expansion which was cr
On Tue, Sep 24, 2013 at 11:25 AM, Teresa Johnson wrote:
> On Tue, Sep 24, 2013 at 10:57 AM, Jan Hubicka wrote:
>>>
>>> I looked at one that failed after 100 as well (20031204-1.c). In this
>>> case, it was due to expansion which was creating multiple branches/bbs
>>> from a logical OR and guessin
Rong - can you answer the questions below on the comdat patch?
On Tue, Sep 24, 2013 at 5:31 AM, Jan Hubicka wrote:
>> Hi Honza,
>>
>> I am finally getting back to working on this after a few weeks of
>> working on some other priorities.
>
> I am also trying to return to this, so good timming ;)
On Tue, Sep 24, 2013 at 10:57 AM, Jan Hubicka wrote:
>>
>> I looked at one that failed after 100 as well (20031204-1.c). In this
>> case, it was due to expansion which was creating multiple branches/bbs
>> from a logical OR and guessing incorrectly on how to assign the
>> counts:
>>
>> if (octets
>
> I looked at one that failed after 100 as well (20031204-1.c). In this
> case, it was due to expansion which was creating multiple branches/bbs
> from a logical OR and guessing incorrectly on how to assign the
> counts:
>
> if (octets == 4 && (*cp == ':' || *cp == '\0')) {
>
> The (*cp == ':
> Hi Honza,
>
> I am finally getting back to working on this after a few weeks of
> working on some other priorities.
I am also trying to return to this, so good timming ;)
Martin has got smaller C++ programs (Inkscape) to not touch cold segment
during the startup with FDO (w/o partitioning). Fir
Hi Honza,
I am finally getting back to working on this after a few weeks of
working on some other priorities.
On Sat, Aug 31, 2013 at 2:46 PM, Jan Hubicka wrote:
> Hi,
> I run my script on execute testsuite and looked into few testcases. The
> problem I found
> was roundoff errors - i.e. when e
Hi,
I run my script on execute testsuite and looked into few testcases. The problem
I found
was roundoff errors - i.e. when expanding switch we set 50% chage that out of
bound
value is above or bellow. Both gets rounded to 0, because switch is executed
once
and the value is bellow.
Partly this
Hi,
With Martin we made script for testing the profiling failures.
First do
ld --verbose >~/script
then apply
--- /home/jh/script22013-08-31 17:59:11.0 +0200
+++ /home/jh/script 2013-08-31 17:39:40.0 +0200
@@ -1,12 +1,3 @@
-GNU ld (GNU Binutils for Debian) 2.20.1-system.2
On Fri, Aug 30, 2013 at 7:50 AM, Teresa Johnson wrote:
> Can someone review and ok the attached patch for trunk? It has been
> bootstrapped and tested on x86-64-unknown-linux-gnu, and tested by
> enabling -freorder-blocks-and-partition enabled for a
> profiledbootstrap as well.
>
> (Honza, see mor
On Fri, Aug 30, 2013 at 8:26 AM, Jan Hubicka wrote:
>> >
>> > The frequency condition needs to be done only when you walk predecestors -
>> > when
>> > you walk down the edge probabilities are just fine.
>>
>> True. For simplicity I think it should be fine to leave as-is so there
>> isn't more sp
On Fri, Aug 30, 2013 at 2:31 AM, Jan Hubicka wrote:
>> > The esitmated profile is already there before reading the profile in, so we
>> > only do not want to overwrite it. Does the following work for you?
>>
>> It does get the estimated frequencies on the bbs.
>
> Good.
I ended up making some sl
> >
> > The frequency condition needs to be done only when you walk predecestors -
> > when
> > you walk down the edge probabilities are just fine.
>
> True. For simplicity I think it should be fine to leave as-is so there
> isn't more special casing as the current approach works in both
> direct
Can someone review and ok the attached patch for trunk? It has been
bootstrapped and tested on x86-64-unknown-linux-gnu, and tested by
enabling -freorder-blocks-and-partition enabled for a
profiledbootstrap as well.
(Honza, see more responses inlined below. Rong, please see note below as well).
T
> > The esitmated profile is already there before reading the profile in, so we
> > only do not want to overwrite it. Does the following work for you?
>
> It does get the estimated frequencies on the bbs.
Good.
> > We wil also need to solve problem that in this case cgraph edges will have
> >
> Great! Is this the LTO merging you were talking about in an earlier
> message, or the gcov runtime fix (that would presumably not be
> lto-specific)?
It is the LTO path - we need to merge profiles there anyway for his code
unification
work.
> > I have patch to track this. Moreover vforks seem
On Wed, Aug 28, 2013 at 11:20 AM, Teresa Johnson wrote:
> On Wed, Aug 28, 2013 at 9:58 AM, Jan Hubicka wrote:
>> Hi,
>> with Martin we did bit of progress on analyzing the problems. We now have
>> COMDAT profile merging for FDO
>
> Great! Is this the LTO merging you were talking about in an ear
On Wed, Aug 28, 2013 at 9:58 AM, Jan Hubicka wrote:
> Hi,
> with Martin we did bit of progress on analyzing the problems. We now have
> COMDAT profile merging for FDO
Great! Is this the LTO merging you were talking about in an earlier
message, or the gcov runtime fix (that would presumably not
Hi,
with Martin we did bit of progress on analyzing the problems. We now have
COMDAT profile merging for FDO and we also noticed that forks can make your
basic blocks appear never executed even though they are executed every run:
the fork is accounted as 3 independnet runs of the program. First r
On Wed, Aug 21, 2013 at 8:25 AM, Jan Hubicka wrote:
>> >
>> > Because offline COMDAT functoin will be porduced for every COMDAT used, I
>> > think
>> > it is bad to porduce any COMDAT (or any reachable function via calls with
>> > non-0
>> > count) that has empty profile (either because it got l
On Mon, Aug 19, 2013 at 10:47 AM, Teresa Johnson wrote:
> On Mon, Aug 19, 2013 at 8:09 AM, Jan Hubicka wrote:
>>> Remember it isn't using dominance anymore. The latest patch was
>>> instead ensuring the most frequent path between hot blocks and the
>>> entry/exit are marked hot. That should be be
> >
> > Because offline COMDAT functoin will be porduced for every COMDAT used, I
> > think
> > it is bad to porduce any COMDAT (or any reachable function via calls with
> > non-0
> > count) that has empty profile (either because it got lost by COMDAT merging
> > or because of reading mismatch).
Dear Teresa,
On 19 August 2013 19:47, Teresa Johnson wrote:
> On Mon, Aug 19, 2013 at 8:09 AM, Jan Hubicka wrote:
>>> Remember it isn't using dominance anymore. The latest patch was
>>> instead ensuring the most frequent path between hot blocks and the
>>> entry/exit are marked hot. That should
On Mon, Aug 19, 2013 at 8:09 AM, Jan Hubicka wrote:
>> Remember it isn't using dominance anymore. The latest patch was
>> instead ensuring the most frequent path between hot blocks and the
>> entry/exit are marked hot. That should be better than the dominance
>> approach used in the earlier versio
On Sat, Aug 17, 2013 at 1:44 PM, Jan Hubicka wrote:
>>
>> patch for updating counts based on estimated frequencies to address
>> inlined comdats with 0 profile counts:
>>
>> 013-08-16 Teresa Johnson
>>
>> * tree-inline.c (copy_bb): Compute count based on frequency.
>> (copy_edge
> Remember it isn't using dominance anymore. The latest patch was
> instead ensuring the most frequent path between hot blocks and the
> entry/exit are marked hot. That should be better than the dominance
> approach used in the earlier version.
Indeed, that looks more resonable approach.
Can you p
On Sat, Aug 17, 2013 at 1:44 PM, Jan Hubicka wrote:
>>
>> I added both of these and ran into issues due to profile maintenance.
>> For example, there were non-zero blocks in the cold section because
>> pro_and_epilogue split a simple return block that was previously reach
>> by both hot and cold p
>
> I added both of these and ran into issues due to profile maintenance.
> For example, there were non-zero blocks in the cold section because
> pro_and_epilogue split a simple return block that was previously reach
> by both hot and cold paths. The new return block that was then only
> reached v
On Fri, Aug 9, 2013 at 2:02 PM, Teresa Johnson wrote:
> On Fri, Aug 9, 2013 at 8:28 AM, Jan Hubicka wrote:
>>> > Do we sanity check that the cold partition does not contain any blocks of
>>> > count 0? It may be that the profile is broken enough to make partitioning
>>> > not work.
>>>
>>> Do yo
Hi,
thinking about it a bit more, I suppose easiest way is to
1) make separate sets of counters for each comdat and place them
into comdat section named as DECL_COMDAT_GROUP (node) + cfg_checksum +
individual_counter_counts.
This will make linker to unify the sections for us.
2) extend API o
> Hello,
>I did a collection of systemtap graphs for GIMP.
>
> All these graphs were created with enabled LTO, profiling and -O2.
>
> 1) gimp-reordered.pdf - function are reordered according to my newly
> created profile that utilizes LTO infrastructure
> 2) gimp-no-top-level-reorder.pdf - (G
Cc'ing Rong since he is also working on trying to address the comdat
profile issue. Rong, you may need to see an earlier message for more
context:
http://gcc.gnu.org/ml/gcc-patches/2013-08/msg00558.html
Teresa
On Sun, Aug 11, 2013 at 5:21 AM, Jan Hubicka wrote:
>>
>> I see, yes LTO can deal with
>
> I see, yes LTO can deal with this better since it has global
> information. In non-LTO mode (including LIPO) we have the issue.
Either Martin or me will implement merging of the multiple copies at
LTO link time. This is needed for Martin's code unification patch anyway.
Theoretically gcov r
>
> I see, yes LTO can deal with this better since it has global
> information. In non-LTO mode (including LIPO) we have the issue.
Thinking about it, there is still one problem left: I usually suggest
users to train with -fno-lto to avoid excessive linking time with
instrumentation. This actual
On Fri, Aug 9, 2013 at 8:28 AM, Jan Hubicka wrote:
>> > Do we sanity check that the cold partition does not contain any blocks of
>> > count 0? It may be that the profile is broken enough to make partitioning
>> > not work.
>>
>> Do you mean sanity check that the cold partition does not contain a
On Fri, Aug 9, 2013 at 8:54 AM, Martin Liška wrote:
> Hi
>
> On 9 August 2013 17:28, Jan Hubicka wrote:
>>> > Do we sanity check that the cold partition does not contain any blocks of
>>> > count 0? It may be that the profile is broken enough to make partitioning
>>> > not work.
>>>
>>> Do you m
Hi
On 9 August 2013 17:28, Jan Hubicka wrote:
>> > Do we sanity check that the cold partition does not contain any blocks of
>> > count 0? It may be that the profile is broken enough to make partitioning
>> > not work.
>>
>> Do you mean sanity check that the cold partition does not contain any
>
> > Do we sanity check that the cold partition does not contain any blocks of
> > count 0? It may be that the profile is broken enough to make partitioning
> > not work.
>
> Do you mean sanity check that the cold partition does not contain any
> blocks of count > 0? (they should all be zero) I do
On Fri, Aug 9, 2013 at 2:58 AM, Jan Hubicka wrote:
>> On Thu, Aug 8, 2013 at 3:23 PM, Jan Hubicka wrote:
>> > Hi,
>> > Martin Liska was kind enough to generate disk seeking graph of gimp
>> > statrup with his function reordering.
>> > His code simply measures time of firest execution of a functi
> On Thu, Aug 8, 2013 at 3:23 PM, Jan Hubicka wrote:
> > Hi,
> > Martin Liska was kind enough to generate disk seeking graph of gimp statrup
> > with his function reordering.
> > His code simply measures time of firest execution of a function and orders
> > functions in the given order.
> > The
On Thu, Aug 8, 2013 at 3:23 PM, Jan Hubicka wrote:
> Hi,
> Martin Liska was kind enough to generate disk seeking graph of gimp statrup
> with his function reordering.
> His code simply measures time of firest execution of a function and orders
> functions in the given order.
> The functions stay
On Mon, Aug 5, 2013 at 7:57 AM, Teresa Johnson wrote:
> On Mon, Aug 5, 2013 at 7:11 AM, Jan Hubicka wrote:
>> The patch looks OK to me in general (I can not approve it).
>> Still have one question...
>>> +
>>> +/* Ensure that no cold bbs dominate hot bbs along the dominance or
>>> + post-domina
On Mon, Aug 5, 2013 at 7:11 AM, Jan Hubicka wrote:
> The patch looks OK to me in general (I can not approve it).
> Still have one question...
>> +
>> +/* Ensure that no cold bbs dominate hot bbs along the dominance or
>> + post-dominance DIR, for example as a result of edge weight insanities.
>>
The patch looks OK to me in general (I can not approve it).
Still have one question...
> +
> +/* Ensure that no cold bbs dominate hot bbs along the dominance or
> + post-dominance DIR, for example as a result of edge weight insanities.
> + Returns the updated value of COLD_BB_COUNT and adds new
On Fri, Aug 2, 2013 at 9:48 PM, Teresa Johnson wrote:
> On Fri, Aug 2, 2013 at 8:05 AM, Jan Hubicka wrote:
>>>
>>> 2013-08-01 Teresa Johnson
>>> Steven Bosscher
>>>
>>> * cfgrtl.c (fixup_bb_partition): New routine.
>>> (commit_edge_insertions): Invoke fixup_partit
On Fri, Aug 2, 2013 at 4:04 PM, Steven Bosscher wrote:
> On Fri, Aug 2, 2013 at 5:05 PM, Jan Hubicka wrote:
>>> +/* Called when block BB has been reassigned to a different partition,
>>> + to ensure that the region crossing attributes are updated. */
>>> +
>>> +static void
>>> +fixup_bb_partit
On Fri, Aug 2, 2013 at 8:05 AM, Jan Hubicka wrote:
>>
>> 2013-08-01 Teresa Johnson
>> Steven Bosscher
>>
>> * cfgrtl.c (fixup_bb_partition): New routine.
>> (commit_edge_insertions): Invoke fixup_partitions.
>> (find_partition_fixes): New routine.
>>
On Fri, Aug 2, 2013 at 5:05 PM, Jan Hubicka wrote:
>> +/* Called when block BB has been reassigned to a different partition,
>> + to ensure that the region crossing attributes are updated. */
>> +
>> +static void
>> +fixup_bb_partition (basic_block bb)
>> +{
>> + edge e;
>> + edge_iterator ei
>
> 2013-08-01 Teresa Johnson
> Steven Bosscher
>
> * cfgrtl.c (fixup_bb_partition): New routine.
> (commit_edge_insertions): Invoke fixup_partitions.
> (find_partition_fixes): New routine.
> (fixup_partitions): Ditto.
> (verify_hot_cold_bl
On Fri, Aug 2, 2013 at 4:22 AM, Bernhard Reutner-Fischer
wrote:
> On 1 August 2013 18:32, Teresa Johnson wrote:
>> Patch 3 of 3 split out from the patch I sent in May that fixes problems with
>> -freorder-blocks-and-partition, with changes/fixes discussed in that thread.
>>
>> See http://gcc.gnu.
On 1 August 2013 18:32, Teresa Johnson wrote:
> Patch 3 of 3 split out from the patch I sent in May that fixes problems with
> -freorder-blocks-and-partition, with changes/fixes discussed in that thread.
>
> See http://gcc.gnu.org/ml/gcc-patches/2013-05/threads.html#00388 for context.
>
> This pat
Patch 3 of 3 split out from the patch I sent in May that fixes problems with
-freorder-blocks-and-partition, with changes/fixes discussed in that thread.
See http://gcc.gnu.org/ml/gcc-patches/2013-05/threads.html#00388 for context.
This patch sanitizes the partitioning to address issues such as e
64 matches
Mail list logo