Bugzilla 41004 calls for a more -Os-friendly algorithm for BB Reorder,
and I'm hoping I can meet this challenge.  But I need your help!

If you have any existing ideas or thoughts that could help me get
closer to a sensible heuristic sooner, then please post them to this
list.

In the meantime, here's my progress so far:

My first thought was to limit BB Reorder to hot functions, as identified
by gcov, so we could get maximum execution time benefit for minimised
code size impact.  Based on benchmarking I've done so far, this looks to
be a winner, but it only works when profile data is available.

Without profile data, we face two main problems:

1) We cannot easily estimate how many times we will execute our more
efficient code-layout, so we can't do an accurate trade-off versus the
code size increase.

2) The traces that BB Reorder constructs will be based on static branch
prediction, rather than true dynamic flows of execution, so the new
layout may not be the best one in practice.

We can address #1 by tagging functions as hot (using attributes), but
that may not always be possible and it does not guarantee that we will
get minimal codesize increases, which is the main aim of this work.
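
For reference, tagging a function by hand looks roughly like this.  The
hot attribute is an existing GCC function attribute; the function
hot_sum is made up purely for illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Illustrative only: hot_sum is a made-up function.  The "hot"
   attribute tells GCC that this function is a hot spot, so it is
   optimised more aggressively for speed.  */
__attribute__ ((hot)) long
hot_sum (const int *v, size_t n)
{
  long total = 0;

  /* A null pointer is only acceptable for an empty array.  */
  assert (v != NULL || n == 0);

  for (size_t i = 0; i < n; i++)
    total += v[i];
  return total;
}
```

The catch is the one noted above: someone has to know which functions
are hot and be able to edit the source to say so.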

I'm not sure how #2 can be addressed, so I'm planning to sidestep it
completely, since the problem isn't really the performance pay-off but
the codesize increase that usually comes with each new layout of a
function that BB Reorder makes.

My current plan is to characterise a function within find_traces()
(looking at things like the number of traces, edge probabilities and
frequencies, etc.) and only call connect_traces() to effect the
reordering change if these characteristics suggest minimal code
disruption and/or maximum performance pay-off.
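
One way such a gate could look, as a rough sketch only: the struct, its
fields, the function name and both thresholds below are hypothetical and
do not exist in GCC.  How the size and gain estimates would be derived
from the traces is exactly the open question.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical summary of a candidate layout for one function.
   None of these fields exist in GCC.  */
struct layout_estimate
{
  int orig_size;           /* Size of the function in its current order.  */
  int est_new_size;        /* Estimated size after reordering.  */
  long fallthru_freq_gain; /* Estimated extra edge frequency that would
                              become fall-through after reordering.  */
};

/* Return true if the estimated pay-off justifies the estimated growth.
   MAX_GROWTH_PERCENT and MIN_GAIN_PER_UNIT are assumed tuning knobs.  */
bool
reorder_is_worthwhile (const struct layout_estimate *e,
                       int max_growth_percent, long min_gain_per_unit)
{
  assert (e != NULL && e->orig_size > 0);

  long growth = (long) e->est_new_size - e->orig_size;

  /* No size increase: accept as long as there is some benefit.  */
  if (growth <= 0)
    return e->fallthru_freq_gain > 0;

  /* Reject layouts that grow the function beyond the allowed share.  */
  if (growth * 100 > (long) max_growth_percent * e->orig_size)
    return false;

  /* Otherwise require a minimum gain for each unit of growth.  */
  return e->fallthru_freq_gain >= min_gain_per_unit * growth;
}
```

With a gate of this shape, connect_traces() would only be called when
the function returns true, and the original block order would be kept
otherwise.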

Thanks for reading and I look forward to your input!

Cheers,
Ian


