On 12/1/19 11:37 PM, Jan Hubicka wrote:
Hi,
I was playing with it a bit more and built with
-fno-profile-reorder-functions.

Here is -fno-profile-reorder-functions compared to the first run
https://treeherder.mozilla.org/perf.html#/compare?originalProject=try&originalRevision=3d537be0cb37458e7928f69a37efb2a6d6b85eae&newProject=try&newRevision=4543abfa08870391544b56d16dfcd530dac0dc30&framework=1
The 2.2% improvement in page rendering is outside the noise, but I would
have hoped for a bit more.

Here is -fno-profile-reorder-functions compared to clustering
https://treeherder.mozilla.org/perf.html#/compare?originalProject=try&originalRevision=3d537be0cb37458e7928f69a37efb2a6d6b85eae&newProject=try&newRevision=1c2d53b10b042aaaac15edbe7bd26e2740641840&framework=1
I am not sure whether it is just noise.

Either Firefox's talos is not a good benchmark for code layout (which
would surprise me, since it is quite sensitive to code size issues) or
there are some problems.

In general I think the patch is useful and mostly mainline ready except
for details, but it would be good to have some more evidence that it
works as expected on large binaries besides tramp3d build time.  There
are a number of ways things may go wrong, ranging from misupdated
profiles to problems with function splitting, comdats and other issues.

Based on my testing, I was able to see that the cc1plus binary is really
sorted in the order seen by the pass, including various IPA clones that
inherited their order from their origins.


Were you able to benchmark some other benefits?

Unfortunately not.

I remember we discussed
collecting traces from valgrind; perhaps we could check that they
look good?

I have some semi-working icegrind port here:
https://github.com/marxin/valgrind/tree/icegrind

It will probably need some extra work and it's terribly slow. But you can try
it.

I would first wait for the Firefox dump files and then we can discuss what
to do with the pass.

Martin


Honza

