Re: [PATCH][RFC] Add new ipa-reorder pass

2019-12-11 Thread Andi Kleen
Martin Liška writes: > > Notes and limitations: > - The call-chain-clustering algorithm requires to fit as many as possible > functions into page size (4K). > Based on my measurements that should correspond to ~1000 GIMPLE statements > (IPA inliner size). I can > make it a param in the
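The clustering idea quoted above (pack functions that call each other into one 4K page, estimated at about 1000 size units) can be illustrated with a small greedy sketch. This is not the code from the patch: the function name `cluster_functions`, the input shapes, and the merge policy (heaviest call edge first, skip a merge that would exceed the cap) are assumptions made for illustration only.

```python
def cluster_functions(sizes, edges, max_cluster_size=1000):
    """Greedy call-chain clustering sketch (hypothetical, not the GCC pass).

    sizes: dict mapping function name -> size estimate.
    edges: list of (caller, callee, call_count) tuples.
    Returns a list of clusters; each cluster is a list of names in layout order.
    """
    # Every function starts in its own cluster; merged functions share one list.
    clusters = {name: [name] for name in sizes}

    # Visit call edges from hottest to coldest.
    for caller, callee, _count in sorted(edges, key=lambda e: -e[2]):
        a, b = clusters[caller], clusters[callee]
        if a is b:
            continue  # already placed together
        total = sum(sizes[f] for f in a) + sum(sizes[f] for f in b)
        if total > max_cluster_size:
            continue  # merged cluster would not fit the assumed page budget
        a.extend(b)  # append the callee's cluster after the caller's
        for f in b:
            clusters[f] = a

    # Collect each distinct cluster once, in first-seen order.
    seen, result = set(), []
    for name in sizes:
        c = clusters[name]
        if id(c) not in seen:
            seen.add(id(c))
            result.append(c)
    return result
```

With the default cap, a hot `parse` -> `lex` edge is merged first, and a cold callee that would push the cluster past the cap stays in its own cluster.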

Re: [PATCH][RFC] Add new ipa-reorder pass

2019-12-10 Thread Jan Hubicka
> * Makefile.in: Add ipa-reorder.o. > * cgraph.c (cgraph_node::dump): Dump > text_sorted_order. > (cgraph_node_cmp_by_text_sorted): > New function that sorts functions based > on text_sorted_order. > * cgraph.h (cgraph_node): Add text_sorted_order. >

Re: [PATCH][RFC] Add new ipa-reorder pass

2019-12-09 Thread Martin Liška
Hi. I've just updated the script a bit and I also added an address histogram: https://drive.google.com/file/d/11s9R_JnEMohDE6ctqzsj092QD22HKXJI/view?usp=sharing Martin
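For readers without access to the linked script, an address histogram of this kind might be computed roughly as follows. This is a sketch under assumptions, not the script referred to in the message: the function name `address_histogram`, the plain list of integer sample addresses as input, and the 4096-byte bucket size are all hypothetical.

```python
from collections import Counter


def address_histogram(addresses, bucket_size=4096):
    """Count sampled instruction addresses per bucket (hypothetical helper).

    addresses: iterable of integer instruction addresses from a profile.
    Returns a dict mapping bucket start address -> number of samples,
    ordered by ascending address.
    """
    counts = Counter(addr // bucket_size for addr in addresses)
    return {bucket * bucket_size: n for bucket, n in sorted(counts.items())}
```

Buckets with many samples mark hot regions of the text segment; a layout that concentrates samples in fewer buckets touches fewer pages.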

Re: [PATCH][RFC] Add new ipa-reorder pass

2019-12-09 Thread Jan Hubicka
> > On the first glance the difference between gcc9 and gcc10 is explained > > by the changes to profile updating. gcc9 makes very small cold > > partitions compared to gcc10. It is very nice that we have a way to > > measure it. I will also check if some of the more important profiling > >

Re: [PATCH][RFC] Add new ipa-reorder pass

2019-12-09 Thread Martin Liška
On 12/9/19 2:03 PM, Jan Hubicka wrote: On 12/9/19 1:14 PM, Martin Liška wrote: Hello. Based on presentation that had Sriraman Tallam at a LLVM conference: https://www.youtube.com/watch?v=DySuXFGmB40 I made a heatmap based on executed instruction addresses. I used $ perf record -F max --

Re: [PATCH][RFC] Add new ipa-reorder pass

2019-12-09 Thread Jan Hubicka
> On 12/9/19 1:14 PM, Martin Liška wrote: > > Hello. > > > > Based on presentation that had Sriraman Tallam at a LLVM conference: > > https://www.youtube.com/watch?v=DySuXFGmB40 > > > > I made a heatmap based on executed instruction addresses. I used > > $ perf record -F max -- ./cc1plus

Re: [PATCH][RFC] Add new ipa-reorder pass

2019-12-09 Thread Martin Liška
On 12/9/19 1:14 PM, Martin Liška wrote: Hello. Based on presentation that had Sriraman Tallam at a LLVM conference: https://www.youtube.com/watch?v=DySuXFGmB40 I made a heatmap based on executed instruction addresses. I used $ perf record -F max -- ./cc1plus -fpreprocessed

Re: [PATCH][RFC] Add new ipa-reorder pass

2019-12-09 Thread Martin Liška
Hello. Based on presentation that had Sriraman Tallam at a LLVM conference: https://www.youtube.com/watch?v=DySuXFGmB40 I made a heatmap based on executed instruction addresses. I used $ perf record -F max -- ./cc1plus -fpreprocessed /home/marxin/Programming/tramp3d/tramp3d-v4.ii and $ perf

Re: [PATCH][RFC] Add new ipa-reorder pass

2019-12-02 Thread Martin Liška
On 12/1/19 11:37 PM, Jan Hubicka wrote: Hi, I was playing with it a bit more and built with -fno-profile-reorder-functions. Here is -fno-profile-reorder-functions compared to first run

Re: [PATCH][RFC] Add new ipa-reorder pass

2019-12-02 Thread Martin Liška
On 12/1/19 11:45 AM, Jan Hubicka wrote: Hi. I'm sending v3 of the patch where I changed: - function.cold sections are properly put into .text.unlikely and not into a .text.sorted.XYZ section I've just finished measurements and I still have the original speed up for tramp3d: Total runs: 10,

Re: [PATCH][RFC] Add new ipa-reorder pass

2019-12-02 Thread Martin Liška
On 12/1/19 12:11 PM, Jan Hubicka wrote: Hi. I'm sending v3 of the patch where I changed: - function.cold sections are properly put into .text.unlikely and not into a .text.sorted.XYZ section I've just finished measurements and I still have the original speed up for tramp3d: Total runs: 10,

Re: [PATCH][RFC] Add new ipa-reorder pass

2019-12-01 Thread Jan Hubicka
Hi, I was playing with it a bit more and built with -fno-profile-reorder-functions. Here is -fno-profile-reorder-functions compared to first run

Re: [PATCH][RFC] Add new ipa-reorder pass

2019-12-01 Thread Jan Hubicka
> > Hi. > > > > I'm sending v3 of the patch where I changed: > > - function.cold sections are properly put into .text.unlikely and > > not into a .text.sorted.XYZ section > > > > I've just finished measurements and I still have the original speed up > > for tramp3d: > > Total runs: 10, before:

Re: [PATCH][RFC] Add new ipa-reorder pass

2019-12-01 Thread Jan Hubicka
> Hi. > > I'm sending v3 of the patch where I changed: > - function.cold sections are properly put into .text.unlikely and > not into a .text.sorted.XYZ section > > I've just finished measurements and I still have the original speed up > for tramp3d: > Total runs: 10, before: 13.92, after:

Re: [PATCH][RFC] Add new ipa-reorder pass

2019-11-29 Thread Martin Liška
Hi. I'm sending v3 of the patch where I changed: - function.cold sections are properly put into .text.unlikely and not into a .text.sorted.XYZ section I've just finished measurements and I still have the original speed up for tramp3d: Total runs: 10, before: 13.92, after: 13.82, cmp: 99.219%

Re: [PATCH][RFC] Add new ipa-reorder pass

2019-11-25 Thread Martin Liška
Hello. I'm sending v2 of the patch set based on the discussion I had with Honza. Changes from previous version: - I changed type of edge count from uint32_t to uint64_t. - The algorithm traverses recursively inline clones. - TDF_DUMP_DETAILS is supported and provides more information. - I added

Re: [PATCH][RFC] Add new ipa-reorder pass

2019-10-07 Thread Jan Hubicka
> > This is why we currently have a way to order functions when outputting them > > and use that with FDO (using Martin's first execution logic). This has the > > drawback of making the functions flow in that order through late > > optimizations and RTL backend and thus we lose IPA-RA and some > > IP

Re: [PATCH][RFC] Add new ipa-reorder pass

2019-10-07 Thread Richard Biener
On Sun, Oct 6, 2019 at 4:38 PM Jan Hubicka wrote: > > > On 9/19/19 2:33 AM, Martin Liška wrote: > > > Hi. > > > > > > Function reordering has been around for quite some time and a naive > > > implementation was also part of my diploma thesis some time ago. > > > Currently, the GCC can reorder

Re: [PATCH][RFC] Add new ipa-reorder pass

2019-10-06 Thread Jan Hubicka
> On 9/19/19 2:33 AM, Martin Liška wrote: > > Hi. > > > > Function reordering has been around for quite some time and a naive > > implementation was also part of my diploma thesis some time ago. > > Currently, the GCC can reorder function based on first execution, which > > happens with PGO and

Re: [PATCH][RFC] Add new ipa-reorder pass

2019-10-04 Thread Jeff Law
On 9/19/19 2:33 AM, Martin Liška wrote: > Hi. > > Function reordering has been around for quite some time and a naive > implementation was also part of my diploma thesis some time ago. > Currently, the GCC can reorder function based on first execution, which > happens with PGO and LTO of course.

Re: [PATCH][RFC] Add new ipa-reorder pass

2019-10-03 Thread Andrew Pinski
On Thu, Oct 3, 2019 at 12:46 AM Fangrui Song wrote: > > > On 2019-10-03, Andrew Pinski wrote: > >On Wed, Oct 2, 2019 at 9:52 PM Fangrui Song wrote: > >> > >> On 2019-09-24, Martin Liška wrote: > >> >On 9/19/19 10:33 AM, Martin Liška wrote: > >> >> - One needs modified binutils and I that would

Re: [PATCH][RFC] Add new ipa-reorder pass

2019-10-03 Thread Fangrui Song
On 2019-10-03, Andrew Pinski wrote: On Wed, Oct 2, 2019 at 9:52 PM Fangrui Song wrote: On 2019-09-24, Martin Liška wrote: >On 9/19/19 10:33 AM, Martin Liška wrote: >> - One needs modified binutils and I that would probably require a configure detection. The only way >> which I see is based

Re: [PATCH][RFC] Add new ipa-reorder pass

2019-10-03 Thread Andrew Pinski
On Wed, Oct 2, 2019 at 9:52 PM Fangrui Song wrote: > > On 2019-09-24, Martin Liška wrote: > >On 9/19/19 10:33 AM, Martin Liška wrote: > >> - One needs modified binutils and I that would probably require a > >> configure detection. The only way > >> which I see is based on ld --version. I'm

Re: [PATCH][RFC] Add new ipa-reorder pass

2019-10-02 Thread Fangrui Song
On 2019-09-24, Martin Liška wrote: On 9/19/19 10:33 AM, Martin Liška wrote: - One needs modified binutils and I that would probably require a configure detection. The only way which I see is based on ld --version. I'm planning to make the binutils submission soon. The patch submission

Re: [PATCH][RFC] Add new ipa-reorder pass

2019-09-26 Thread Martin Liška
On 9/25/19 6:36 PM, Evgeny Kudryashov wrote: > On 2019-09-19 11:33, Martin Liška wrote: >> Hi. >> >> Function reordering has been around for quite some time and a naive >> implementation was also part of my diploma thesis some time ago. >> Currently, the GCC can reorder function based on first

Re: [PATCH][RFC] Add new ipa-reorder pass

2019-09-25 Thread Evgeny Kudryashov
On 2019-09-19 11:33, Martin Liška wrote: Hi. Function reordering has been around for quite some time and a naive implementation was also part of my diploma thesis some time ago. Currently, the GCC can reorder function based on first execution, which happens with PGO and LTO of course. Known

Re: [PATCH][RFC] Add new ipa-reorder pass

2019-09-24 Thread Martin Liška
On 9/19/19 10:33 AM, Martin Liška wrote: > - One needs modified binutils and I that would probably require a configure > detection. The only way > which I see is based on ld --version. I'm planning to make the binutils > submission soon. The patch submission link:

[PATCH][RFC] Add new ipa-reorder pass

2019-09-19 Thread Martin Liška
Hi. Function reordering has been around for quite some time and a naive implementation was also part of my diploma thesis some time ago. Currently, the GCC can reorder function based on first execution, which happens with PGO and LTO of course. Known limitation is that the order is preserved only