Re: Do not imply -fweb with -fpeel-loops
> > > > In tree-SSA world most of reuse is avoided by out-of-ssa pass. Random reuse > > these days is limited to loop unrolling and RTL expansion I think. Without > > unroling webizer renames 204 pseudos on tramp3d that seems close to 0. I can > > try to dig into that. With -funroll-all-loops it renames 17309 registers. > > Of course it would be possible to write unroller specific renamer, but it > > would > > hardly be any wasier than webizer. > > And we have things like -fsplit-ivs-in-unroller that does part of it... That is orthogonal to splitting done by web pass though. On Terbium I see two perofmrance change today which is probably for turning off the webizer again. It is Eon 975->955 and Mesa 810->778. This is w/o unrolling so not that many splits are triggered. I will collect the data with unrolling. Honza > > Richard. > > > Honza
Re: Do not imply -fweb with -fpeel-loops
If the decision is taken to remove -fweb, please give me a heads-up as I have a target-specific pass that shares infrastructure with it. I can factor that out to facilitate the removal; just let me know. Thanks! -- Bill Bill Schmidt, Ph.D. GCC for LInux on Power Linux on Power Toolchain IBM Linux Technology Center wschm...@linux.vnet.ibm.com > On May 31, 2016, at 8:50 AM, Richard Biener> wrote: > > On Tue, May 31, 2016 at 1:42 PM, Jan Hubicka wrote: I would say we should handle -frename-registers same way as we do -fschedule-insns. I.e. enable/disable it on target basis rather than budnle it to loop unrolling. But that is for future patch. I will commit this and propose patch making -fpeel-loops independent of rename-registers next >>> >>> Ok. Does -fweb still do what its documentation says (enable register >>> allocation to >>> work on pseudos directly)? Given its downside and strong GIMPLE >>> optimizations maybe >>> it is time to remove it? >> >> I will wait with comitting the renaming patch and we will see the effect with >> -fno-unroll-loops on ia64 tester (it probably pays back more on in-order >> target) and we can test effect with -funroll-loops by patching it tomorrow. >> >> I guess IRA does live range pslitting by itself. Register allocator is not >> the >> only pass which does not like reuse of pseudo. Web originally also made >> -fschedule-insns (not -fschedule-insns2) and CSE to work harder. >> >> In tree-SSA world most of reuse is avoided by out-of-ssa pass. Random reuse >> these days is limited to loop unrolling and RTL expansion I think. Without >> unroling webizer renames 204 pseudos on tramp3d that seems close to 0. I can >> try to dig into that. With -funroll-all-loops it renames 17309 registers. >> Of course it would be possible to write unroller specific renamer, but it >> would >> hardly be any wasier than webizer. > > And we have things like -fsplit-ivs-in-unroller that does part of it... > > Richard. > >> Honza >
Re: Do not imply -fweb with -fpeel-loops
On Tue, May 31, 2016 at 1:42 PM, Jan Hubickawrote: >> > I would say we should handle -frename-registers same way as we do >> > -fschedule-insns. >> > I.e. enable/disable it on target basis rather than budnle it to loop >> > unrolling. >> > But that is for future patch. I will commit this and propose patch making >> > -fpeel-loops >> > independent of rename-registers next >> >> Ok. Does -fweb still do what its documentation says (enable register >> allocation to >> work on pseudos directly)? Given its downside and strong GIMPLE >> optimizations maybe >> it is time to remove it? > > I will wait with comitting the renaming patch and we will see the effect with > -fno-unroll-loops on ia64 tester (it probably pays back more on in-order > target) and we can test effect with -funroll-loops by patching it tomorrow. > > I guess IRA does live range pslitting by itself. Register allocator is not > the > only pass which does not like reuse of pseudo. Web originally also made > -fschedule-insns (not -fschedule-insns2) and CSE to work harder. > > In tree-SSA world most of reuse is avoided by out-of-ssa pass. Random reuse > these days is limited to loop unrolling and RTL expansion I think. Without > unroling webizer renames 204 pseudos on tramp3d that seems close to 0. I can > try to dig into that. With -funroll-all-loops it renames 17309 registers. > Of course it would be possible to write unroller specific renamer, but it > would > hardly be any wasier than webizer. And we have things like -fsplit-ivs-in-unroller that does part of it... Richard. > Honza
Re: Do not imply -fweb with -fpeel-loops
> > I would say we should handle -frename-registers same way as we do > > -fschedule-insns. > > I.e. enable/disable it on target basis rather than budnle it to loop > > unrolling. > > But that is for future patch. I will commit this and propose patch making > > -fpeel-loops > > independent of rename-registers next > > Ok. Does -fweb still do what its documentation says (enable register > allocation to > work on pseudos directly)? Given its downside and strong GIMPLE > optimizations maybe > it is time to remove it? I will wait with comitting the renaming patch and we will see the effect with -fno-unroll-loops on ia64 tester (it probably pays back more on in-order target) and we can test effect with -funroll-loops by patching it tomorrow. I guess IRA does live range pslitting by itself. Register allocator is not the only pass which does not like reuse of pseudo. Web originally also made -fschedule-insns (not -fschedule-insns2) and CSE to work harder. In tree-SSA world most of reuse is avoided by out-of-ssa pass. Random reuse these days is limited to loop unrolling and RTL expansion I think. Without unroling webizer renames 204 pseudos on tramp3d that seems close to 0. I can try to dig into that. With -funroll-all-loops it renames 17309 registers. Of course it would be possible to write unroller specific renamer, but it would hardly be any wasier than webizer. Honza
Re: Do not imply -fweb with -fpeel-loops
On Tue, May 31, 2016 at 12:49 PM, Jan Hubickawrote: > Hi, > enabling -fpeel-loops with -O3 and -Ofast had unexpected effect of enabling > -fweb and -frename-registers. This patch drop -fweb from list of passes that > are implied by -fpeel-loops because it is no longer needed since we do peeling > at tree instead of RTL and the out-of-ssa pass cares about this. > > I will look into rename registers separately. My understanding is that rename > registers is not really that specific for -funroll-loops or -fpeel-loops. > Loop unrolling may make it bit more useful because it increases expected basic > block size and thus enables scheduler to do more. > > Rather it depends on target whether register renaming is win. On out of order > targets > where scheduling is not that imprtant it does not seem to be worth the > compile time > and code size effects, while on targets that depends on scheduling it does. > Tonight tests on x86 shows improvmeents in botan which I think are related to > renaming > (because they also improved when Bern temporarily enabled it few days back), > measurable code size regression (which is probably fixable if we make > rename-registers > to not increase instruction encoding size) and noticeable compile time > slowdown > (not sure if it is from regrename itself that is quite easy pass or from the > fact > that scheduler has more freedom and thus work harder). > > On the other hand both SPECint and SPECfp improved by about 2% on Itanium and > code size reduced. I think similar effect should happen on superscalar > inorder > tarets with constant size of instruction encoding. > > I would say we should handle -frename-registers same way as we do > -fschedule-insns. > I.e. enable/disable it on target basis rather than budnle it to loop > unrolling. > But that is for future patch. I will commit this and propose patch making > -fpeel-loops > independent of rename-registers next Ok. Does -fweb still do what its documentation says (enable register allocation to work on pseudos directly)? Given its downside and strong GIMPLE optimizations maybe it is time to remove it? Thanks, Richard. > Honza > > Bootstrapped/regtested x86_64-linux, will commit it shortly. > > * loop-init.c (gate): Do not enale RTL loop unroller with > -fpeel-loops. > It no longer does that. > * toplev.c (process_options): Do not enable flag_web with > -fpeel-loops. > Index: loop-init.c > === > --- loop-init.c (revision 236894) > +++ loop-init.c (working copy) > @@ -560,7 +560,7 @@ public: >/* opt_pass methods: */ >virtual bool gate (function *) > { > - return (flag_peel_loops || flag_unroll_loops || flag_unroll_all_loops); > + return (flag_unroll_loops || flag_unroll_all_loops); > } > >virtual unsigned int execute (function *); > Index: toplev.c > === > --- toplev.c(revision 236894) > +++ toplev.c(working copy) > @@ -1296,7 +1296,7 @@ process_options (void) > >/* web and rename-registers help when run after loop unrolling. */ >if (flag_web == AUTODETECT_VALUE) > -flag_web = flag_unroll_loops || flag_peel_loops; > +flag_web = flag_unroll_loops; > >if (flag_rename_registers == AUTODETECT_VALUE) > flag_rename_registers = flag_unroll_loops || flag_peel_loops;