https://gcc.gnu.org/bugzilla/show_bug.cgi?id=125557
--- Comment #10 from ktkachov at gcc dot gnu.org --- (In reply to Drea Pinski from comment #9) > (In reply to Drea Pinski from comment #8) > > (In reply to ktkachov from comment #7) > > > (In reply to Drea Pinski from comment #6) > > > > Note the final ifcvt part needs to have a decent cost model; right now > > > > your > > > > patches don't have one and it is not obvious if it is always better > > > > especially on targets which don't have a cmov (or cmov like > > > > instruction; e.g > > > > riscv's czero.nez/czero.eqz). We need much more time on figuring that > > > > part > > > > out. Either we should improve ifcvt or have something in the last > > > > phiopt. > > > > Having it as part of sink breaks what sink is doing. > > > > > > Yeah, that makes sense. With your latest patches at > > > https://gcc.gnu.org/pipermail/gcc-patches/2026-June/720191.html we do get > > > the right factoring on Snappy. RTL ifcvt then does the final ifcvt but > > > only > > > at -O2. At -O3 -fsplit-paths messes up the basic blocks. > > > > If split paths is messing it up, then poor_ifcvt_pred in > > gimple-ssa-split-paths.cc should be tweaked more; I added this function in > > 2024 (r15-3436-gb2b20b277988ab) to stop split pathes from splitting > > ifcvtable things because it was doing exactly that. > > Or split paths need to be moved later after phiopt. split paths has > definitely been a trouble pass. It was also originally moved from rtl to > gimple and when that was done it was moved right after the loop > optimizations but before ifcvt. It was moved there because the idea was > there could be some cleanups done after the patch was split. But I am not so > sure. > > Anyways see PR 68541, PR 112402, PR 120892. > > See https://gcc.gnu.org/bugzilla/show_bug.cgi?id=120892#c3 which talks about > maybe removing it. > > Maybe you can do some benchmarking of removing it? and if your data backs it > up; I think proposing to removing it might be a good thing. So it turns out -fsplit-paths messes up the if-conversion in the reduced testcase here but doesn't do anything on the full Snappy workload. The if-conversion there doesn't happen there for other reasons. I'll see if I can extract a new testcase for that. But it looks like we'll still need to teach something in the pipeline to do the if-conversion, be it RTL ifcvt or some phiopt pass. I did gather some data for aarch64 on SPEC2026 for -fsplit-paths and -fno-split-paths For a 1-iteration run at -O3 its effect seems to be in the noise, I am rerunning for more iterations to confirm. The static data is: * Successful Duplicating join block: 61,708 * declines due to spoiling ifcvt: 37,335 * declines due to no CSE/DCE/jump-thread opportunity: 21,620 * declines due to too many stmts: 1,955 * Overall increase in instructions: 0.401% * Overall increase in .text: 0.342% I'll try with LTO as well and get 3-iterations results too.
