On Sat, 19 Jul 2025, Andrew Pinski wrote:

> On Sat, Jul 19, 2025 at 8:41 PM ywgrit via Gcc <gcc@gcc.gnu.org> wrote:
> >
> > I've tested merging for nested branches on icc, and it seems that icc does
> > a branch merge for code that might trap, making a more aggressive
> > optimization.
>
> So it is not exactly it might trap but rather it is part of a bigger
> struct and it is an adjacent location.
> We do some of this already in phiopt see hoist_adjacent_loads in
> tree-ssa-phiopt.cc which handles a similar but different case.
> Added here originally
> (https://inbox.sourceware.org/gcc-patches/1336055636.22269.16.camel@gnopaine/).
> Seems like a similar code could be done for ifcombine. I thought I saw
> some improvements dealing with load handling that happened for GCC 15.

Not for this particular case I think, though we could implement this as

  *(unsigned long *)(&waymap[i].fillnum) != (unsigned long)fillnum << 32

This is what the enhancements were about.

Can you open a bugreport for tracking that ifcombine might benefit from
a non-trap enhancement like phiopt has for adjacent load hoisting?

Thanks,
Richard.

>
> Thanks,
> Andrew
>
> >
> > Way_.cpp
> > struct waymapt
> > {
> >   int fillnum;
> >   int num;
> > };
> > typedef waymapt* waymappt;
> >
> > class wayobj
> > {
> > public:
> >   int boundl;
> >   waymappt waymap;
> >   int makebound2(int fillnum, int iters);
> > };
> >
> > int wayobj::makebound2(int fillnum, int iters)
> > {
> >   for (int i = 0; i < iters; i++)
> >   {
> >     if (waymap[i].fillnum!=fillnum)
> >       if (waymap[i].num!=0)
> >         boundl++;
> >   }
> >   return boundl;
> > }
> >
> > compile commandline
> > icpc -c -o Way_.o -g -O3 Way_.cpp
> >
> > The instructions generated
> > cmp (%r11,%r9,1),%esi
> > setne %bpl
> > xor %ecx,%ecx
> > cmpl $0x0,0x4(%r11,%r9,1)
> > setne %cl
> > and %ecx,%ebp
> > cmp $0x1,%ebp
> > jne 49 <_ZN6wayobj10makebound2Eii+0x49>
> >
> > ywgrit <yw987194...@gmail.com> wrote on Sun, Jul 20, 2025 at 11:11:
> >
> > > Can we add a -merge-branch option to merge branch bbs when the programmer
> > > can ensure that the inner branch bb will not trap?
> > > Also, the current ifcombine pass can only merge very simple nested
> > > branches, and if statements usually generate multiple gimple statements,
> > > so a lot of merge opportunities are lost. For example, the hotspot
> > > function in SPEC CPU 2006's 473.astar program contains two nested
> > > branches. We ran an experiment (environment: gcc-12.3.0, Linux 5.15.0,
> > > Intel Core i7-10750H): compared to generating two branch instructions,
> > > compiling the nested branches of the hotspot function into one branch
> > > instruction gives a 30% improvement in performance.
> > > If there are indirect accesses in the if statement, the branch prediction
> > > is probably not accurate, so I think it's important to maximize the
> > > chances of merging as much as possible, e.g. by adding a -merge-branch
> > > option as described above.
> > >
> > > Richard Biener <rguent...@suse.de> wrote on Fri, Jul 18, 2025 at 22:37:
> > >
> > >> On Fri, 18 Jul 2025, ywgrit wrote:
> > >>
> > >> > For now, the ifcombine pass can combine only simple nested comparison
> > >> > branches, e.g.
> > >> >   if (a != b)
> > >> >     if (c == d)
> > >> > These cond bbs must have only the conditional, which is too harsh.
> > >> >
> > >> > We often meet code like this:
> > >> >   if (a != b)
> > >> >     if (m[index] == k[index])
> > >> > m and k are arrays, so the 2nd branch belongs to a bb that has mem_ref
> > >> > gimples and these stmts could trap. So these stmts won't pass the
> > >> > bb_no_side_effects_p check, the branches can't be merged and
> > >> > performance gains are lost. What is the way to merge these branch bbs?
> > >> > I think there are extremely many such nested branches and probably the
> > >> > prediction accuracy of such nested branches is not very high, so doing
> > >> > branch merging will result in high performance gain.
> > >>
> > >> Without actual data I do not believe such general claim. But the issue
> > >> is that we cannot speculate the loads from m[index] or k[index] when
> > >> they might trap, so there is no way to merge the branches.
> > >>
> > >> Intel APX introduces conditional moves that hide traps, so with that
> > >> you could do
> > >>
> > >>   flag = a != b;
> > >>   cmov<flag> m[index], reg1
> > >>   cmov<flag> k[index], reg2
> > >>   if (flag && reg1 == reg2)
> > >>
> > >> but there is no way to do this in ifcombine on GIMPLE. It would
> > >> also be slower in case if (a != b) is well predicted and mostly
> > >> false.
> > >>
> > >> Richard.
> > >>
> > >
>

-- 
Richard Biener <rguent...@suse.de>
SUSE Software Solutions Germany GmbH,
Frankenstrasse 146, 90461 Nuernberg, Germany;
GF: Ivo Totev, Andrew McDonald, Werner Knoblich; (HRB 36809, AG Nuernberg)
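
For reference, the branch merge discussed in this thread can be sketched at source level. This is only an illustration, not what ifcombine or phiopt emit: the names WayMapEntry, count_nested and count_merged are hypothetical, and the sketch assumes that loading num unconditionally is safe because fillnum, an adjacent field of the same struct object, has already been loaded in that iteration.

```cpp
// Hypothetical stand-in for the waymapt struct quoted in the thread.
struct WayMapEntry {
    int fillnum;
    int num;
};

// Nested form, mirroring the quoted makebound2 loop: `num` is loaded
// only when the first comparison succeeds, so there are two branches.
int count_nested(const WayMapEntry* map, int iters, int fillnum) {
    int count = 0;
    for (int i = 0; i < iters; i++) {
        if (map[i].fillnum != fillnum)
            if (map[i].num != 0)
                count++;
    }
    return count;
}

// Merged form: both fields are loaded unconditionally and the two
// comparison results are combined with a bitwise AND, leaving a single
// data-dependent condition. Assumption: map[i].num cannot trap once
// map[i].fillnum has been read, since both live in the same object.
int count_merged(const WayMapEntry* map, int iters, int fillnum) {
    int count = 0;
    for (int i = 0; i < iters; i++) {
        const bool differs = map[i].fillnum != fillnum;
        const bool nonzero = map[i].num != 0;
        count += static_cast<int>(differs & nonzero);
    }
    return count;
}
```

Both functions return the same count for any valid input. The second form corresponds to the setne/and sequence in the icc output quoted above, and it is legal only because the second load is known not to fault. That is not known for the m[index] == k[index] case raised earlier in the thread.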