I've tested merging of nested branches with icc, and it seems that icc
merges the branches even when the inner condition contains code that might
trap, i.e. it applies a more aggressive optimization.

Way_.cpp:

struct waymapt
{
  int fillnum;
  int num;
};

typedef waymapt* waymappt;

class wayobj
{
public:
  int boundl;
  waymappt waymap;
  int makebound2(int fillnum, int iters);
};

int wayobj::makebound2(int fillnum, int iters)
{
  for (int i = 0; i < iters; i++)
    {
      if (waymap[i].fillnum != fillnum)
        if (waymap[i].num != 0)
          boundl++;
    }
  return boundl;
}

Compile command line:

icpc -c -o Way_.o -g -O3 Way_.cpp

The instructions generated:

cmp    (%r11,%r9,1),%esi
setne  %bpl
xor    %ecx,%ecx
cmpl   $0x0,0x4(%r11,%r9,1)
setne  %cl
and    %ecx,%ebp
cmp    $0x1,%ebp
jne    49 <_ZN6wayobj10makebound2Eii+0x49>

ywgrit <yw987194...@gmail.com> wrote on Sun, Jul 20, 2025 at 11:11:

> Can we add a -merge-branch option to merge branch bbs when the programmer
> can ensure that the inner branch bb will not trap?
> Also, the current ifcombine pass can only merge very simple nested
> branches, and if statements usually generate multiple gimple statements, so
> a lot of merge opportunities are lost. For example, the hotspot function in
> SPEC CPU 2006's 473.astar program contains two nested branches. We did an
> experiment in this environment: gcc-12.3.0, Linux 5.15.0, Intel Core
> i7-10750H. Compared to generating two branch instructions, compiling the
> nested branches of the hotspot function into one branch instruction gives
> a 30% improvement in performance.
> If there are indirect accesses in the if statement, the branch prediction
> is probably not accurate, so I think it's important to maximize the chances
> of merging as much as possible, e.g. by adding a -merge-branch option as
> described above.
>
> Richard Biener <rguent...@suse.de> wrote on Fri, Jul 18, 2025 at 22:37:
>
>> On Fri, 18 Jul 2025, ywgrit wrote:
>>
>> > For now, the ifcombine pass can combine simple nested comparison
>> > branches, e.g.
>> > if (a != b)
>> >   if (c == d)
>> > These cond bbs must have only the conditional, which is too harsh.
>> >
>> > We often meet code like this:
>> > if (a != b)
>> >   if (m[index] == k[index])
>> > m and k are arrays, so the 2nd branch belongs to a bb that has mem_ref
>> > gimples and these stmts could trap. So these stmts won't pass the
>> > bb_no_side_effects_p check, the branches can't be merged and performance
>> > gains are lost. What is the way to merge these branch bbs?
>> > I think there are extremely many such nested branches and probably the
>> > prediction accuracy of such nested branches is not very high, so doing
>> > branch merging will result in high performance gain.
>>
>> Without actual data I do not believe such general claim. But the issue
>> is that we cannot speculate the loads from m[index] or k[index] when
>> they might trap, so there is no way to merge the branches.
>>
>> Intel APX introduces conditional moves that hide traps, so with that
>> you could do
>>
>>   flag = a != b;
>>   cmov<flag> m[index], reg1
>>   cmov<flag> k[index], reg2
>>   if (flag && reg1 == reg2)
>>
>> but there is no way to do this in ifcombine on GIMPLE. It would
>> also be slower in case if (a != b) is well predicted and mostly
>> false.
>>
>> Richard.
>>
>
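
For reference, here is a source-level sketch of the transformation discussed
in this thread, written against the waymapt struct from Way_.cpp. The
function names count_nested and count_merged are hypothetical and exist only
for illustration; they are not taken from 473.astar or from any compiler.
The merged form performs both loads unconditionally and combines the two
comparison results with a non-short-circuit '&', which roughly mirrors the
setne/setne/and sequence in the icc output. It is only equivalent to the
nested form under the assumption that the second load cannot trap.

```cpp
struct waymapt { int fillnum; int num; };

// Nested form: waymap[i].num is loaded only when the first test passes,
// so the loop body contains two conditional branches.
int count_nested(const waymapt* waymap, int fillnum, int iters) {
    int boundl = 0;
    for (int i = 0; i < iters; i++) {
        if (waymap[i].fillnum != fillnum)
            if (waymap[i].num != 0)
                boundl++;
    }
    return boundl;
}

// Merged form: both fields are loaded unconditionally and the two
// comparison results are combined with a bitwise '&' (no short circuit),
// leaving a single conditional in the loop body.
// ASSUMPTION: reading waymap[i].num is safe even when the first test
// fails; the programmer has to guarantee this.
int count_merged(const waymapt* waymap, int fillnum, int iters) {
    int boundl = 0;
    for (int i = 0; i < iters; i++) {
        bool c1 = waymap[i].fillnum != fillnum;
        bool c2 = waymap[i].num != 0;
        if (c1 & c2)
            boundl++;
    }
    return boundl;
}
```

Both functions return the same count for any array in which every element is
readable. Whether the merged form is actually faster depends on how well the
first branch is predicted, as Richard points out above.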