On Sat, 19 Jul 2025, Andrew Pinski wrote:

> On Sat, Jul 19, 2025 at 8:41 PM ywgrit via Gcc <gcc@gcc.gnu.org> wrote:
> >
> > I've tested merging for nested branches on icc, and it seems that icc does
> > a branch merge for code that might trap, making a more aggressive
> > optimization.
> 
> So it is not exactly that it might trap, but rather that the load is
> part of a bigger struct and it is an adjacent location.
> We do some of this already in phiopt; see hoist_adjacent_loads in
> tree-ssa-phiopt.cc, which handles a similar but different case.
> It was added here originally
> (https://inbox.sourceware.org/gcc-patches/1336055636.22269.16.camel@gnopaine/).
> Seems like similar code could be added for ifcombine. I thought I saw
> some improvements dealing with load handling that happened for GCC 15.

Not for this particular case I think, though we could implement
this as

*(unsigned long *)(&waymap[i].fillnum) != (unsigned long)fillnum << 32

This is what the enhancements were about.  Can you open a bug report
for tracking that ifcombine might benefit from a non-trap enhancement
like phiopt has for adjacent load hoisting?
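
To make the transformation under discussion concrete, here is a minimal
source-level sketch (not GCC code, and not how ifcombine is implemented).
It assumes the waymapt layout from the test case below; the function
names count_nested, count_merged and check_same are made up for
illustration.  The merged form loads .num unconditionally, which is only
safe under the assumption that .num sits next to .fillnum in the same
object, so its load cannot trap where the first load did not.

```cpp
#include <cassert>

// Same layout as waymapt in the test case: two adjacent ints.
struct waymapt {
    int fillnum;
    int num;
};

// Nested shape: the load of .num is guarded by the first comparison.
int count_nested(const waymapt *waymap, int fillnum, int iters) {
    int boundl = 0;
    for (int i = 0; i < iters; i++) {
        if (waymap[i].fillnum != fillnum)
            if (waymap[i].num != 0)
                boundl++;
    }
    return boundl;
}

// Merged shape: both fields are loaded up front and the two comparison
// results are combined with a non-short-circuit '&', leaving one branch.
int count_merged(const waymapt *waymap, int fillnum, int iters) {
    int boundl = 0;
    for (int i = 0; i < iters; i++) {
        int f = waymap[i].fillnum;
        int n = waymap[i].num;  // speculated load of the adjacent field
        if ((f != fillnum) & (n != 0))
            boundl++;
    }
    return boundl;
}

// Hypothetical helper: asserts that both shapes agree on one input.
void check_same(const waymapt *waymap, int fillnum, int iters) {
    assert(count_nested(waymap, fillnum, iters) ==
           count_merged(waymap, fillnum, iters));
}
```

Both functions return the same count for any valid array; the only
difference is that count_merged always reads both fields.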

Thanks,
Richard.

> 
> Thanks,
> Andrew
> 
> 
> 
> > Way_.cpp
> > struct waymapt
> > {
> >   int fillnum;
> >   int num;
> > };
> > typedef waymapt* waymappt;
> >
> > class wayobj
> > {
> > public:
> >     int boundl;
> >     waymappt waymap;
> >     int makebound2(int fillnum, int iters);
> > };
> >
> > int wayobj::makebound2(int fillnum, int iters)
> > {
> >   for (int i = 0; i < iters; i++)
> >   {
> >     if (waymap[i].fillnum!=fillnum)
> >       if (waymap[i].num!=0)
> >         boundl++;
> >   }
> >   return boundl;
> > }
> >
> > compile commandline
> > icpc -c -o Way_.o -g -O3 Way_.cpp
> >
> > The instructions generated
> > cmp    (%r11,%r9,1),%esi
> > setne  %bpl
> > xor    %ecx,%ecx
> > cmpl   $0x0,0x4(%r11,%r9,1)
> > setne  %cl
> > and    %ecx,%ebp
> > cmp    $0x1,%ebp
> > jne    49 <_ZN6wayobj10makebound2Eii+0x49>
> >
> > ywgrit <yw987194...@gmail.com> wrote on Sun, Jul 20, 2025 at 11:11:
> >
> > > Can we add a -merge-branch option to merge branch bbs when the programmer
> > > can ensure that the inner branch bb will not trap?
> > > Also, the current ifcombine pass can only merge very simple nested
> > > branches, and if statements usually generate multiple gimple statements,
> > > so a lot of merge opportunities are lost. For example, the hotspot
> > > function in SPEC CPU 2006's 473.astar program contains two nested
> > > branches. We ran an experiment in this environment: gcc-12.3.0,
> > > Linux 5.15.0, Intel Core i7-10750H. Compared to generating two branch
> > > instructions, compiling the nested branches of the hotspot function
> > > into one branch instruction gave a 30% improvement in performance.
> > > If there are indirect accesses in the if statement, the branch prediction
> > > is probably not accurate, so I think it's important to maximize the
> > > chances of merging, e.g. by adding a -merge-branch option as described
> > > above.
> > >
> > > Richard Biener <rguent...@suse.de> wrote on Fri, Jul 18, 2025 at 22:37:
> > >
> > >> On Fri, 18 Jul 2025, ywgrit wrote:
> > >>
> > >> > For now, the ifcombine pass can combine simple nested comparison
> > >> > branches, e.g.
> > >> > if (a != b)
> > >> >   if (c == d)
> > >> > These cond bbs must contain only the conditional, which is too strict.
> > >> >
> > >> > We often meet code like this:
> > >> > if (a != b)
> > >> >   if (m[index] == k[index])
> > >> > m and k are arrays, so the 2nd branch belongs to a bb that has mem_ref
> > >> > gimples, and these stmts could trap. So these stmts won't pass the
> > >> > bb_no_side_effects_p check, the branches can't be merged, and the
> > >> > performance gains are lost. Is there a way to merge these branch bbs?
> > >> > I think there are a great many such nested branches, and the
> > >> > prediction accuracy for them is probably not very high, so
> > >> > branch merging should yield a large performance gain.
> > >>
> > >> Without actual data I do not believe such a general claim.  But the issue
> > >> is that we cannot speculate the loads from m[index] or k[index] when
> > >> they might trap, so there is no way to merge the branches.
> > >>
> > >> Intel APX introduces conditional moves that hide traps, so with that
> > >> you could do
> > >>
> > >>  flag = a != b;
> > >>  cmov<flag> m[index], reg1
> > >>  cmov<flag> k[index], reg2
> > >>  if (flag && reg1 == reg2)
> > >>
> > >> but there is no way to do this in ifcombine on GIMPLE.  It would
> > >> also be slower in case if (a != b) is well predicted and mostly
> > >> false.
> > >>
> > >> Richard.
> > >>
> > >
> 

-- 
Richard Biener <rguent...@suse.de>
SUSE Software Solutions Germany GmbH,
Frankenstrasse 146, 90461 Nuernberg, Germany;
GF: Ivo Totev, Andrew McDonald, Werner Knoblich; (HRB 36809, AG Nuernberg)
