https://gcc.gnu.org/bugzilla/show_bug.cgi?id=78319

--- Comment #8 from prathamesh3492 at gcc dot gnu.org ---
(In reply to Richard Biener from comment #7)
> (In reply to prathamesh3492 from comment #6)
> > (In reply to Richard Biener from comment #5)
> > > It's a matter of costs (here BRANCH_COST and its uses in fold and 
> > > ifcombine).
> > > 
> > > You don't mention what IL differences your patch causes (I'll check soon
> > > myself).
> > The difference caused by r249195 is the following in forwprop dump on
> > cortex-m7:
> > 
> > before:
> > <bb 2>:
> >   _1 = n_20(D) != 0;
> >   _2 = m_21(D) != 0;
> >   _3 = _1 | _2;
> >   if (_3 != 0)
> >     goto <bb 4>;
> >   else
> >     goto <bb 3>;
> > 
> > after:
> > <bb 2>:
> >   _1 = n_20(D) != 0;
> >   _2 = m_21(D) != 0;
> >   _25 = n_20(D) | m_21(D);
> >   if (_25 != 0)
> >     goto <bb 4>;
> >   else
> >     goto <bb 3>;
> > 
> > _3 = _1 | _2 is replaced by _25 = n_20(D) | m_21(D)
> 
> But there are still uses of _1 (or _2) left, right?
> 
> So it might be that tree-ssa-uninit.c simply needs to be taught
> that (X | Y) != 0 means X != 0 || Y != 0.  I think this is done
> in normalize_preds though it looks like it is already handled
> to some extent.  But I see
> 
> [BEFORE NORMALIZATION --[USE]:
> blah (v_30);
> is guarded by :
> 
>  (.NOT.) _1 != 0 (.AND.) _5 != 0
> 
> 
> [AFTER NORMALIZATION -- [USE]:
> blah (v_30);
> is guarded by :
> 
> _5 != 0 (.AND.)  (.NOT.) _1 != 0
> 
> while
> 
> [BEFORE NORMALIZATION --[DEF]:
> v_30 = PHI <r_12(D)(5), v_29(6)>
> is guarded by :
> 
> m_11(D) != 0
> (.OR.)
> _1 != 0
> (.OR.)
> _2 != 0
> 
> 
> [AFTER NORMALIZATION -- [DEF]:
> v_30 = PHI <r_12(D)(5), v_29(6)>
> is guarded by :
> 
> m_11(D) != 0
> (.OR.)
> m_11(D) != 0
> (.OR.)
> n_10(D) != 0
> (.OR.)
> l_13(D) != 0
> (.OR.)
> r_12(D) != 0
> 
> so somehow it expands it in one case but not in the other.
> 
> Can you investigate?
I will give it a try.
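
For reference, the equivalence in question can be written out as a small C
sketch. The function names below are hypothetical and only mirror the shape
of the two forwprop dumps quoted above; this is not code from GCC or from the
testcase. It assumes plain int operands.

```c
/* Shape of the guard before r249195: OR of two comparison results,
   as in  _1 = n != 0;  _2 = m != 0;  _3 = _1 | _2;  if (_3 != 0).  */
static int guard_before(int n, int m)
{
    int t1 = (n != 0);
    int t2 = (m != 0);
    return (t1 | t2) != 0;
}

/* Shape of the guard after r249195: one comparison of the bitwise OR,
   as in  _25 = n | m;  if (_25 != 0).  */
static int guard_after(int n, int m)
{
    return (n | m) != 0;
}

/* The predicate form the uninit analysis would want to recover:
   (n | m) != 0 holds exactly when n != 0 || m != 0.
   Returns 1 when all three forms agree for the given inputs.  */
static int guards_agree(int n, int m)
{
    int expanded = (n != 0 || m != 0);
    return guard_before(n, m) == expanded
        && guard_after(n, m) == expanded;
}
```

Since n | m has a bit set exactly when n or m has one, guards_agree should
return 1 for every pair of inputs, which is why the new form can in principle
be expanded back into the two separate predicates.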
> 
> > forwprop dump before:
> > http://pastebin.com/vdTs1B0V
> > 
> > forwprop dump after:
> > http://pastebin.com/XuYVGG0z
> > > For the issue at hand I suggest to XFAIL for affected architectures.
> > Ok thanks, I will xfail this test on arm-none-eabi.
> > Ideally I would like to xfail only for cortex-m7 (and not other sub-targets).
> > Is it possible to check which sub-target is in effect with dejagnu?
> 
> I don't think so.
Ok thanks, in that case I will XFAIL it on arm-none-eabi.

Thanks,
Prathamesh
> 
> > Thanks,
> > Prathamesh
> > > 
> > > Generally the late uninit pass needs a rewrite to be conservative (make its
> > > data-flow compute must-be-may-uninitialized rather than erring on the false
> > > positive side when its analysis gives up).
> > > 
> > > A good research project would be to write an IPA static analysis pass that
> > > performs at least some trivial "optimization" itself (constant folding
> > > and propagation) but does not do any IL changes.
