Re: RFC [1/2] divmod transform

Richard Biener Wed, 25 May 2016 00:24:01 -0700

On Tue, 24 May 2016, Prathamesh Kulkarni wrote:

> On 24 May 2016 at 19:39, Richard Biener <rguent...@suse.de> wrote:
> > On Tue, 24 May 2016, Prathamesh Kulkarni wrote:
> >
> >> On 24 May 2016 at 17:42, Richard Biener <rguent...@suse.de> wrote:
> >> > On Tue, 24 May 2016, Prathamesh Kulkarni wrote:
> >> >
> >> >> On 23 May 2016 at 17:35, Richard Biener <richard.guent...@gmail.com> 
> >> >> wrote:
> >> >> > On Mon, May 23, 2016 at 10:58 AM, Prathamesh Kulkarni
> >> >> > <prathamesh.kulka...@linaro.org> wrote:
> >> >> >> Hi,
> >> >> >> I have updated my patch for divmod (attached), which was originally
> >> >> >> based on Kugan's patch.
> >> >> >> The patch transforms stmts with code TRUNC_DIV_EXPR and 
> >> >> >> TRUNC_MOD_EXPR
> >> >> >> having same operands to divmod representation, so we can cse 
> >> >> >> computation of mod.
> >> >> >>
> >> >> >> t1 = a TRUNC_DIV_EXPR b;
> >> >> >> t2 = a TRUNC_MOD_EXPR b
> >> >> >> is transformed to:
> >> >> >> complex_tmp = DIVMOD (a, b);
> >> >> >> t1 = REALPART_EXPR (complex_tmp);
> >> >> >> t2 = IMAGPART_EXPR (complex_tmp);
> >> >> >>
> >> >> >> * New hook divmod_expand_libfunc
> >> >> >> The rationale for introducing the hook is that different targets have
> >> >> >> incompatible calling conventions for divmod libfunc.
> >> >> >> Currently three ports define divmod libfunc: c6x, spu and arm.
> >> >> >> c6x and spu follow the convention of libgcc2.c:__udivmoddi4:
> >> >> >> return quotient and store remainder in argument passed as pointer,
> >> >> >> while the arm version takes two arguments and returns both
> >> >> >> quotient and remainder having mode double the size of the operand 
> >> >> >> mode.
> >> >> >> The port should hence override the hook expand_divmod_libfunc
> >> >> >> to generate call to target-specific divmod.
> >> >> >> Ports should define this hook if:
> >> >> >> a) The port does not have divmod or div insn for the given mode.
> >> >> >> b) The port defines divmod libfunc for the given mode.
> >> >> >> The default hook default_expand_divmod_libfunc() generates call
> >> >> >> to libgcc2.c:__udivmoddi4 provided the operands are unsigned and
> >> >> >> are of DImode.
> >> >> >>
> >> >> >> Patch passes bootstrap+test on x86_64-unknown-linux-gnu and
> >> >> >> cross-tested on arm*-*-*.
> >> >> >> Bootstrap+test in progress on arm-linux-gnueabihf.
> >> >> >> Does this patch look OK ?
> >> >> >
> >> >> > diff --git a/gcc/targhooks.c b/gcc/targhooks.c
> >> >> > index 6b4601b..e4a021a 100644
> >> >> > --- a/gcc/targhooks.c
> >> >> > +++ b/gcc/targhooks.c
> >> >> > @@ -1965,4 +1965,31 @@ default_optab_supported_p (int, machine_mode,
> >> >> > machine_mode, optimization_type)
> >> >> >    return true;
> >> >> >  }
> >> >> >
> >> >> > +void
> >> >> > +default_expand_divmod_libfunc (bool unsignedp, machine_mode mode,
> >> >> > +                              rtx op0, rtx op1,
> >> >> > +                              rtx *quot_p, rtx *rem_p)
> >> >> >
> >> >> > functions need a comment.
> >> >> >
> >> >> > ISTR it was suggested that ARM change to libgcc2.c__udivmoddi4 style? 
> >> >> >  In that
> >> >> > case we could avoid the target hook.
> >> >> Well I would prefer adding the hook because that's more easier -;)
> >> >> Would it be ok for now to go with the hook ?
> >> >> >
> >> >> > +      /* If target overrides expand_divmod_libfunc hook
> >> >> > +        then perform divmod by generating call to the target-specifc 
> >> >> > divmod
> >> >> > libfunc.  */
> >> >> > +      if (targetm.expand_divmod_libfunc != 
> >> >> > default_expand_divmod_libfunc)
> >> >> > +       return true;
> >> >> > +
> >> >> > +      /* Fall back to using libgcc2.c:__udivmoddi4.  */
> >> >> > +      return (mode == DImode && unsignedp);
> >> >> >
> >> >> > I don't understand this - we know optab_libfunc returns non-NULL for 
> >> >> > 'mode'
> >> >> > but still restrict this to DImode && unsigned?  Also if
> >> >> > targetm.expand_divmod_libfunc
> >> >> > is not the default we expect the target to handle all modes?
> >> >> Ah indeed, the check for DImode is unnecessary.
> >> >> However I suppose the check for unsignedp should be there,
> >> >> since we want to generate call to __udivmoddi4 only if operand is 
> >> >> unsigned ?
> >> >
> >> > The optab libfunc for sdivmod should be NULL in that case.
> >> Ah indeed, thanks.
> >> >
> >> >> >
> >> >> > That said - I expected the above piece to be simply a 'return true;' 
> >> >> > ;)
> >> >> >
> >> >> > Usually we use some can_expand_XXX helper in optabs.c to query if the 
> >> >> > target
> >> >> > supports a specific operation (for example SImode divmod would use 
> >> >> > DImode
> >> >> > divmod by means of widening operands - for the unsigned case of 
> >> >> > course).
> >> >> Thanks for pointing out. So if a target does not support divmod
> >> >> libfunc for a mode
> >> >> but for a wider mode, then we could zero-extend operands to the 
> >> >> wider-mode,
> >> >> perform divmod on the wider-mode, and then cast result back to the
> >> >> original mode.
> >> >> I haven't done that in this patch, would it be OK to do that as a 
> >> >> follow up ?
> >> >
> >> > I think that you should conservatively handle the div_optab query, thus 
> >> > if
> >> > the target has a HW division in a wider mode don't use the divmod IFN.
> >> > You'd simply iterate over GET_MODE_WIDER_MODE and repeat the
> >> > if (optab_handler (div_optab, mode) != CODE_FOR_nothing) check, bailing
> >> > out if that is available.
> >> Done.
> >> >
> >> >> > +  /* Disable the transform if either is a constant, since
> >> >> > division-by-constant
> >> >> > +     may have specialized expansion.  */
> >> >> > +  if (TREE_CONSTANT (op1) || TREE_CONSTANT (op2))
> >> >> > +    return false;
> >> >> >
> >> >> > please use CONSTANT_CLASS_P (op1) || CONSTANT_CLASS_P (op2)
> >> >> >
> >> >> > +  if (TYPE_OVERFLOW_TRAPS (type))
> >> >> > +    return false;
> >> >> >
> >> >> > why's that?  Generally please first test cheap things (trapping, 
> >> >> > constant-ness)
> >> >> > before checking expensive stuff (target_supports_divmod_p).
> >> >> I added TYPE_OVERFLOW_TRAPS (type) based on your suggestion in:
> >> >> https://www.mail-archive.com/gcc@gcc.gnu.org/msg78534.html
> >> >> "When looking at TRUNC_DIV_EXPR you should also exclude
> >> >> the case where TYPE_OVERFLOW_TRAPS (type) as that should
> >> >> expand using the [su]divv optabs (no trapping overflow
> >> >> divmod optab exists)."
> >> >
> >> > Ok, didn't remember that.
> >> >
> >> >> >
> >> >> > +static bool
> >> >> > +convert_to_divmod (gassign *stmt)
> >> >> > +{
> >> >> > +  if (!divmod_candidate_p (stmt))
> >> >> > +    return false;
> >> >> > +
> >> >> > +  tree op1 = gimple_assign_rhs1 (stmt);
> >> >> > +  tree op2 = gimple_assign_rhs2 (stmt);
> >> >> > +
> >> >> > +  vec<gimple *> stmts = vNULL;
> >> >> >
> >> >> > use an auto_vec <gimple *> - you currently leak it in at least one 
> >> >> > place.
> >> >> >
> >> >> > +      if (maybe_clean_or_replace_eh_stmt (use_stmt, use_stmt))
> >> >> > +       cfg_changed = true;
> >> >> >
> >> >> > note that this suggests you should check whether any of the stmts may 
> >> >> > throw
> >> >> > internally as you don't update / transfer EH info correctly.  So for 
> >> >> > 'stmt' and
> >> >> > all 'use_stmt' check stmt_can_throw_internal and if so do not add it 
> >> >> > to
> >> >> > the list of stmts to modify.
> >> >> >
> >> >> > Btw, I think you should not add 'stmt' immediately but when iterating 
> >> >> > over
> >> >> > all uses also gather uses in TRUNC_MOD_EXPR.
> >> >> >
> >> >> > Otherwise looks ok.
> >> >> Done changes in this version. I am gathering mod uses same time as div 
> >> >> uses,
> >> >> so this imposes a constraint that mod dominates mod. I am not sure if
> >> >> that's desirable.
> >> >
> >> > I think you also need a mod_seen variable now that you don't necessarily
> >> > end up with 'stmt' in the vector of stmts.  I don't see how there is a
> >> > constraint that mod dominates mod - it's just that the top_stmt needs
> >> > to dominate all other uses that can be replaced with replacing top_stmt
> >> > with a divmod.  It's just that the actual stmt set we choose may now
> >> > depend on immediate uses order which on a second thought is bad
> >> > as immediate uses order could be affected by debug stmts ... hmm.
> >> >
> >> > To avoid this please re-add the code adding 'stmt' to stmts immediately
> >> > and add a use_stmt != stmt check to the immediate use processing loop
> >> > so that we don't end up adding it twice.
> >> Well I wonder what will happen for the following case:
> >> t1 = x / y;
> >> if (cond)
> >>   t2 = x % y;
> >> else
> >>   t3 = x % y;
> >>
> >> Assuming stmt == top_stmt is "t2 = x % y" and use_stmt is "t3 = x % y",
> >> use_stmt will not get added to stmts vector, since top_stmt and
> >> use_stmt are not in same bb,
> >> and bb's containing top_stmt and use_stmt don't dominate each other.
> >> Not sure if this is practical case (I assume fre will hoist mod
> >> outside if-else?)
> >>
> >> Now that we immediately add stmt to stmts vector, I suppose mod_seen
> >> shall not be required ?
> >
> > In that case mod_seen is not needed.  But the situation you say will
> > still happen so I wonder if we'd need a better way of iterating over
> > immediate uses, like first pushing all candidates into a worklist
> > vector and then iterating over that until we find no more candidates.
> >
> > You can then also handle the case of more than one group of stmts
> > (the pass currently doesn't iterate in any particular useful order
> > over BBs).
> IIUC, we want to perform the transform if:
> i) there exists top_stmt with code trunc_div_expr/trunc_mod_expr and
> having same operands as stmt.
> ii) top_stmt dominates all other stmts with code
> trunc_div_expr/trunc_mod_expr and having same operands as top_stmt.
> 
> Firstly, we try to get to top_stmt if it exists, by iterating over uses of 
> stmt,
> and then iterate over all uses of top_stmt and add them to stmts vector
> only if top_stmt dominates all the stmts with same operands as top_stmt
> and have code trunc_div_expr/trunc_mod_expr.
> 
> /* Get to top_stmt.  */
> top_stmt = stmt;
> top_bb = gimple_bb (stmt);
> 
> FOR_EACH_IMM_USE_STMT (use_stmt, use_iter, op1)
> {
>   if (use_stmt code is TRUNC_DIV_EXPR or TRUNC_MOD_EXPR
>       && use_stmt has same operands as stmt)
>     {
>       if (gimple_bb (use_stmt) dominates top_bb)
>         {
>           top_bb = gimple_bb (use_stmt);
>           top_stmt = use_stmt;
>         }
>       else if (gimple_bb (use_stmt) == top_stmt
>                && gimple_uid (use_stmt) < top_stmt)
>         top_stmt = use_stmt;
>     }
> }
> 
> /* Speculatively consider top_stmt as dominating all other
> div_expr/mod_expr stmts with same operands as stmt.  */
> stmts.safe_push (top_stmt);
> 
> /* Walk uses of top_stmt to ensure that all stmts are dominated by top_stmt.  
> */
> top_op1 = gimple_assign_rhs1 (top_stmt);
> FOR_EACH_IMM_USE_STMT (use_stmt, use_iter, top_op1)
> {
>   if (use_stmt code is TRUNC_DIV_EXPR or TRUNC_MOD_EXPR
>       && use_stmt has same operands as top_stmt)
>     {
>       if (use_stmt == top_stmt)
>         continue;
> 
>       /* No top_stmt exits, do not proceed with transform  */
>       if (top_bb does not dominate gimple_bb (use_stmt))
>         return false;
> 
>       stmts.safe_push (use_stmt);
>     }
> }
> 
> For the case:
> t1 = x / y;
> if (cond)
>   t2 = x % y;
> else
>   t3 = x % y;
> 
> Assuming stmt is "t2 = x % y", it will walk uses of stmt and set
> top_stmt to "t1 = x / y"
> Then it will walk all uses of top_stmt:
> "t2 = x % y" -> dominated by top_stmt
> "t3 = x % y" -> dominated by top_stmt
> Since all stmts are dominated by top_stmt, we add all three stmts to
> vector of stmts and proceed with transform.
> 
> For the case where, top_stmt dominates original stmt but not all stmts:
> 
> if (cond)
>   t1 = x / y;
> else
> {
>   t2 = x % y;
>   return;
> }
> 
> t3 = x % y;
> 
> Assuming stmt is "t3 = x % y",
> Walking stmt uses will set top_stmt to "t1 = x / y";
> 
> Walking immediate uses of top_stmt, we find that "t2 = x % y" is not
> dominated by top_stmt,
> and hence don't do the transform.
> 
> Does this sound reasonable ?


Yes, that's reasonable.

Richard.

> Thanks,
> Prathamesh
> >
> > Richard.
> >
> 
> 

-- 
Richard Biener <rguent...@suse.de>
SUSE LINUX GmbH, GF: Felix Imendoerffer, Jane Smithard, Graham Norton, HRB 
21284 (AG Nuernberg)

Re: RFC [1/2] divmod transform

Reply via email to