On Tue, 24 May 2016, Prathamesh Kulkarni wrote: > On 24 May 2016 at 19:39, Richard Biener <rguent...@suse.de> wrote: > > On Tue, 24 May 2016, Prathamesh Kulkarni wrote: > > > >> On 24 May 2016 at 17:42, Richard Biener <rguent...@suse.de> wrote: > >> > On Tue, 24 May 2016, Prathamesh Kulkarni wrote: > >> > > >> >> On 23 May 2016 at 17:35, Richard Biener <richard.guent...@gmail.com> > >> >> wrote: > >> >> > On Mon, May 23, 2016 at 10:58 AM, Prathamesh Kulkarni > >> >> > <prathamesh.kulka...@linaro.org> wrote: > >> >> >> Hi, > >> >> >> I have updated my patch for divmod (attached), which was originally > >> >> >> based on Kugan's patch. > >> >> >> The patch transforms stmts with code TRUNC_DIV_EXPR and > >> >> >> TRUNC_MOD_EXPR > >> >> >> having same operands to divmod representation, so we can cse > >> >> >> computation of mod. > >> >> >> > >> >> >> t1 = a TRUNC_DIV_EXPR b; > >> >> >> t2 = a TRUNC_MOD_EXPR b > >> >> >> is transformed to: > >> >> >> complex_tmp = DIVMOD (a, b); > >> >> >> t1 = REALPART_EXPR (complex_tmp); > >> >> >> t2 = IMAGPART_EXPR (complex_tmp); > >> >> >> > >> >> >> * New hook divmod_expand_libfunc > >> >> >> The rationale for introducing the hook is that different targets have > >> >> >> incompatible calling conventions for divmod libfunc. > >> >> >> Currently three ports define divmod libfunc: c6x, spu and arm. > >> >> >> c6x and spu follow the convention of libgcc2.c:__udivmoddi4: > >> >> >> return quotient and store remainder in argument passed as pointer, > >> >> >> while the arm version takes two arguments and returns both > >> >> >> quotient and remainder having mode double the size of the operand > >> >> >> mode. > >> >> >> The port should hence override the hook expand_divmod_libfunc > >> >> >> to generate call to target-specific divmod. > >> >> >> Ports should define this hook if: > >> >> >> a) The port does not have divmod or div insn for the given mode. > >> >> >> b) The port defines divmod libfunc for the given mode. > >> >> >> The default hook default_expand_divmod_libfunc() generates call > >> >> >> to libgcc2.c:__udivmoddi4 provided the operands are unsigned and > >> >> >> are of DImode. > >> >> >> > >> >> >> Patch passes bootstrap+test on x86_64-unknown-linux-gnu and > >> >> >> cross-tested on arm*-*-*. > >> >> >> Bootstrap+test in progress on arm-linux-gnueabihf. > >> >> >> Does this patch look OK ? > >> >> > > >> >> > diff --git a/gcc/targhooks.c b/gcc/targhooks.c > >> >> > index 6b4601b..e4a021a 100644 > >> >> > --- a/gcc/targhooks.c > >> >> > +++ b/gcc/targhooks.c > >> >> > @@ -1965,4 +1965,31 @@ default_optab_supported_p (int, machine_mode, > >> >> > machine_mode, optimization_type) > >> >> > return true; > >> >> > } > >> >> > > >> >> > +void > >> >> > +default_expand_divmod_libfunc (bool unsignedp, machine_mode mode, > >> >> > + rtx op0, rtx op1, > >> >> > + rtx *quot_p, rtx *rem_p) > >> >> > > >> >> > functions need a comment. > >> >> > > >> >> > ISTR it was suggested that ARM change to libgcc2.c__udivmoddi4 style? > >> >> > In that > >> >> > case we could avoid the target hook. > >> >> Well I would prefer adding the hook because that's more easier -;) > >> >> Would it be ok for now to go with the hook ? > >> >> > > >> >> > + /* If target overrides expand_divmod_libfunc hook > >> >> > + then perform divmod by generating call to the target-specifc > >> >> > divmod > >> >> > libfunc. */ > >> >> > + if (targetm.expand_divmod_libfunc != > >> >> > default_expand_divmod_libfunc) > >> >> > + return true; > >> >> > + > >> >> > + /* Fall back to using libgcc2.c:__udivmoddi4. */ > >> >> > + return (mode == DImode && unsignedp); > >> >> > > >> >> > I don't understand this - we know optab_libfunc returns non-NULL for > >> >> > 'mode' > >> >> > but still restrict this to DImode && unsigned? Also if > >> >> > targetm.expand_divmod_libfunc > >> >> > is not the default we expect the target to handle all modes? > >> >> Ah indeed, the check for DImode is unnecessary. > >> >> However I suppose the check for unsignedp should be there, > >> >> since we want to generate call to __udivmoddi4 only if operand is > >> >> unsigned ? > >> > > >> > The optab libfunc for sdivmod should be NULL in that case. > >> Ah indeed, thanks. > >> > > >> >> > > >> >> > That said - I expected the above piece to be simply a 'return true;' > >> >> > ;) > >> >> > > >> >> > Usually we use some can_expand_XXX helper in optabs.c to query if the > >> >> > target > >> >> > supports a specific operation (for example SImode divmod would use > >> >> > DImode > >> >> > divmod by means of widening operands - for the unsigned case of > >> >> > course). > >> >> Thanks for pointing out. So if a target does not support divmod > >> >> libfunc for a mode > >> >> but for a wider mode, then we could zero-extend operands to the > >> >> wider-mode, > >> >> perform divmod on the wider-mode, and then cast result back to the > >> >> original mode. > >> >> I haven't done that in this patch, would it be OK to do that as a > >> >> follow up ? > >> > > >> > I think that you should conservatively handle the div_optab query, thus > >> > if > >> > the target has a HW division in a wider mode don't use the divmod IFN. > >> > You'd simply iterate over GET_MODE_WIDER_MODE and repeat the > >> > if (optab_handler (div_optab, mode) != CODE_FOR_nothing) check, bailing > >> > out if that is available. > >> Done. > >> > > >> >> > + /* Disable the transform if either is a constant, since > >> >> > division-by-constant > >> >> > + may have specialized expansion. */ > >> >> > + if (TREE_CONSTANT (op1) || TREE_CONSTANT (op2)) > >> >> > + return false; > >> >> > > >> >> > please use CONSTANT_CLASS_P (op1) || CONSTANT_CLASS_P (op2) > >> >> > > >> >> > + if (TYPE_OVERFLOW_TRAPS (type)) > >> >> > + return false; > >> >> > > >> >> > why's that? Generally please first test cheap things (trapping, > >> >> > constant-ness) > >> >> > before checking expensive stuff (target_supports_divmod_p). > >> >> I added TYPE_OVERFLOW_TRAPS (type) based on your suggestion in: > >> >> https://www.mail-archive.com/gcc@gcc.gnu.org/msg78534.html > >> >> "When looking at TRUNC_DIV_EXPR you should also exclude > >> >> the case where TYPE_OVERFLOW_TRAPS (type) as that should > >> >> expand using the [su]divv optabs (no trapping overflow > >> >> divmod optab exists)." > >> > > >> > Ok, didn't remember that. > >> > > >> >> > > >> >> > +static bool > >> >> > +convert_to_divmod (gassign *stmt) > >> >> > +{ > >> >> > + if (!divmod_candidate_p (stmt)) > >> >> > + return false; > >> >> > + > >> >> > + tree op1 = gimple_assign_rhs1 (stmt); > >> >> > + tree op2 = gimple_assign_rhs2 (stmt); > >> >> > + > >> >> > + vec<gimple *> stmts = vNULL; > >> >> > > >> >> > use an auto_vec <gimple *> - you currently leak it in at least one > >> >> > place. > >> >> > > >> >> > + if (maybe_clean_or_replace_eh_stmt (use_stmt, use_stmt)) > >> >> > + cfg_changed = true; > >> >> > > >> >> > note that this suggests you should check whether any of the stmts may > >> >> > throw > >> >> > internally as you don't update / transfer EH info correctly. So for > >> >> > 'stmt' and > >> >> > all 'use_stmt' check stmt_can_throw_internal and if so do not add it > >> >> > to > >> >> > the list of stmts to modify. > >> >> > > >> >> > Btw, I think you should not add 'stmt' immediately but when iterating > >> >> > over > >> >> > all uses also gather uses in TRUNC_MOD_EXPR. > >> >> > > >> >> > Otherwise looks ok. > >> >> Done changes in this version. I am gathering mod uses same time as div > >> >> uses, > >> >> so this imposes a constraint that mod dominates mod. I am not sure if > >> >> that's desirable. > >> > > >> > I think you also need a mod_seen variable now that you don't necessarily > >> > end up with 'stmt' in the vector of stmts. I don't see how there is a > >> > constraint that mod dominates mod - it's just that the top_stmt needs > >> > to dominate all other uses that can be replaced with replacing top_stmt > >> > with a divmod. It's just that the actual stmt set we choose may now > >> > depend on immediate uses order which on a second thought is bad > >> > as immediate uses order could be affected by debug stmts ... hmm. > >> > > >> > To avoid this please re-add the code adding 'stmt' to stmts immediately > >> > and add a use_stmt != stmt check to the immediate use processing loop > >> > so that we don't end up adding it twice. > >> Well I wonder what will happen for the following case: > >> t1 = x / y; > >> if (cond) > >> t2 = x % y; > >> else > >> t3 = x % y; > >> > >> Assuming stmt == top_stmt is "t2 = x % y" and use_stmt is "t3 = x % y", > >> use_stmt will not get added to stmts vector, since top_stmt and > >> use_stmt are not in same bb, > >> and bb's containing top_stmt and use_stmt don't dominate each other. > >> Not sure if this is practical case (I assume fre will hoist mod > >> outside if-else?) > >> > >> Now that we immediately add stmt to stmts vector, I suppose mod_seen > >> shall not be required ? > > > > In that case mod_seen is not needed. But the situation you say will > > still happen so I wonder if we'd need a better way of iterating over > > immediate uses, like first pushing all candidates into a worklist > > vector and then iterating over that until we find no more candidates. > > > > You can then also handle the case of more than one group of stmts > > (the pass currently doesn't iterate in any particular useful order > > over BBs). > IIUC, we want to perform the transform if: > i) there exists top_stmt with code trunc_div_expr/trunc_mod_expr and > having same operands as stmt. > ii) top_stmt dominates all other stmts with code > trunc_div_expr/trunc_mod_expr and having same operands as top_stmt. > > Firstly, we try to get to top_stmt if it exists, by iterating over uses of > stmt, > and then iterate over all uses of top_stmt and add them to stmts vector > only if top_stmt dominates all the stmts with same operands as top_stmt > and have code trunc_div_expr/trunc_mod_expr. > > /* Get to top_stmt. */ > top_stmt = stmt; > top_bb = gimple_bb (stmt); > > FOR_EACH_IMM_USE_STMT (use_stmt, use_iter, op1) > { > if (use_stmt code is TRUNC_DIV_EXPR or TRUNC_MOD_EXPR > && use_stmt has same operands as stmt) > { > if (gimple_bb (use_stmt) dominates top_bb) > { > top_bb = gimple_bb (use_stmt); > top_stmt = use_stmt; > } > else if (gimple_bb (use_stmt) == top_stmt > && gimple_uid (use_stmt) < top_stmt) > top_stmt = use_stmt; > } > } > > /* Speculatively consider top_stmt as dominating all other > div_expr/mod_expr stmts with same operands as stmt. */ > stmts.safe_push (top_stmt); > > /* Walk uses of top_stmt to ensure that all stmts are dominated by top_stmt. > */ > top_op1 = gimple_assign_rhs1 (top_stmt); > FOR_EACH_IMM_USE_STMT (use_stmt, use_iter, top_op1) > { > if (use_stmt code is TRUNC_DIV_EXPR or TRUNC_MOD_EXPR > && use_stmt has same operands as top_stmt) > { > if (use_stmt == top_stmt) > continue; > > /* No top_stmt exits, do not proceed with transform */ > if (top_bb does not dominate gimple_bb (use_stmt)) > return false; > > stmts.safe_push (use_stmt); > } > } > > For the case: > t1 = x / y; > if (cond) > t2 = x % y; > else > t3 = x % y; > > Assuming stmt is "t2 = x % y", it will walk uses of stmt and set > top_stmt to "t1 = x / y" > Then it will walk all uses of top_stmt: > "t2 = x % y" -> dominated by top_stmt > "t3 = x % y" -> dominated by top_stmt > Since all stmts are dominated by top_stmt, we add all three stmts to > vector of stmts and proceed with transform. > > For the case where, top_stmt dominates original stmt but not all stmts: > > if (cond) > t1 = x / y; > else > { > t2 = x % y; > return; > } > > t3 = x % y; > > Assuming stmt is "t3 = x % y", > Walking stmt uses will set top_stmt to "t1 = x / y"; > > Walking immediate uses of top_stmt, we find that "t2 = x % y" is not > dominated by top_stmt, > and hence don't do the transform. > > Does this sound reasonable ?
Yes, that's reasonable. Richard. > Thanks, > Prathamesh > > > > Richard. > > > > -- Richard Biener <rguent...@suse.de> SUSE LINUX GmbH, GF: Felix Imendoerffer, Jane Smithard, Graham Norton, HRB 21284 (AG Nuernberg)