Re: Would like to make one version of is_too_expensive in gcse.c and cprop.c
On 11/10/2015 07:46 PM, Bradley Lucier wrote: The routines declared as static bool is_too_expensive (const char *pass) in both cprop.c and gcse.c are identical except for two comment lines. I'd like to modify is_too_expensive, which implied to me that there should be only one copy of the routine. Would it be reasonable to add an extern declaration of is_too_expensive (with perhaps a more descriptive name) in gcse.h, remove the static declaration from gcse.c, include gcse.h in cprop.c? (I realize this is a very simple question for regular C programmers, but I'm not a regular C programmer, sorry). Yes, this would be fine. In fact, removing duplicates of this nature is a good thing. jeff
Would like to make one version of is_too_expensive in gcse.c and cprop.c
The routines declared as static bool is_too_expensive (const char *pass) in both cprop.c and gcse.c are identical except for two comment lines. I'd like to modify is_too_expensive, which implied to me that there should be only one copy of the routine. Would it be reasonable to add an extern declaration of is_too_expensive (with perhaps a more descriptive name) in gcse.h, remove the static declaration from gcse.c, include gcse.h in cprop.c? (I realize this is a very simple question for regular C programmers, but I'm not a regular C programmer, sorry). Thanks. Brad
Re: [RFC PR43721] Optimize a/b and a%b to single divmod call
Richard Biener writes: > On Mon, 9 Nov 2015, Prathamesh Kulkarni wrote: >> c) Gating the divmod transform - >> I tried gating it on checks for optab_handlers on div and mod, however >> this doesn't enable transform for arm cortex-a9 >> anymore (cortex-a9 doesn't have hardware instructions for integer div >> and mod). >> IIUC for cortex-a9, >> optab_handler (sdivmod_optab, SImode) returns CODE_FOR_nothing because >> HAVE_divsi3 is 0. >> However optab_handler (smod_optab, SImode) matches since optab_handler >> only checks for existence of pattern >> (and not whether the pattern gets matched). >> I suppose we should enable the transform only if the divmod, div, and >> mod pattern do not match rather than checking >> if the patterns exist via optab_handler ? For a general x % y, modsi3 >> would fail to match but optab_handler(smod_optab, SImode ) still >> says it's matched. > > Ah, of course. Querying for an optab handler is just a cheap > guesstimate... Not sure how to circumvent this best (sub-target > enablement of patterns). RTL expansion just goes ahead (of course) > and sees if expansion eventually fails. Richard? Yeah, FAILs in expanders are kind of awkward these days. The ARM modsi3 could be a bit more helpful by having a predicate that only accepts power-of-2 integers instead of any const_int_operand, which would at least avoid having to generate insns. I did wonder once whether the generators should provide a way of automatically fusing separate .md constructs into a single optab. That might avoid more complicated FAILs and could be used to automatically generate information for the tree level. >> Should we define a new target hook combine_divmod, which returns true >> if transforming to divmod is desirable for that >> target ? >> The default definition could be: >> bool default_combine_divmod (enum machine_mode mode, tree op1, tree op2) >> { >> // check for optab_handlers for div/mod/divmod and libfunc for divmod >> } >> >> And for arm, it could be over-ridden to return false if op2 is >> constant and <= 0 or power of 2. >> I am not really sure if this is a good idea since I am replicating >> information from modsi3 pattern. >> Any change to the pattern may require corresponding change to the hook :/ > > Yeah, I don't think that is desirable. Ideally we'd have a way > to query HAVE_* for CODE_FOR_* which would mean target-insns.def > support for all div/mod/divmod patterns(?) and queries... > > Not sure if what would be enough though. This kind of stuff should really go through optabs rather than target-insns.def. target-insns.def is more for cases where there's no distinction between operand modes. > Note that the divmod check is equally flawed. > > I think with the above I'd enable the transform when > > + if (optab_handler (divmod_optab, mode) != CODE_FOR_nothing > + || (optab_libfunc (divmod_optab, mode) != NULL_RTX >&& optab_handler ([su]div_optab, mode) == CODE_FOR_nothing)) > +return false; > > so we either will have a divmod instruction (hopefully not sub-target > disabled for us) or a libfunc for divmod and for sure no HW divide > instruction (HW mod can be emulated by HW divide but not the other > way around). Sounds good to me FWIW. Thanks, Richard
gcc-5-20151110 is now available
Snapshot gcc-5-20151110 is now available on ftp://gcc.gnu.org/pub/gcc/snapshots/5-20151110/ and on various mirrors, see http://gcc.gnu.org/mirrors.html for details. This snapshot has been generated from the GCC 5 SVN branch with the following options: svn://gcc.gnu.org/svn/gcc/branches/gcc-5-branch revision 230137 You'll find: gcc-5-20151110.tar.bz2 Complete GCC MD5=7e1495974bb879abaff0a72fb11f3a49 SHA1=3943321002bbcdaf0f024fbd533ec387f7a6a82e Diffs from 5-20151103 are available in the diffs/ subdirectory. When a particular snapshot is ready for public consumption the LATEST-5 link is updated and a message is sent to the gcc list. Please do not use a snapshot before it has been announced that way.
Re: [RFC PR43721] Optimize a/b and a%b to single divmod call
On Mon, 9 Nov 2015, Prathamesh Kulkarni wrote: > On 4 November 2015 at 20:35, Richard Biener wrote: > > > > Btw, did you investigate code gen differences on x86_64/i586? That > > target expands all divisions/modulo ops via divmod, relying on CSE > > solely as the HW always computes both div and mod (IIRC). > x86_64 has optab_handler for divmod defined, so the transform won't > take place on x86. Ok. > > + > > +gassign *assign_stmt = gimple_build_assign (gimple_assign_lhs > > (use_stmt), rhs); > > +gimple_stmt_iterator gsi = gsi_for_stmt (use_stmt); > > > > Ick. Please use > > > > gimple_set_rhs_from_tree (use_stmt, res); > Um there doesn't seem to be gimple_set_rhs_from_tree. > I used gimple_assign_set_rhs_from_tree which requires gsi for use_stmt. > Is that OK ? Yes. > > update_stmt (use_stmt); > > if (maybe_clean_or_replace_eh_stmt (use_stmt, use_stmt)) > > cfg_changed = true; > > > > + free_dominance_info (CDI_DOMINATORS); > > > > do not free dominators. > > I have done the suggested changes in the attached patch. > I have a few questions: > > a) Does the change to insert DIVMOD call before topmost div or mod > stmt with matching operands > look correct ? + /* Insert call-stmt just before the topmost div/mod stmt. + top_bb dominates all other basic blocks containing div/mod stms + so, the topmost stmt would be the first div/mod stmt with matching operands + in top_bb. */ + + gcc_assert (top_bb != 0); + gimple_stmt_iterator gsi; + for (gsi = gsi_after_labels (top_bb); !gsi_end_p (gsi); gsi_next (&gsi)) +{ + gimple *g = gsi_stmt (gsi); + if (is_gimple_assign (g) + && (gimple_assign_rhs_code (g) == TRUNC_DIV_EXPR +|| gimple_assign_rhs_code (g) == TRUNC_MOD_EXPR) + && operand_equal_p (op1, gimple_assign_rhs1 (g), 0) + && operand_equal_p (op2, gimple_assign_rhs2 (g), 0)) + break; Looks overly complicated to me. Just remember "topmost" use_stmt alongside top_bb (looks like you'll no longer need top_bb if you retail top_stmt). And then do gsi = gsi_for_stmt (top_stmt); and insert before that. > b) Handling constants - I dropped handling constants in the attached > patch. IIUC we don't want > to enable this transform if there's a specialized expansion for some > constants for div or mod ? See expand_divmod which has lots of special cases for constant operands not requiring target support for div or mod. > I suppose this would also be target dependent and require a target hook ? > For instance arm defines modsi3 pattern to expand mod when 2nd operand > is constant and <= 0 or power of 2, > while for other cases goes the expand_divmod() route to generate call > to __aeabi_idivmod libcall. Ok, so it lacks a signed mod instruction. > c) Gating the divmod transform - > I tried gating it on checks for optab_handlers on div and mod, however > this doesn't enable transform for arm cortex-a9 > anymore (cortex-a9 doesn't have hardware instructions for integer div and > mod). > IIUC for cortex-a9, > optab_handler (sdivmod_optab, SImode) returns CODE_FOR_nothing because > HAVE_divsi3 is 0. > However optab_handler (smod_optab, SImode) matches since optab_handler > only checks for existence of pattern > (and not whether the pattern gets matched). > I suppose we should enable the transform only if the divmod, div, and > mod pattern do not match rather than checking > if the patterns exist via optab_handler ? For a general x % y, modsi3 > would fail to match but optab_handler(smod_optab, SImode ) still > says it's matched. Ah, of course. Querying for an optab handler is just a cheap guesstimate... Not sure how to circumvent this best (sub-target enablement of patterns). RTL expansion just goes ahead (of course) and sees if expansion eventually fails. Richard? > Should we define a new target hook combine_divmod, which returns true > if transforming to divmod is desirable for that > target ? > The default definition could be: > bool default_combine_divmod (enum machine_mode mode, tree op1, tree op2) > { > // check for optab_handlers for div/mod/divmod and libfunc for divmod > } > > And for arm, it could be over-ridden to return false if op2 is > constant and <= 0 or power of 2. > I am not really sure if this is a good idea since I am replicating > information from modsi3 pattern. > Any change to the pattern may require corresponding change to the hook :/ Yeah, I don't think that is desirable. Ideally we'd have a way to query HAVE_* for CODE_FOR_* which would mean target-insns.def support for all div/mod/divmod patterns(?) and queries... Not sure if what would be enough though. Note that the divmod check is equally flawed. I think with the above I'd enable the transform when + if (optab_handler (divmod_optab, mode) != CODE_FOR_nothing + || (optab_libfunc (divmod_optab, mode) != NULL_RTX && optab_handler ([su]div_optab, mode) == CODE_FOR_nothing)) +return false;
Documentation of new overflow arithmetics patterns
Hi, new patterns to implement overflow arithmetics were added recently (addv4, subv4, mulv4, umulv4, negv3) and they are not documented in the internal manual as far as I can see. Granted, the old patterns (addv3, subv3, mulv3, divv3, negv2) are not documented either but they were not very useful, contrary to the new ones. Any particular reason not to document them? -- Eric Botcazou