Hi All, Here is slightly modified patch which limits a number of conditional moves instead of changing conditional branch cost. This is in fact a work-around for very poor cost model which needs to be enhanced to evaluate cost of conditional move that could be greater then cost of ordinary move (for some targets). This fix did not show any performance regressions on different x86 platforms in comparison with James patch.
Bootstrap and regression testing did not show any new failures. Is it OK for trunk? ChangeLog: 2015-12-31 Yuri Rumyantsev <ysrum...@gmail.com> PR rtl-optimization/68920 * config/i386/i386.c (ix86_option_override_internal): Restrict number of conditional moves for RTL if-conversion to 1 for TARGET_ONE_IF_CONV_INSN. * config/i386/i386.h (TARGET_ONE_IF_CONV_INSN): New macros. * config/i386/x86-tune.def (X86_TUNE_ONE_IF_CONV_INSN): New macros. * params.def (PARAM_MAX_RTL_IF_CONVERSION_INSNS) : Introduce new parameter to restirct number of conditional moves for RTL if-conversion. * doc/invoke.texi (max-rtl-if-conversion-insns): Document it. * ifcvt.c (bb_ok_for_noce_convert_multiple_sets): Limit number of conditionl moves. gcc/testsuite/ChangeLog * gcc.dg/ifcvt-4.c: Add "--param max-rtl-if-conversion-insns=3" option for ix86 targets. * gcc.dg/ifcvt-5.c: New test. 2015-12-18 17:19 GMT+03:00 James Greenhalgh <james.greenha...@arm.com>: > On Fri, Dec 18, 2015 at 05:09:14PM +0300, Yuri Rumyantsev wrote: >> James, >> >> We implemented slightly different patch - we restrict number of SET >> instructions for if-conversion through new parameter and add check in >> bb_ok_for_noce_convert_multiple_sets: >> >> + unsigned limit = MIN (ii->branch_cost, >> + (unsigned) PARAM_VALUE (PARAM_MAX_IF_CONV_SET_INSNS)); >> .. >> - if (count > ii->branch_cost) >> - return FALSE; >> + if (count > limit) >> + return false; >> >> Now we are testing it for different suites/chips but I saw that real >> benchmark for which we saw huge performance degradation gets speed-ip >> on 65% for -march=slm -m32. > > Yes, that will work too. I put it where I did to give back-ends the choice > to turn off all if-conversion by setting the param to zero. Your fix is more > targeted to fixing just the one regression. I don't have a preference as > to which direction we take this. I saw a similar performance boost for your > testcase on my development box with my patch - so at least we now have two > candidate patches to fix the performance regression. > > Thanks, > James > >> >> 2015-12-18 16:52 GMT+03:00 James Greenhalgh <james.greenha...@arm.com>: >> > >> > Hi, >> > >> > PR68920 talks about undesirable if-conversion in the x86 back-end. >> > The if-conversion cost model simply uses BRANCH_COST (I want to revisit >> > this for GCC 7), but BRANCH_COST is overloaded in the compiler and reducing >> > it causes other optimisations to be disabled. >> > >> > Consequently, to fix the PR we want something new that the target can set >> > to override BRANCH_COST and reduce the number of instructions that get >> > if-converted. This patch adds this mechanism through a param. >> > >> > Bootstrapped on x86_64-none-linux-gnu and aarch64-none-linux-gnu. >> > >> > OK for trunk? >> > >> > Thanks, >> > James >> > >> > --- >> > gcc/ >> > >> > 2015-12-17 James Greenhalgh <james.greenha...@arm.com> >> > >> > PR rtl-optimization/68920 >> > * params.def (PARAM_MAX_RTL_IF_CONVERSION_INSNS): New. >> > * ifcvt.c (noce_find_if_block): Limit by new param. >> > * doc/invoke.texi (max-rtl-if-conversion-insns): Document it. >> > >> > gcc/testsuite/ >> > >> > 2015-12-17 James Greenhalgh <james.greenha...@arm.com> >> > >> > PR rtl-optimization/68920 >> > * gcc.deg/ifcvt-5.c: New. >> > >>
patch
Description: Binary data