From: Ian Romanick <ian.d.roman...@intel.com>

shader-db results:

Broadwell and Skylake had similar results. (Skylake shown)
total instructions in shared programs: 15105981 -> 15090997 (-0.10%)
instructions in affected programs: 977852 -> 962868 (-1.53%)
helped: 4531
HURT: 0
helped stats (abs) min: 1 max: 221 x̄: 3.31 x̃: 2
helped stats (rel) min: 0.07% max: 10.00% x̄: 1.83% x̃: 1.33%
95% mean confidence interval for instructions value: -3.49 -3.13
95% mean confidence interval for instructions %-change: -1.86% -1.79%
Instructions are helped.

total cycles in shared programs: 566053975 -> 565827580 (-0.04%)
cycles in affected programs: 14922172 -> 14695777 (-1.52%)
helped: 4373
HURT: 125
helped stats (abs) min: 1 max: 6190 x̄: 54.16 x̃: 28
helped stats (rel) min: 0.01% max: 29.14% x̄: 2.74% x̃: 2.31%
HURT stats (abs)   min: 1 max: 1836 x̄: 83.53 x̃: 44
HURT stats (rel)   min: 0.01% max: 34.36% x̄: 2.85% x̃: 1.64%
95% mean confidence interval for cycles value: -55.94 -44.73
95% mean confidence interval for cycles %-change: -2.64% -2.52%
Cycles are helped.

total spills in shared programs: 11085 -> 11084 (<.01%)
spills in affected programs: 49 -> 48 (-2.04%)
helped: 1
HURT: 0

total fills in shared programs: 23143 -> 23142 (<.01%)
fills in affected programs: 92 -> 91 (-1.09%)
helped: 1
HURT: 0

Haswell
total instructions in shared programs: 13682846 -> 13671160 (-0.09%)
instructions in affected programs: 726095 -> 714409 (-1.61%)
helped: 3478
HURT: 0
helped stats (abs) min: 1 max: 221 x̄: 3.36 x̃: 2
helped stats (rel) min: 0.07% max: 10.64% x̄: 1.94% x̃: 1.47%
95% mean confidence interval for instructions value: -3.58 -3.14
95% mean confidence interval for instructions %-change: -1.99% -1.89%
Instructions are helped.

total cycles in shared programs: 449604795 -> 449445686 (-0.04%)
cycles in affected programs: 11722803 -> 11563694 (-1.36%)
helped: 3310
HURT: 138
helped stats (abs) min: 1 max: 1780 x̄: 49.75 x̃: 18
helped stats (rel) min: <.01% max: 29.13% x̄: 2.21% x̃: 1.72%
HURT stats (abs)   min: 1 max: 258 x̄: 40.31 x̃: 18
HURT stats (rel)   min: <.01% max: 15.30% x̄: 1.79% x̃: 0.79%
95% mean confidence interval for cycles value: -50.65 -41.64
95% mean confidence interval for cycles %-change: -2.12% -1.98%
Cycles are helped.

LOST:   1
GAINED: 0

Ivy Bridge
total instructions in shared programs: 12030700 -> 12023562 (-0.06%)
instructions in affected programs: 638233 -> 631095 (-1.12%)
helped: 2498
HURT: 0
helped stats (abs) min: 1 max: 39 x̄: 2.86 x̃: 2
helped stats (rel) min: 0.05% max: 10.00% x̄: 1.28% x̃: 1.02%
95% mean confidence interval for instructions value: -2.96 -2.75
95% mean confidence interval for instructions %-change: -1.32% -1.24%
Instructions are helped.

total cycles in shared programs: 256243891 -> 256115699 (-0.05%)
cycles in affected programs: 11801287 -> 11673095 (-1.09%)
helped: 2433
HURT: 53
helped stats (abs) min: 1 max: 1348 x̄: 56.44 x̃: 26
helped stats (rel) min: 0.01% max: 24.56% x̄: 1.90% x̃: 1.53%
HURT stats (abs)   min: 1 max: 5364 x̄: 172.36 x̃: 39
HURT stats (rel)   min: 0.03% max: 51.96% x̄: 3.57% x̃: 1.18%
95% mean confidence interval for cycles value: -58.76 -44.38
95% mean confidence interval for cycles %-change: -1.86% -1.70%
Cycles are helped.

Sandy Bridge
total instructions in shared programs: 10831556 -> 10831527 (<.01%)
instructions in affected programs: 3777 -> 3748 (-0.77%)
helped: 17
HURT: 0
helped stats (abs) min: 1 max: 4 x̄: 1.71 x̃: 1
helped stats (rel) min: 0.13% max: 9.52% x̄: 3.58% x̃: 2.73%
95% mean confidence interval for instructions value: -2.27 -1.14
95% mean confidence interval for instructions %-change: -5.14% -2.03%
Instructions are helped.

total cycles in shared programs: 154504015 -> 154503922 (<.01%)
cycles in affected programs: 47336 -> 47243 (-0.20%)
helped: 15
HURT: 8
helped stats (abs) min: 1 max: 81 x̄: 16.13 x̃: 4
helped stats (rel) min: 0.01% max: 10.59% x̄: 1.54% x̃: 0.73%
HURT stats (abs)   min: 14 max: 33 x̄: 18.62 x̃: 17
HURT stats (rel)   min: 0.33% max: 2.78% x̄: 2.42% x̃: 2.71%
95% mean confidence interval for cycles value: -15.10 7.01
95% mean confidence interval for cycles %-change: -1.41% 1.08%
Inconclusive result (value mean confidence interval includes 0).

Iron Lake
total instructions in shared programs: 8207937 -> 8207823 (<.01%)
instructions in affected programs: 9392 -> 9278 (-1.21%)
helped: 48
HURT: 0
helped stats (abs) min: 1 max: 11 x̄: 2.38 x̃: 2
helped stats (rel) min: 0.15% max: 7.02% x̄: 2.02% x̃: 1.46%
95% mean confidence interval for instructions value: -3.05 -1.70
95% mean confidence interval for instructions %-change: -2.47% -1.57%
Instructions are helped.

total cycles in shared programs: 187478820 -> 187478330 (<.01%)
cycles in affected programs: 260626 -> 260136 (-0.19%)
helped: 34
HURT: 11
helped stats (abs) min: 2 max: 134 x̄: 15.41 x̃: 7
helped stats (rel) min: 0.03% max: 5.62% x̄: 0.75% x̃: 0.14%
HURT stats (abs)   min: 2 max: 8 x̄: 3.09 x̃: 2
HURT stats (rel)   min: 0.12% max: 0.86% x̄: 0.32% x̃: 0.28%
95% mean confidence interval for cycles value: -18.75 -3.03
95% mean confidence interval for cycles %-change: -0.86% -0.11%
Cycles are helped.

GM45
total instructions in shared programs: 5047448 -> 5047391 (<.01%)
instructions in affected programs: 4783 -> 4726 (-1.19%)
helped: 24
HURT: 0
helped stats (abs) min: 1 max: 11 x̄: 2.38 x̃: 2
helped stats (rel) min: 0.15% max: 6.78% x̄: 1.95% x̃: 1.41%
95% mean confidence interval for instructions value: -3.36 -1.39
95% mean confidence interval for instructions %-change: -2.58% -1.31%
Instructions are helped.

total cycles in shared programs: 128068274 -> 128068004 (<.01%)
cycles in affected programs: 152576 -> 152306 (-0.18%)
helped: 16
HURT: 7
helped stats (abs) min: 2 max: 134 x̄: 18.50 x̃: 10
helped stats (rel) min: 0.03% max: 5.62% x̄: 0.85% x̃: 0.16%
HURT stats (abs)   min: 2 max: 8 x̄: 3.71 x̃: 4
HURT stats (rel)   min: 0.12% max: 0.86% x̄: 0.34% x̃: 0.28%
95% mean confidence interval for cycles value: -23.99 0.51
95% mean confidence interval for cycles %-change: -1.09% 0.12%
Inconclusive result (value mean confidence interval includes 0).

Signed-off-by: Ian Romanick <ian.d.roman...@intel.com>
---
 src/intel/compiler/brw_fs_nir.cpp | 45 +++++++++++++++++++++++++++++++++++++++
 1 file changed, 45 insertions(+)

diff --git a/src/intel/compiler/brw_fs_nir.cpp b/src/intel/compiler/brw_fs_nir.cpp
index ef4c41da132..191cb9afde8 100644
--- a/src/intel/compiler/brw_fs_nir.cpp
+++ b/src/intel/compiler/brw_fs_nir.cpp
@@ -25,6 +25,7 @@
 #include "brw_fs.h"
 #include "brw_fs_surface_builder.h"
 #include "brw_nir.h"
+#include "nir_search_helpers.h"
 #include "util/u_math.h"
 
 using namespace brw;
@@ -801,6 +802,41 @@ fs_visitor::emit_fsign(const fs_builder &bld, const nir_alu_instr *instr,
    }
 }
 
+/**
+ * Determine whether sources of a nir_op_fmul can be fused with a nir_op_fsign
+ *
+ * Checks the operands of a \c nir_op_fmul to determine whether or not
+ * \c emit_fsign could fuse the multiplication with the \c sign() calculation.
+ *
+ * \param instr      The multiplication instruction
+ *
+ * \param fsign_src  The source of \c instr that may or may not be a
+ *                   \c nir_op_fsign
+ */
+static bool
+can_fuse_fmul_fsign(nir_alu_instr *instr, unsigned fsign_src)
+{
+   assert(instr->op == nir_op_fmul);
+
+   nir_alu_instr *const fsign_instr =
+      nir_src_as_alu_instr(&instr->src[fsign_src].src);
+
+   /* Rules:
+    *
+    * 1. instr->src[fsign_src] must be a nir_op_fsign.
+    * 2. The nir_op_fsign can only be used by this multiplication.
+    * 3. The source that is the nir_op_fsign does not have source modifiers.
+    *    \c emit_fsign only examines the source modifiers of the source of the
+    *    \c nir_op_fsign.
+    *
+    * The nir_op_fsign must also not have the saturate modifier, but steps
+    * have already been taken (in nir_opt_algebraic) to ensure that.
+    */
+   return fsign_instr != NULL && fsign_instr->op == nir_op_fsign &&
+          is_used_once(fsign_instr) &&
+          !instr->src[fsign_src].abs && !instr->src[fsign_src].negate;
+}
+
 void
 fs_visitor::nir_emit_alu(const fs_builder &bld, nir_alu_instr *instr)
 {
@@ -809,6 +845,15 @@ fs_visitor::nir_emit_alu(const fs_builder &bld, nir_alu_instr *instr)
       return;
    }
 
+   if (instr->op == nir_op_fmul) {
+      for (unsigned i = 0; i < 2; i++) {
+         if (can_fuse_fmul_fsign(instr, i)) {
+            emit_fsign(bld, instr, i);
+            return;
+         }
+      }
+   }
+
    struct brw_wm_prog_key *fs_key = (struct brw_wm_prog_key *) this->key;
    fs_inst *inst;
 
-- 
2.14.4

_______________________________________________
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev
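
For readers without a Mesa tree at hand, the three rules in can_fuse_fmul_fsign can be tried out in isolation. The following is only a minimal sketch under stated assumptions: MockOp, MockSrc, MockAlu, can_fuse and find_fusible_source are hypothetical stand-ins invented for illustration, not the real NIR types or helpers, and the use_count field merely approximates what is_used_once checks in the patch.

```cpp
#include <cassert>

// Hypothetical, heavily simplified stand-ins for NIR structures.
// These are NOT the real Mesa definitions; they exist only for this sketch.
enum MockOp { OP_FMUL, OP_FSIGN, OP_FADD };

struct MockAlu;

struct MockSrc {
   const MockAlu *producer = nullptr; // ALU instruction producing the value, if any
   bool abs = false;                  // absolute-value source modifier
   bool negate = false;               // negate source modifier
};

struct MockAlu {
   MockOp op = OP_FADD;
   unsigned use_count = 0; // number of instructions that read the result
   MockSrc src[2];
};

// Mirrors the three rules from the patch comment:
//   1. the selected source must be produced by an fsign,
//   2. that fsign must have exactly one user (this multiplication),
//   3. the multiplication's source must carry no abs/negate modifier.
static bool
can_fuse(const MockAlu &mul, unsigned fsign_src)
{
   assert(mul.op == OP_FMUL);

   const MockAlu *const fsign = mul.src[fsign_src].producer;

   return fsign != nullptr && fsign->op == OP_FSIGN &&
          fsign->use_count == 1 &&
          !mul.src[fsign_src].abs && !mul.src[fsign_src].negate;
}

// Mirrors the loop added to nir_emit_alu: try both sources in order and
// report the first fusible one, or -1 when neither qualifies.
static int
find_fusible_source(const MockAlu &mul)
{
   for (unsigned i = 0; i < 2; i++) {
      if (can_fuse(mul, i))
         return (int) i;
   }

   return -1;
}
```

With a multiplication whose second source is a single-use fsign, find_fusible_source returns 1. Raising the fsign's use_count to 2, or setting negate on that source of the multiplication, makes it return -1, which corresponds to nir_emit_alu falling through to the regular fmul path.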