On Thu, 14 Aug 2014, Yuri Rumyantsev wrote:
> Hi All,
>
> Here is a fix for PR 62011 - remove false dependency for unary
> bit-manipulation instructions for latest BigCore chips (Sandybridge
> and Haswell) by outputting in assembly file zeroing destination
> register before bmi instruction. I checked that performance restored
> for popcnt, lzcnt and tzcnt instructions.
I am not an x86 reviewer, but one thing looks a bit superfluous to me:
> +/* Retirn true if we need to insert before bit-manipulation instruction
note typo ("Retirn" should be "Return")
> + zeroing of its destination register. */
> +bool
> +ix86_avoid_false_dep_for_bm (rtx insn, rtx operands[])
> +{
> + unsigned int regno0;
> + df_ref use;
> + if (!TARGET_AVOID_FALSE_DEP_FOR_BM || optimize_function_for_size_p (cfun))
> + return false;
> + regno0 = true_regnum (operands[0]);
> + /* Check if insn does not use REGNO0. */
> + FOR_EACH_INSN_USE (use, insn)
> + if (regno0 == DF_REF_REGNO (use))
> + return false;
> + return true;
> +}
The loop is there to avoid adding the xor when the destination operand is also
a source operand. It looks like a simpler "reg_or_subregno (operands[0]) ==
reg_or_subregno (operands[1])" check could be used here to detect that case, as
long as the assumption that this is called only for two-operand instructions
holds?
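
To illustrate the idea, here is a rough standalone model in plain C. It is not
GCC code: "struct toy_insn" and both function names are hypothetical stand-ins,
and register numbers are plain integers. It only shows that, for an insn with a
single register source, scanning the uses and comparing the two register
numbers give the same answer. As far as I can tell, the comparison also assumes
operands[1] is a register; a memory source whose address uses the destination
register would be caught by the posted loop but not by the comparison.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for an insn: one destination register and the
   registers it reads.  This is not a GCC data structure.  */
struct toy_insn
{
  unsigned int dest_regno;
  const unsigned int *use_regnos;
  size_t n_uses;
};

/* Models the posted loop: zeroing is wanted only when none of the
   registers read by the insn is the destination register.  */
static bool
needs_zeroing_by_uses (const struct toy_insn *insn)
{
  for (size_t i = 0; i < insn->n_uses; i++)
    if (insn->use_regnos[i] == insn->dest_regno)
      return false;
  return true;
}

/* Models the suggested shortcut for a two-operand insn: zeroing is
   wanted when the destination differs from the single source.  */
static bool
needs_zeroing_two_operand (unsigned int dest_regno, unsigned int src_regno)
{
  return dest_regno != src_regno;
}
```

With one register source the two functions agree; they can differ once the insn
reads more registers than operands[1] names.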
Alexander