https://gcc.gnu.org/bugzilla/show_bug.cgi?id=116564
Alex Coplan <acoplan at gcc dot gnu.org> changed:
What |Removed |Added
----------------------------------------------------------------------------
CC| |rsandifo at gcc dot gnu.org
--- Comment #5 from Alex Coplan <acoplan at gcc dot gnu.org> ---
So I did some debugging of this issue. I think the problem is a disagreement
between the dce.cc:fast_dce code and the code in
df-problems.cc:df_lr_bb_local_compute. Specifically, they disagree on the
treatment of partial clobbers.
Eliding the large block comment for readability, the code in
df_lr_bb_local_compute does:
FOR_EACH_INSN_INFO_DEF (def, insn_info)
{
unsigned int dregno = DF_REF_REGNO (def);
bitmap_set_bit (&bb_info->def, dregno);
if (DF_REF_FLAGS (def) & (DF_REF_PARTIAL | DF_REF_CONDITIONAL))
bitmap_set_bit (&bb_info->use, dregno);
else
bitmap_clear_bit (&bb_info->use, dregno);
}
for each insn. So in the case of:
(insn 10 8 13 3 (clobber (subreg:V1DF (reg/v:V2x1DF 104 [ __val ]) 8)) -1
(nil))
we treat this is a use of r104, since the corresponding DF def has the
DF_REF_PARTIAL flag set. However, the logic in dce.cc:dce_process_block
(called from fast_dce) has no such provision for partial defs. The logic there
is instead:
FOR_BB_INSNS_REVERSE (bb, insn)
if (DEBUG_INSN_P (insn))
[...]
else if (INSN_P (insn))
{
bool needed = marked_insn_p (insn);
/* The insn is needed if there is someone who uses the output. */
if (!needed)
FOR_EACH_INSN_DEF (def, insn)
if (bitmap_bit_p (local_live, DF_REF_REGNO (def))
|| bitmap_bit_p (au, DF_REF_REGNO (def)))
{
needed = true;
mark_insn (insn, true);
break;
}
/* No matter if the instruction is needed or not, we remove
any regno in the defs from the live set. */
df_simulate_defs (insn, local_live);
/* On the other hand, we do not allow the dead uses to set
anything in local_live. */
if (needed)
df_simulate_uses (insn, local_live);
[...]
}
where df_simulate_defs is just:
void
df_simulate_defs (rtx_insn *insn, bitmap live)
{
df_ref def;
FOR_EACH_INSN_DEF (def, insn)
{
unsigned int dregno = DF_REF_REGNO (def);
/* If the def is to only part of the reg, it does
not kill the other defs that reach here. */
if (!(DF_REF_FLAGS (def) & (DF_REF_PARTIAL | DF_REF_CONDITIONAL)))
bitmap_clear_bit (live, dregno);
}
}
so the main loop in dce_process_block doesn't count i10 above as using r104, it
doesn't make r104 live.
The "did something happen" condition in dce_process_block is:
block_changed = !bitmap_equal_p (local_live, DF_LR_IN (bb));
if (block_changed)
bitmap_copy (DF_LR_IN (bb), local_live);
BITMAP_FREE (local_live);
return block_changed;
so we compare what we computed locally against the DF_LR_IN set. Due to the
above-mentioned discrepancy in the treatment of partial clobbers, for the above
testcase, DF_LR_IN has r104 set, but local_live doesn't, hence we update
DF_LR_IN to remove r104, and return true.
Eventually we enter the if (global_changed) block in fast_dce, which does:
/* We do not need to rescan any instructions. We only need
to redo the dataflow equations for the blocks that had a
change at the top of the block. Then we need to redo the
iteration. */
if (word_level)
df_analyze_problem (df_word_lr, all_blocks, postorder, n_blocks);
else
df_analyze_problem (df_lr, all_blocks, postorder, n_blocks);
which ultimately calls df_lr_bb_local_compute again, thus re-adding r104 to the
DF_LR_IN sets. This cycle causes us to spin indefinitely (removing r104 in
fast_dce and having it re-added by df_analyze_problem ->
df_lr_bb_local_compute).
I wonder if the code in df_simulate_defs ought not to be updated to match the
treatment in df_lr_bb_local_compute (i.e. to set the live bit for partial
defs). CCing Richard S as it looks like he last touched the relevant code in
df_lr_bb_local_compute.