On 09/04/2015 09:19 AM, Robert Suchanek wrote:

It appears that a possibly related issue is already reported to Bugzilla (bug 
51513)
where the branch is not optimized away, leaving the compare and branch 
instructions.

It would also appear that this should be fixed at the tree level, however, 
handling
a special case when a branch is generated, pointing to 'nowhere' block, is 
probably
more suitable.
Fixing the unnecessary branch may be something worth tackling at the tree level. I'd have to sit down with the dumps for a few minutes to be sure.

This is one of the reasons why I really dislike __builtin_unreachable. I much prefer __builtin_trap so that if by some impossible situation we get into this code we get an immediate fault rather than wandering off in the instruction stream executing whatever happens to be there.

However, I think that addressing the unnecessary branching in 51513 is independent of fixing reorg.c.



Without the fix, the generated assembly is following:

        ...
        addiu   $2,$3,-3
        sltu    $4,$2,33
        .set    noreorder
        .set    nomacro
        beq     $4,$0,$L22
        lw      $4,%lo(mips_cm_is64)($4)
        .set    macro
        .set    reorder

        sll     $4,$2,2
        ...

The load instruction should not be placed in the delay slot as it is a part of
the taken branch with an undefined behaviour. The 'sll' was expected to be in 
the slot.

This is a series of unfortunate events that causes this to happen:
1. After the expansion, the 'nowhere' block is placed after the switch 
statement and
    behaves as a fallthrough case with a diamond shape.
2. The block reordering pass randomly rearranges the nowhere block and it 
happens to be
    placed before the load instruction.
3. The eager delay slot filler firstly redirects jumps by skipping consecutive 
labels
    (ignoring barriers), pointing directly to the load. The analysis is doomed 
to failure at this
    stage as the own thread is analysed and the shift instruction is rejected 
as it uses
    conflicting resources (most likely because $4 is referenced in the load) 
but the opposite
    thread succeeds partly because the set and needed registers sets do not 
intersect when
    the resources are being computed by mark_target_live_regs ().

IMO, the fix is to recognize the empty basic block that has a code_label 
followed by
a barrier (ignoring notes and debug_insns), forbid going beyond the barrier if 
the empty block
is found in skip_consecutive_labels () and first_active_target_insn ().

The patch was cross tested on mips-img-linux-gnu and sparc-linux-gnu.


Regards,
Robert

gcc/
        * reorg.c (label_with_barrier_p): New function.
        (skip_consecutive_labels): Use it.  Don't skip the label if an empty
        block is found.
        (first_active_target_insn): Likewise.  Don't ignore the empty
        block when searching for the next active instruction.
Unless I'm missing something, I don't think we should treat empty blocks special here. Instead, just stop the search when we hit a BARRIER, regardless of what other instructions (or lack thereof) are in the block.

Jeff


Reply via email to