On 11/26/13 13:30, Steven Bosscher wrote:
On Tue, Nov 26, 2013 at 8:20 PM, Jeff Law wrote:

The jump threading changes have exposed a latent bug on machines with
conditional execution such as the ARM.

Going into the last conditional execution pass we have:

[ ... ]
(insn 16 60 17 2 (set (reg:CC 100 cc)
         (compare:CC (reg:SI 1 r1 [121])
             (const_int 0 [0]))) j.c:14 226 {*arm_cmpsi_insn}
      (expr_list:REG_DEAD (reg:SI 1 r1 [121])
         (nil)))
(jump_insn 17 16 19 2 (set (pc)
         (if_then_else (ne (reg:CC 100 cc)
                 (const_int 0 [0]))
             (label_ref 25)
             (pc))) j.c:14 235 {arm_cond_branch}
      (expr_list:REG_DEAD (reg:CC 100 cc)
         (int_list:REG_BR_PROB 5000 (nil)))
  -> 25)
;;  succ:       4 [50.0%]
;;              3 [50.0%]  (FALLTHRU)


; basic block 3, loop depth 0, count 0, freq 7500, maybe hot
;; Invalid sum of incoming frequencies 5000, should be 7500
;;  prev block 2, next block 4, flags: (RTL)
;;  pred:       2 [50.0%]  (FALLTHRU)
(note 19 17 18 3 [bb 3] NOTE_INSN_BASIC_BLOCK)
(note/s 18 19 20 3 ("lab2") NOTE_INSN_DELETED_LABEL 3)
;;  succ:


(note/s 20 18 21 ("lab") NOTE_INSN_DELETED_LABEL 4)
(barrier 21 20 25)


;; basic block 4, loop depth 0, count 0, freq 0
;; Invalid sum of incoming frequencies 5000, should be 0
;;  prev block 3, next block 5, flags: (RTL)
;;  pred:       2 [50.0%]
(code_label 25 21 26 4 2 "" [1 uses])
(note 26 25 27 4 [bb 4] NOTE_INSN_BASIC_BLOCK)
;;  succ:       5 [100.0%]  (FALLTHRU)


;; basic block 5, loop depth 0, count 0, freq 2500
;;  prev block 4, next block 1, flags: (RTL)
;;  pred:       4 [100.0%]  (FALLTHRU)
;;              5 [100.0%]  (DFS_BACK)
(code_label 27 26 28 5 5 "" [1 uses])
(note 28 27 56 5 [bb 5] NOTE_INSN_BASIC_BLOCK)
(jump_insn 56 28 57 5 (set (pc)
         (label_ref 27)) 249 {*arm_jump}
      (nil)
  -> 27)
;;  succ:       5 [100.0%]  (DFS_BACK)

(barrier 57 56 58)
(note 58 57 0 NOTE_INSN_DELETED)


Note carefully BB3 which is the result of a __builtin_unreachable in the
original source.  It has no code, no successors and a trailing barrier (all
as expected).

ce3 comes along and creates this mess:

[ ... ]

(insn 16 60 18 2 (set (reg:CC 100 cc)
         (compare:CC (reg:SI 1 r1 [121])
             (const_int 0 [0]))) j.c:14 226 {*arm_cmpsi_insn}
      (expr_list:REG_DEAD (reg:SI 1 r1 [121])
         (nil)))
(note/s 18 16 20 2 ("lab2") NOTE_INSN_DELETED_LABEL 3)
;;  succ:       5 [100.0%]  (FALLTHRU)

(note/s 20 18 21 ("lab") NOTE_INSN_DELETED_LABEL 4)
(barrier 21 20 27)

;; basic block 5, loop depth 0, count 0, freq 2500
;; Invalid sum of incoming frequencies 12500, should be 2500
;;  prev block 2, next block 1, flags: (RTL)
;;  pred:       2 [100.0%]  (FALLTHRU)
;;              5 [100.0%]  (DFS_BACK)
(code_label 27 21 28 5 5 "" [1 uses])
(note 28 27 56 5 [bb 5] NOTE_INSN_BASIC_BLOCK)
(jump_insn 56 28 57 5 (set (pc)
         (label_ref 27)) 249 {*arm_jump}
      (nil)
  -> 27)
;;  succ:       5 [100.0%]  (DFS_BACK)

(barrier 57 56 58)
(note 58 57 0 NOTE_INSN_DELETED)


Note how it's removed the conditional jump at the end of bb2.  BB2 still has
an outgoing edge, to BB5, but there's a barrier after BB2.

This, of course, trips a checking failure.

I've been running into more of these.

<rant>
BARRIERs should not exist as long as we have a CFG. The CFG has all
the information.
</rant>

This is one of my goals for the next stage1.
Excellent. BARRIERS are from a era before we had a CFG. I will lose no sleep when they're gone.




ISTM that when we merge blocks due to if-conversion of this code of code,
we're always going to have an outgoing edge from the merged block (sanity
check, right?)

Something along these lines I think.  However, I'm not at all familiar with
this code, so some sanity checking would be greatly appreciated.

I'm surprised if-conversion applies here at all. You basically have an
incomplete diamond and that's only partially handled by ifcvt.
Yea. I was surprised too, but then realized if-converting a block with no-successors is still useful.

I have a hack here (bootstrapped and tested even) which filters out this case and just leaves things alone. But after pondering it overnight I felt that code was just papering over the real problem.

There's nothing that says that an if-converted block has to have successors. For example, it might be an if-converted block that ends in a call to abort, which gets turned into a conditional call to abort.



There is this comment:

  /* If the THEN block has no successors, conditional execution can still
      make a conditional call.  Don't do this unless the ELSE block has
      only one incoming edge -- the CFG manipulation is too ugly otherwise.
      Check for the last insn of the THEN block being an indirect jump, which
      is listed as not having any successors, but confuses the rest of the CE
      code processing.  ??? we should fix this in the future.  */

and it is assumed that there is a noreturn call in the block without
successors. In other words, the THEN block is really a dead end in the
CFG and it cannot be merged away unless the noreturn call is
if-converted away. In the case of builtin_unreachable, obviously there
isn't a call, but I'm guessing your test case goes through this path
anyway.
It does. That's precisely where I put my hack to just avoid optimizing this case.


I believe the proper fix would be to not recognize this as an
if-conversion block candidate in cond_exec_find_if_block.
That's easy enough to do, but leaves a fair amount of useless cruft in the IL and ultimately the resulting code. If instead we recognize this case and remove the BARRIER appropriately, all the cruft just disappears.

jeff

Reply via email to