https://bugs.llvm.org/show_bug.cgi?id=44279

            Bug ID: 44279
           Summary: Opportunity for reusing jccs for long branches
           Product: libraries
           Version: trunk
          Hardware: PC
                OS: Linux
            Status: NEW
          Severity: enhancement
          Priority: P
         Component: Backend: X86
          Assignee: unassignedb...@nondot.org
          Reporter: listm...@philipreames.com
                CC: craig.top...@gmail.com, llvm-bugs@lists.llvm.org,
                    llvm-...@redking.me.uk, spatel+l...@rotateright.com

Ran across an interesting opportunity for reducing the code size of
branch-dense code without negatively impacting performance.  The basic notion
is that if we have a bunch of branches which share a branch target, all but one
of them can use a short jump which redirects to a single long jump. (i.e. it's
a variation on branch islands)

(Reminder: short conditional jumps are 2 bytes w/a 1 byte offset, long
conditional jumps are 6 bytes w/a 4 byte offset)

Here's a little motivating IR fragment:
define void @test(i32 %a, i32 %b, i32 %c, i32 %d) {
  %cmp1 = icmp slt i32 %a, %b
  %cmp2 = icmp slt i32 %c, %d
  %and = and i1 %cmp1, %cmp2
  br i1 %and, label %taken, label %untaken, !prof !{!"branch_weights", i64 999999, i64 0}
taken:
  call void @foo()
  ret void
... put code here to ensure this is far away ...
untaken:
  call void @bar()
  ret void
}

declare void @foo()
declare void @bar()


Our lowering for this today will end up looking like this:
       1: 39 f7                         cmpl    %esi, %edi
       3: 0f 8d b4 00 00 00             jge     180 <test+0xbd>
       9: 39 ca                         cmpl    %ecx, %edx
       b: 0f 8d ac 00 00 00             jge     172 <test+0xbd>

That uses two long jumps, for 16 bytes total in the branch dispatch sequence
(two 2-byte compares plus two 6-byte jumps).
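
As a sanity check on the listing above, here is a minimal Python sketch (not
part of the original report) that decodes the two rel32 displacements and
confirms that both jumps land on the same address. The helper name
jcc_rel32_target is made up for illustration; it relies only on the fact that
x86 relative branches are measured from the end of the branch instruction.

```python
import struct


def jcc_rel32_target(addr, encoding):
    """Return the target of a 6-byte long jcc (0f 8x rel32) located at addr."""
    assert len(encoding) == 6
    assert encoding[0] == 0x0F and 0x80 <= encoding[1] <= 0x8F
    # The displacement is a little-endian signed 32-bit value.
    (disp,) = struct.unpack("<i", encoding[2:])
    # Relative branches are relative to the address of the next instruction.
    return addr + len(encoding) + disp


# Addresses and bytes copied from the disassembly above.
first = jcc_rel32_target(0x3, bytes.fromhex("0f8db4000000"))
second = jcc_rel32_target(0xB, bytes.fromhex("0f8dac000000"))
print(hex(first), hex(second))  # 0xbd 0xbd
```

Both branches resolve to test+0xbd, which is what makes them candidates for
sharing one long jump.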

Instead, we could have done:
       0: 39 f7                         cmpl    %esi, %edi
       2: 7d 02                         jge     2 <test+0x6>
       4: 39 ca                         cmpl    %ecx, %edx
       6: 0f 8d XX XX XX XX             jge     <target>

This sequence is 12 bytes, so it would have saved 4 bytes in this case.
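
The byte counts can be checked directly from the encodings. The following
Python sketch is only an illustration: the lists are transcribed from the two
listings, and the XX bytes of the long jump are filled with zeros as a
placeholder, since the report leaves that displacement unspecified. It also
computes where the short jge lands.

```python
# Encodings transcribed from the two listings in this report.
current = ["39f7", "0f8db4000000", "39ca", "0f8dac000000"]
# The rel32 of the final jge is elided (XX) in the report; zeros are a
# placeholder used purely to get the instruction length right.
proposed = ["39f7", "7d02", "39ca", "0f8d00000000"]


def total_bytes(encodings):
    """Sum the lengths of a list of hex-encoded instructions."""
    return sum(len(bytes.fromhex(e)) for e in encodings)


def jcc_rel8_target(addr, rel8):
    """Target of a 2-byte short jcc at addr (relative to the next instruction)."""
    return addr + 2 + rel8


current_size = total_bytes(current)
proposed_size = total_bytes(proposed)
short_target = jcc_rel8_target(0x2, 0x02)
print(current_size, proposed_size, current_size - proposed_size)  # 16 12 4
print(hex(short_target))  # 0x6
```

The short jge at offset 2 skips the second compare and lands on the long jge
at offset 6. The flags from the first compare are still live at that point, so
the long jge is taken as well.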

This transform is code size neutral if the jumps turn out to be short jumps
after all, and performance neutral for the fallthrough path to %taken.  (It
does slightly pessimise the path to %untaken: when the first compare fails,
that path now executes two taken jumps instead of one.)
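
To illustrate the size-neutrality claim, here is a rough Python model of the
two layouts for a pair of branches. It is a simplification under stated
assumptions: the function names are hypothetical, each compare is taken to be
2 bytes as in the listings, and the displacements are passed in directly
rather than being computed by a real branch relaxation pass.

```python
def jcc_size(displacement):
    """Bytes for a conditional jump: rel8 form if it fits, else rel32."""
    return 2 if -128 <= displacement <= 127 else 6


def baseline_bytes(disp_first, disp_second, cmp_bytes=2):
    # Each compare is followed by its own jcc straight to the shared target.
    return 2 * cmp_bytes + jcc_size(disp_first) + jcc_size(disp_second)


def reused_bytes(disp_second, cmp_bytes=2):
    # The first jcc only hops over the second compare, so it is always short.
    return 2 * cmp_bytes + jcc_size(cmp_bytes) + jcc_size(disp_second)


# Far target (displacements from the listing in this report): 4 bytes saved.
far_saving = baseline_bytes(0xB4, 0xAC) - reused_bytes(0xAC)
# Near target (made-up displacements that fit in rel8): no change in size.
near_saving = baseline_bytes(20, 16) - reused_bytes(16)
print(far_saving, near_saving)  # 4 0
```

In this model the transform never makes the sequence larger; it only helps
when the shared target is out of rel8 range.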


$ ../../build/bin/llc -O3 -mcpu=skylake < branch-island.ll | \
    ../../build/bin/llvm-mc -filetype=obj -triple x86_64-pc-linux-gnu - -o - \
    -mcpu=skylake | ../../build/bin/llvm-objdump -disassemble -
