Hello Andrew Wong, Grant Henke,

I'd like you to do a code review. Please visit

    http://gerrit.cloudera.org:8080/15635

to review the following change.


Change subject: rowblock: use BMI instruction set when available for GetSelectedRows
......................................................................

rowblock: use BMI instruction set when available for GetSelectedRows

This enables a BMI variant of SelectionVector::GetSelectedRows, which
has higher throughput. I disassembled the resulting hot loops as follows:

BMI:
  L:
    tzcnt  %rsi,%rbx
    or     %r11d,%ebx
    mov    %bx,(%rdx)
    blsr   %rsi,%rsi
    tzcnt  %rsi,%rbx
    or     %r11d,%ebx
    mov    %bx,0x2(%rdx)
    blsr   %rsi,%rsi
    tzcnt  %rsi,%rbx
    or     %r11d,%ebx
    mov    %bx,0x4(%rdx)
    add    $0x6,%rdx
    blsr   %rsi,%rsi
    add    $0xfffffffd,%ecx
    jne    L

non-BMI:
  L:
    bsf    %rsi,%rax
    or     %r12d,%eax
    mov    %ax,(%rdx)
    lea    -0x1(%rsi),%rax
    and    %rsi,%rax
    bsf    %rax,%rsi
    or     %r12d,%esi
    mov    %si,0x2(%rdx)
    lea    -0x1(%rax),%rbx
    and    %rax,%rbx
    bsf    %rbx,%rax
    or     %r12d,%eax
    mov    %ax,0x4(%rdx)
    add    $0x6,%rdx
    lea    -0x1(%rbx),%rsi
    and    %rbx,%rsi
    add    $0xfffffffd,%ecx
    jne    L

... and then used llvm-mca on these assembly files across a few common
architectures to see how many cycles were required for 100 iterations of
the loop. Results are as follows:

haswell non-bmi.s: Total Cycles:      606
haswell bmi.s: Total Cycles:          382

broadwell non-bmi.s: Total Cycles:    606
broadwell bmi.s: Total Cycles:        382

skylake non-bmi.s: Total Cycles:      606
skylake bmi.s: Total Cycles:          307

So, on the most recent chips (Skylake), this should be about a 2x
improvement in this function (606 vs. 307 cycles). On Haswell and
Broadwell it is about 1.6x (606 vs. 382 cycles). This function made up
a few percent of overall CPU consumption in some TSBS workloads, so this
patch gave a small but measurable improvement in end-to-end throughput.
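
For reference, the bit-extraction idiom that both loops implement can be
sketched in C++ roughly as follows. This is only an illustration, not the
code in rowblock.cc: the function name SelectedRowIndexes and the
vector-based interface are hypothetical, and __builtin_ctzll is a
GCC/Clang builtin. With BMI enabled, compilers can typically lower the
count-trailing-zeros step to tzcnt and the "clear lowest set bit" step to
blsr; without it they fall back to bsf and a lea/and pair, as in the
disassembly above.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper (not the Kudu implementation): returns the index of
// every set bit in a bitmap stored as 64-bit words, in ascending order.
// Assumes the bitmap holds at most 65536 bits so indexes fit in uint16_t.
std::vector<uint16_t> SelectedRowIndexes(const std::vector<uint64_t>& words) {
  std::vector<uint16_t> out;
  for (size_t w = 0; w < words.size(); ++w) {
    uint64_t bits = words[w];
    // Index of bit 0 of this word. It is a multiple of 64, so OR-ing in a
    // bit position below 64 is the same as adding it.
    const uint16_t base = static_cast<uint16_t>(w * 64);
    while (bits != 0) {
      // Position of the lowest set bit (tzcnt with BMI, bsf without).
      const int pos = __builtin_ctzll(bits);
      out.push_back(static_cast<uint16_t>(base | pos));
      // Clear the lowest set bit (blsr with BMI, lea + and without).
      bits &= bits - 1;
    }
  }
  return out;
}
```

In this sketch, a bitmap of {0x5, 0x8000000000000001} yields the indexes
0, 2, 64 and 127.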

Change-Id: I8ec74bc5db07c18d0e36de14a2343f49fc5c2859
---
M src/kudu/common/rowblock.cc
1 file changed, 19 insertions(+), 1 deletion(-)



  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/35/15635/1
--
To view, visit http://gerrit.cloudera.org:8080/15635
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: newchange
Gerrit-Change-Id: I8ec74bc5db07c18d0e36de14a2343f49fc5c2859
Gerrit-Change-Number: 15635
Gerrit-PatchSet: 1
Gerrit-Owner: Todd Lipcon <t...@apache.org>
Gerrit-Reviewer: Andrew Wong <andrew.w...@cloudera.com>
Gerrit-Reviewer: Grant Henke <granthe...@apache.org>
