Totally understandable – not everyone is going to jump into contributing
code generation improvements to LLVM. If you're willing, however, you can
still help LLVM (and thus indirectly Julia) by filing an issue
<http://llvm.org/bugs/> requesting support for the new instruction. In
particular, I suspect that providing examples of situations where it would
be desirable to generate the instruction would be helpful. Otherwise this
kind of thing may not be on their radar (or it might – I have no idea).

I looked at the bit manipulation instructions you're talking about – these
<http://en.wikipedia.org/wiki/Bit_Manipulation_Instruction_Sets>, right?
There's already support for some of them. For example:

julia> @code_native count_ones(1)
.section __TEXT,__text,regular,pure_instructions
Filename: int.jl
Source line: 120
push RBP
mov RBP, RSP
Source line: 120
popcnt RAX, RDI
pop RBP
ret


Note the "popcnt" instruction in the middle – the rest is function prologue
and epilogue. So that's good. On the other hand:

julia> @code_native leading_zeros(1)
.section __TEXT,__text,regular,pure_instructions
Filename: int.jl
Source line: 121
push RBP
mov RBP, RSP
mov ECX, 127
Source line: 121
bsr RAX, RDI
cmove RAX, RCX
xor RAX, 63
pop RBP
ret


The LLVM code for this same call is just this:

julia> @code_llvm leading_zeros(1)

define i64 @"julia_leading_zeros;84553"(i64) {
top:
  %1 = call i64 @llvm.ctlz.i64(i64 %0, i1 false), !dbg !481
  ret i64 %1, !dbg !481
}


So this is clearly a case of LLVM not using the lzcnt instruction even
though it seems applicable: the native code falls back to bsr followed by
a cmove and an xor. I have no idea why it doesn't (on my machine).
Similarly, if you define an "and-not" function, LLVM doesn't use the
andn instruction:

julia> andn(x,y) = ~x & y
andn (generic function with 1 method)

julia> @code_llvm andn(3,7)

define i64 @"julia_andn;84678"(i64, i64) {
top:
  %2 = xor i64 %0, -1, !dbg !846
  %3 = and i64 %1, %2, !dbg !846
  ret i64 %3, !dbg !846
}

julia> @code_native andn(3,7)
.section __TEXT,__text,regular,pure_instructions
Filename: none
Source line: 1
push RBP
mov RBP, RSP
Source line: 1
not RDI
and RDI, RSI
mov RAX, RDI
pop RBP
ret


Some of these instructions look pretty handy (I'm currently wrestling with
more efficient UTF-8 character decoding, so I'm sure you can imagine why),
and it would certainly be nice to have access to more of them.
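
As a small illustration of why bit counting matters for UTF-8, the length of an encoded sequence can be read off the number of leading one bits in its first byte. The sketch below is only that, a sketch: utf8_seq_length is a hypothetical name, it looks at the lead byte alone and does no validation of the continuation bytes, and it is not meant to describe how any actual decoder is written.

```julia
# Hypothetical helper: number of bytes in a UTF-8 sequence, judged
# from its first byte only. Returns 0 for a byte that cannot start
# a sequence.
function utf8_seq_length(lead::UInt8)
    n = leading_ones(lead)    # count of high 1 bits in the lead byte
    n == 0 && return 1        # 0xxxxxxx: ASCII, single byte
    2 <= n <= 4 && return n   # 110xxxxx, 1110xxxx, 11110xxx
    return 0                  # 10xxxxxx continuation byte or invalid lead
end

@assert utf8_seq_length(0x41) == 1   # 'A'
@assert utf8_seq_length(0xc3) == 2   # lead byte of a 2-byte sequence
@assert utf8_seq_length(0xe2) == 3   # lead byte of a 3-byte sequence
@assert utf8_seq_length(0xf0) == 4   # lead byte of a 4-byte sequence
@assert utf8_seq_length(0x80) == 0   # continuation byte
```

With a hardware leading-count instruction, the leading_ones call here would ideally compile down to a negation plus a single count instruction.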

On Sun, Nov 23, 2014 at 2:29 AM, eric l <cdg2...@gmail.com> wrote:

> First off, Thanks for the quick replies.
> As I need to stay in 3.0 for now if i understand correctly I will need to
> go the llvm assembly route.
> Any pointers or example you recommend as my llvm knowledge is on the low
> side...
>
> Again thanks,
>
> -ETL
