Totally understandable – not everyone is going to jump into contributing code generation improvements to LLVM. If you're willing, however, you can still help LLVM (and thus indirectly Julia) by filing an issue at <http://llvm.org/bugs/> requesting support for the new instruction. In particular, I suspect that providing examples of situations where it would be desirable to generate the instruction would be helpful. Otherwise this kind of thing may not be on their radar (or it might – I have no idea).
I looked at the bit manipulation instructions you're talking about – these <http://en.wikipedia.org/wiki/Bit_Manipulation_Instruction_Sets>, right? There's already support for some of them. For example:

    julia> @code_native count_ones(1)
        .section __TEXT,__text,regular,pure_instructions
    Filename: int.jl
    Source line: 120
        push RBP
        mov RBP, RSP
    Source line: 120
        popcnt RAX, RDI
        pop RBP
        ret

Note the "popcnt" instruction in the middle – the rest is function prologue and epilogue. So that's good. On the other hand:

    julia> @code_native leading_zeros(1)
        .section __TEXT,__text,regular,pure_instructions
    Filename: int.jl
    Source line: 121
        push RBP
        mov RBP, RSP
        mov ECX, 127
    Source line: 121
        bsr RAX, RDI
        cmove RAX, RCX
        xor RAX, 63
        pop RBP
        ret

The LLVM code for this same call is just this:

    julia> @code_llvm leading_zeros(1)

    define i64 @"julia_leading_zeros;84553"(i64) {
    top:
      %1 = call i64 @llvm.ctlz.i64(i64 %0, i1 false), !dbg !481
      ret i64 %1, !dbg !481
    }

So this is clearly a case of LLVM not using the lzcnt instruction even though it seems applicable. I have no idea why it doesn't (on my machine). Similarly, if you define an "and-not" function, LLVM doesn't use the andn instruction:

    julia> andn(x,y) = ~x & y
    andn (generic function with 1 method)

    julia> @code_llvm andn(3,7)

    define i64 @"julia_andn;84678"(i64, i64) {
    top:
      %2 = xor i64 %0, -1, !dbg !846
      %3 = and i64 %1, %2, !dbg !846
      ret i64 %3, !dbg !846
    }

    julia> @code_native andn(3,7)
        .section __TEXT,__text,regular,pure_instructions
    Filename: none
    Source line: 1
        push RBP
        mov RBP, RSP
    Source line: 1
        not RDI
        and RDI, RSI
        mov RAX, RDI
        pop RBP
        ret

Some of these instructions look pretty handy (I'm currently wrestling with more efficient UTF-8 character decoding, so I'm sure you can imagine why), and it would certainly be nice to have access to more of them.

On Sun, Nov 23, 2014 at 2:29 AM, eric l <cdg2...@gmail.com> wrote:

> First off, Thanks for the quick replies.
> As I need to stay in 3.0 for now if i understand correctly I will need to
> go the llvm assembly route.
> Any pointers or example you recommend as my llvm knowledge is on the low
> side...
>
> Again thanks,
>
> -ETL