I think the following code is more illustrative of the issue:

julia> function foo()
           result::Int = 0
           result
       end
foo (generic function with 1 method)

julia> code_native(foo, ())
        .section __TEXT,__text,regular,pure_instructions
Filename: none
Source line: 2
        push    RBP
        mov     RBP, RSP
        movabs  RAX, 140263599429192
Source line: 2
        mov     RAX, QWORD PTR [RAX]
Source line: 3
        pop     RBP
        ret

julia> function foo2()
           result::Int = 0
       end
foo2 (generic function with 1 method)

julia> code_native(foo2, ())
        .section __TEXT,__text,regular,pure_instructions
Filename: none
Source line: 2
        push    RBP
        mov     RBP, RSP
        xor     EAX, EAX
Source line: 2
        pop     RBP
        ret

For some reason, in the first case Julia loads the 0 from memory, whereas in the second case it simply xors EAX (instead of RAX, oddly, but I'm not sure whether that matters here).

Rak.

On Wed, Apr 9, 2014 at 6:32 PM, Rak Rok <rak...@gmail.com> wrote:

> Excellent! Thanks guys, this is a TON better!
>
> I find it a little bit strange that it fetches the value of 0 from some
> area in memory instead of just, say, xor RAX, RAX. I'm guessing it doesn't
> know that the value is actually numeric 0? Is that because zero(T) isn't
> being inlined fully, perhaps?
>
> If it could realize that zero(T) is actually 0, it would probably avoid
> the add altogether and simply do a mov into RAX.
>
> Interestingly, the code generated for Uints implies that it knows
> that zero(T) is actually just 0, and we get:
>
> julia> code_native(f, (Uint,))
>         .section __TEXT,__text,regular,pure_instructions
> Filename: none
> Source line: 3
>         push    RBP
>         mov     RBP, RSP
>         test    RDI, RDI
>         je      9
> Source line: 3
>         imul    RDI, RDI
>         jmpq    2
>         xor     EDI, EDI
> Source line: 6
>         mov     RAX, RDI
>         pop     RBP
>         ret
>
> Which is a little strange in its own way, since the xor EDI, EDI is
> unnecessary (and the jmpq 2 as well), but it is at least better in that it
> doesn't load a 0 from memory. Any clues?
>
> Thanks for your help guys,
> Rak.
>
> On Thursday, April 3, 2014 5:24:27 PM UTC-4, Matt Bauman wrote:
>>
>> Talk about a perfect storm of fixes and enhancements! With the latest
>> commit by Arch Robison, the native code is remarkably similar to clang's
>> output: just load, test, multiply, and add. It's funny to read folks
>> in the issue talking about how much faster the 0.3 version is… when the
>> final result ended up being something like > 50 million fold faster than
>> 0.2. So very impressive.
>>
>> You should complain about strange LLVM code more often, Rak.
>> Here's the native code now:
>>
>> julia> code_native(f, (Int,))
>>         .section __TEXT,__text,regular,pure_instructions
>> Filename: none
>> Source line: 2
>>         push    RBP
>>         mov     RBP, RSP
>>         movabs  RAX, 140508429342280
>> Source line: 2
>>         mov     RAX, QWORD PTR [RAX]
>>         test    RDI, RDI
>>         jle     7
>> Source line: 3
>>         imul    RDI, RDI
>>         add     RAX, RDI
>> Source line: 6
>>         pop     RBP
>>         ret
>>
>> On Wednesday, April 2, 2014 9:26:44 PM UTC-4, andrew cooke wrote:
>>>
>>> thanks! it's just possible this will fix a performance issue of mine :o)
>>>
>>> On Wednesday, 2 April 2014 16:57:36 UTC-3, Steven G. Johnson wrote:
>>>>
>>>> Just filed this as an issue:
>>>> https://github.com/JuliaLang/julia/issues/6382
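
For anyone who wants to reproduce the foo/foo2 comparison from the top of this message one level above the assembly, here is a minimal sketch that captures the LLVM IR for both functions so the two can be compared side by side. It assumes a recent Julia (1.x), where code_llvm is provided by the InteractiveUtils standard library; that is not the 0.3-era setup used in this thread, and the variable names ir_foo and ir_foo2 are just illustrative. The output differs between Julia versions, so none is shown here.

```julia
# Assumption: Julia 1.x. Outside the REPL, code_llvm and code_native
# must be brought in from the InteractiveUtils standard library.
using InteractiveUtils

# The same two functions as in the message above.
function foo()
    result::Int = 0
    result
end

function foo2()
    result::Int = 0
end

# Capture the optimized LLVM IR of each zero-argument method as a string.
# Tuple{} is the argument-type signature for a call with no arguments.
ir_foo  = sprint(io -> code_llvm(io, foo, Tuple{}))
ir_foo2 = sprint(io -> code_llvm(io, foo2, Tuple{}))

# Print both so they can be compared by eye.
println("---- foo ----")
println(ir_foo)
println("---- foo2 ----")
println(ir_foo2)
```

If the IR for foo already contains a load of the constant rather than a literal 0, the memory fetch originates before LLVM's code generation and not in the x86 backend. On the EAX versus RAX question: on x86-64, writing a 32-bit register zero-extends the result into the full 64-bit register, so xor EAX, EAX clears all of RAX and has a shorter encoding than xor RAX, RAX.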