I think the following code is more illustrative of the issue:

julia> function foo()
           result::Int = 0
           result
       end
foo (generic function with 1 method)

julia> code_native(foo, ())
    .section __TEXT,__text,regular,pure_instructions
Filename: none
Source line: 2
    push RBP
    mov RBP, RSP
    movabs RAX, 140263599429192
Source line: 2
    mov RAX, QWORD PTR [RAX]
Source line: 3
    pop RBP
    ret

julia> function foo2()
           result::Int = 0
       end
foo2 (generic function with 1 method)

julia> code_native(foo2, ())
    .section __TEXT,__text,regular,pure_instructions
Filename: none
Source line: 2
    push RBP
    mov RBP, RSP
    xor EAX, EAX
Source line: 2
    pop RBP
    ret


For some reason, in the first case Julia loads the 0 from memory, whereas in
the second case it simply xors EAX (instead of RAX, oddly, though I'm not
sure whether that matters here).
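
For anyone who wants to repeat the comparison on a newer Julia, here is a minimal sketch. It assumes a Julia 1.x install, where `code_native` and `code_llvm` come from the `InteractiveUtils` standard library and the empty argument signature is written `Tuple{}`. The variable names (`llvm_foo` and so on) are only illustrative, and the generated code will very likely differ from the 2014 output shown above.

```julia
using InteractiveUtils  # provides code_llvm / code_native outside the REPL

# The same two functions as in the transcript above.
function foo()
    result::Int = 0
    result
end

function foo2()
    result::Int = 0
end

# Capture the generated code as strings so the two versions can be compared.
# `Tuple{}` means "no arguments" (assumption: Julia 1.x calling convention).
llvm_foo    = sprint(code_llvm, foo, Tuple{})
llvm_foo2   = sprint(code_llvm, foo2, Tuple{})
native_foo  = sprint(code_native, foo, Tuple{})
native_foo2 = sprint(code_native, foo2, Tuple{})

println(llvm_foo)
println(llvm_foo2)
```

Looking at the LLVM IR first usually makes it easier to tell whether the constant is already folded before native code generation. As for EAX versus RAX: on x86-64 a write to a 32-bit register zeroes the upper 32 bits of the full register, so `xor EAX, EAX` clears all of RAX and is simply the shorter encoding.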

Rak.



On Wed, Apr 9, 2014 at 6:32 PM, Rak Rok <rak...@gmail.com> wrote:

> Excellent! Thanks guys, this is a TON better!
>
> I find it a little bit strange that it fetches the value of 0 from some
> area in memory instead of just say xor RAX, RAX. I'm guessing it doesn't
> know that the value is actually numeric 0? Is that because zero(T) isn't
> being inlined fully perhaps?
>
> If it could realize that zero(T) is actually 0, it would probably avoid
> the add altogether and simply do a mov into RAX.
>
> Interestingly, the code generated for Uints implies that it knows
> that zero(T) is actually just 0, and we get:
> julia> code_native(f, (Uint,))
>     .section __TEXT,__text,regular,pure_instructions
> Filename: none
> Source line: 3
>     push RBP
>     mov RBP, RSP
>     test RDI, RDI
>     je 9
> Source line: 3
>     imul RDI, RDI
>     jmpq 2
>     xor EDI, EDI
> Source line: 6
>     mov RAX, RDI
>     pop RBP
>     ret
>
> Which is a little strange in its own way, since the xor EDI, EDI is
> unnecessary (and the jmpq 2 as well), but is at least better in that it
> doesn't load a 0 from memory. Any clues?
>
> Thanks for your help guys,
> Rak.
>
> On Thursday, April 3, 2014 5:24:27 PM UTC-4, Matt Bauman wrote:
>>
>> Talk about a perfect storm of fixes and enhancements!  With the latest
>> commit by Arch Robison, the native code is remarkably similar to clang's
>> output — just load, test, and multiply and add.  It's funny to read folks
>> in the issue talking about how much faster the 0.3 version is… when the
>> final result ended up being something like > 50 million fold faster than
>> 0.2.  So very impressive.
>>
>> You should complain about strange LLVM code more often, Rak.  Here's the
>> native code now:
>>
>> julia> code_native(f, (Int,))
>> .section __TEXT,__text,regular,pure_instructions
>> Filename: none
>> Source line: 2
>> push RBP
>> mov RBP, RSP
>> movabs RAX, 140508429342280
>> Source line: 2
>> mov RAX, QWORD PTR [RAX]
>> test RDI, RDI
>> jle 7
>> Source line: 3
>> imul RDI, RDI
>> add RAX, RDI
>> Source line: 6
>> pop RBP
>> ret
>>
>> On Wednesday, April 2, 2014 9:26:44 PM UTC-4, andrew cooke wrote:
>>>
>>>
>>> thanks!  it's just possible this will fix a performance issue of mine :o)
>>>
>>> On Wednesday, 2 April 2014 16:57:36 UTC-3, Steven G. Johnson wrote:
>>>>
>>>> Just filed this as an issue:
>>>>     https://github.com/JuliaLang/julia/issues/6382
>>>>
