Yeah, that’s what I figured; I don’t even need the sin():

julia> f() = 42
f (generic function with 1 method)

julia> @time f()
  0.001347 seconds (141 allocations: 10.266 KB)
42

julia> @time f()
  0.000002 seconds (4 allocations: 160 bytes)
42

julia> @time f()
  0.000002 seconds (4 allocations: 160 bytes)
42

Is @time counting stack allocations as well? Otherwise I don’t see why any heap
allocation would be needed.
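
To separate what f() itself allocates from whatever the REPL and @time add on
top, I was thinking of measuring from inside compiled code, e.g. with
@allocated. Just a sketch (the helper name is mine):

f() = 42

measure() = @allocated f()   # bytes allocated while evaluating f(),
                             # measured from inside a compiled function

measure()   # first call also compiles measure() and f(), so ignore its result
measure()   # after warm-up I'd expect this to report 0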


> On Sep 2, 2016, at 7:41 AM, Mauro <mauro...@runbox.com> wrote:
> 
> On Fri, 2016-09-02 at 13:34, Jong Wook Kim <jongw...@nyu.edu> wrote:
>> Hi Yichao, what a nice idea :)
>> 
>> But even if I write it in the C++ way, @time sqrt(1) yields 5 allocations of
>> 176 bytes, and in inner loops this could be a bottleneck.
> 
> Those are just allocations for the return value of sqrt.  Consider:
> 
> julia> function f(n)
>            out = 0.0
>            for i = 1:n
>                out += sin(n)
>            end
>            out
>        end
> f (generic function with 1 method)
> 
> julia> @time f(10) # warmup
>  0.000008 seconds (149 allocations: 10.167 KB)
> -5.440211108893696
> 
> julia> @time f(10)
>  0.000005 seconds (5 allocations: 176 bytes)
> -5.440211108893696
> 
> julia> @time f(10000)
>  0.000849 seconds (5 allocations: 176 bytes)
> -3056.143888882987
> 
> 
>> Is this an inevitable overhead of using ccall, or is it just a bogus number
>> that I can ignore?
>> 
>> Jong Wook
>> 
>> 
>>    On Sep 2, 2016, at 7:14 AM, Yichao Yu <yyc1...@gmail.com> wrote:
>> 
>> 
>> 
>>    On Fri, Sep 2, 2016 at 7:03 AM, Jong Wook Kim <ilike...@gmail.com> wrote:
>> 
>>        Hi,
>> 
>>        I'm using Julia 0.4.6 on OS X El Capitan, and I was trying to
>>        normalize each column of a matrix so that the norm of each column
>>        becomes 1. Below is an isolated and simplified version of what I'm
>>        doing:
>> 
>>        function foo1()
>>            local a = rand(1000, 10000)
>>            @time for i in 1:size(a, 2)
>>                a[:, i] /= norm(a[:, i])
>>            end
>>        end
>> 
>>        foo1()
>>        0.165662 seconds (117.44 k allocations: 232.505 MB, 37.08% gc time)
>> 
>>        I thought maybe the array copying was the problem, but this didn't
>>        help much:
>> 
>>        function foo2()
>>            local a = rand(1000, 10000)
>>            @time for i in 1:size(a, 2)
>>                a[:, i] /= norm(slice(a, :, i))
>>            end
>>        end
>> 
>>        foo2()
>>        0.131377 seconds (98.47 k allocations: 155.921 MB, 36.66% gc time)
>> 
>>        and then I figured that this ugly one runs the fastest:
>> 
>>        function foo3()
>>            local a = rand(1000, 10000)
>>            @time for i in 1:size(a, 2)
>>                setindex!(a, norm(slice(a, :, i)), :, i)
>>            end
>>        end
>> 
>>        foo3()
>>        0.013814 seconds (49.49 k allocations: 1.365 MB, 4.86% gc time)
>> 
>>        I've heard a few times that plain for-loops are faster than
>>        vectorized code in Julia, and this version does allocate slightly
>>        less memory, but it's slower than the one above.
>> 
>>        function foo4()
>>            local a = rand(1000, 10000)
>>            @time @inbounds for i in 1:size(a, 2)
>>                n = norm(slice(a, :, i))
>>                @inbounds for j in 1:size(a, 1)
>>                    a[j, i] /= n
>>                end
>>            end
>>        end
>> 
>>        foo4()
>>        0.055522 seconds (30.00 k allocations: 1.068 MB, 15.14% gc time)
>> 
>>        Is there a solution that is faster and less ugly than foo3() and
>>        foo4()?
>> 
>>        Thinking of an equivalent implementation in C/C++, I should be able to
>>        write this logic without any heap allocation. Is it possible in Julia?
>> 
>> 
>>    You can write it the way you'd write it in C++ and just not use `norm`.
>> 
>> 
>> 
>>        Thanks,
>>        Jong Wook
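
Coming back to Yichao's suggestion above, the norm-free version I have in mind
is roughly the following. This is just a sketch (the function name is mine) and
I haven't benchmarked it carefully:

function normalize_columns!(a)
    # Divide each column of `a` by its Euclidean norm, in place, using plain
    # loops instead of slices and norm() so no temporaries are created.
    m, n = size(a)
    @inbounds for j in 1:n
        s = 0.0
        for i in 1:m
            s += a[i, j] * a[i, j]
        end
        invnrm = 1.0 / sqrt(s)
        for i in 1:m
            a[i, j] *= invnrm
        end
    end
    return a
end

a = rand(1000, 10000)
normalize_columns!(a)          # warm-up, so compilation isn't timed
@time normalize_columns!(a)    # the loops themselves shouldn't allocate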
