Hi Kristoffer,
Thank you for your reply.
I am used to R, so I didn't realize that the norm(x[:,i] - x[:,j]) part 
was the problem.
To get a speed boost, I added another for loop that computes the norm 
manually, which avoids the memory allocation, and it works very well.
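
A minimal sketch of what such a change might look like, assuming a Float64 matrix whose rows are the observations. The name `dist_noalloc` is hypothetical, and this is not necessarily the exact code used here. It indexes the rows directly, so neither the transposed copy nor the temporary slices are created:

```julia
# Hypothetical sketch: Euclidean distances between all pairs of rows of x,
# keeping only the strict upper triangle (i < j) in a flat vector.
function dist_noalloc(x::Matrix{Float64})
    n, p = size(x)                    # n observations (rows), p coordinates
    d = zeros(div(n * (n - 1), 2))    # one entry per pair (i, j) with i < j
    ij = 0
    for i in 1:n
        for j in (i+1):n
            s = 0.0
            # Accumulate the squared differences element by element
            # instead of building x[i,:] - x[j,:] as a temporary array.
            for k in 1:p
                diff = x[i, k] - x[j, k]
                s += diff * diff
            end
            ij += 1
            d[ij] = sqrt(s)
        end
    end
    return d
end
```

With this shape, the only allocation left should be the result vector d itself.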
Thank you again for your help!


On Wednesday, September 7, 2016 at 6:51:08 AM UTC-5, Kristoffer Carlsson wrote:
> The code in Distances.jl is quite heavily optimized and uses BLAS calls 
> when possible (which it is for the Euclidean metric). Your code has many 
> allocations, such as x = x' and norm(x[:,i] - x[:,j]).
> On Wednesday, September 7, 2016 at 1:43:11 PM UTC+2, Weicheng Zhu wrote:
>> Hi there,
>> I wrote a function to calculate the pairwise distances between the rows of 
>> a two-dimensional array and compared it with the `pairwise` function in 
>> the Distances module.
>> Can anyone help me find out why my function is slower than `pairwise`? I 
>> keep only the upper-triangle elements of the distance matrix, which I 
>> thought would be faster. Thanks in advance for any help :)
>> Here is the code:
>> module Tmp
>> import DataFrames: DataFrame
>> function dist(x::Matrix)
>>     x = x'
>>     n = size(x, 2)
>>     ij::UInt = 0
>>     d = zeros(convert(Int, (n-1)*n/2))
>>     for i in 1:n
>>         for j in (i+1):n
>>             ij += 1
>>             d[ij] = norm(x[:,i] - x[:,j])
>>         end
>>     end
>>     return d
>> end
>> function dist(x::DataFrame)
>>     dist(convert(Array, x))
>> end
>> export dist
>> end
>> using Tmp
>> using Distances
>> x = rand(100,2)
>> @time dist(x)
>> # 0.001581 seconds (29.71 k allocations: 1.399 MB)
>> @time pairwise(Euclidean(), x')
>> # 0.000318 seconds (310 allocations: 91.984 KB)
