I ran into a performance problem in my application: Distances.Jaccard is
very slow compared with Distances.Euclidean.

For example, with two Float64 vectors of length 11520, I get the following:

julia> D=Euclidean()
Distances.Euclidean()
julia> @time for i in 1:500
       evaluate(D,v1,v2)
       end
  0.002553 seconds (500 allocations: 7.813 KB)

and with Jaccard:

julia> D=Jaccard()
Distances.Jaccard()
julia> @time for i in 1:500
       evaluate(D,v1,v2)
       end
  1.995046 seconds (40.32 M allocations: 703.156 MB, 9.68% gc time)

With a simple loop that computes the Jaccard distance directly:


julia> function myjaccard2(a::Array{Float64,1}, b::Array{Float64,1})
           num = 0
           den = 0
           for i in 1:length(a)
               num = num + min(a[i], b[i])
               den = den + max(a[i], b[i])
           end
           1.0 - num/den
       end
myjaccard2 (generic function with 1 method)

julia> @time for i in 1:500
              myjaccard2(v1,v2)
              end
  0.451582 seconds (23.04 M allocations: 351.592 MB, 20.04% gc time)

I do not see what the problem is in the Jaccard distance implementation in
the Distances package.
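
One thing that may be worth checking, offered as a guess rather than a confirmed diagnosis: in the hand-written loop, num and den start as the integer 0 and become Float64 inside the loop, so their type changes during execution. That kind of type instability commonly causes an allocation on every iteration, which would be consistent with the millions of allocations reported. Below is a minimal sketch of a type-stable variant. The name myjaccard_stable is hypothetical, the code assumes a recent Julia (1.x syntax), and it says nothing about how Distances.Jaccard is actually implemented.

```julia
# Hypothetical type-stable variant of the hand-written loop (illustrative name).
# Assumes non-negative inputs, as in the original min/max formulation.
function myjaccard_stable(a::AbstractVector{T}, b::AbstractVector{T}) where {T<:AbstractFloat}
    length(a) == length(b) || throw(DimensionMismatch("a and b must have the same length"))
    num = zero(T)   # accumulators start with the element type instead of Int
    den = zero(T)
    @inbounds for i in eachindex(a, b)
        num += min(a[i], b[i])
        den += max(a[i], b[i])
    end
    # Note: yields NaN when den == 0 (e.g. both vectors all zero).
    return one(T) - num / den
end

v1 = rand(11520)
v2 = rand(11520)
d = myjaccard_stable(v1, v2)
```

If this guess is right, timing the stable version should show far fewer allocations. Running the timing loop inside a function, and after a first warm-up call, keeps compilation time and global-variable overhead out of the measurement.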
