I have two versions of an example function that calculates a number by looping over all pairs of points. In the first one I use a 2d-array and access points with the [:,i] syntax to get the coordinates. In the second version of the function I instead create an array of Point types (each Point has an x and a y coordinate). I then access the coordinates like point.x, point.y etc.
These two functions differ vastly in run time and memory usage. This is the first function:

    function slow()
        srand(1234)
        points = randn(2, 5000)
        n_points::Int = size(points, 2)
        cum = 0.0
        for i in 1:n_points
            for j in (i+1):n_points
                point_2 = points[:, j]
                cum += point_2[1]
            end
        end
        return cum
    end

This is the fast version with the Point types:

    immutable Point
        x::Float64
        y::Float64
    end

    function fast()
        srand(1234)
        points = randn(2, 5000)
        n_points = size(points, 2)
        cum = 0.0

        # Create array of points
        points_vec = Point[]
        for i in 1:n_points
            push!(points_vec, Point(points[1, i], points[2, i]))
        end

        for i in 1:n_points
            for j in (i+1):n_points
                point_2 = points_vec[j]
                cum += point_2.x
            end
        end
        return cum
    end

Running

    @time println(slow())
    @time println(fast())

now gives:

    -23952.535945302105
    elapsed time: 0.954317047 seconds (1055 MB allocated, 3.78% gc time in 48 pauses with 0 full sweep)
    -23952.535945302105
    elapsed time: 0.025171914 seconds (1 MB allocated)

The slow version takes about 40 times longer and allocates about 1000x the memory.
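As a rough sanity check on these numbers, here is a small sketch of the arithmetic. It is not part of the measurements above. The inner loop body runs once per pair (i, j) with j > i, so dividing the reported allocation by the number of pairs gives an estimate of the bytes allocated per iteration. The names n_iters, allocated_bytes and bytes_per_iter are only for illustration, and I am assuming that the "MB" printed by @time means 2^20 bytes.

```julia
n_points = 5000

# Number of (i, j) pairs with j > i, i.e. how often the inner loop body runs
n_iters = div(n_points * (n_points - 1), 2)

# "1055 MB allocated" from the @time output above (assumption: 1 MB = 2^20 bytes)
allocated_bytes = 1055 * 2^20

# Estimated allocation per inner-loop iteration in slow()
bytes_per_iter = allocated_bytes / n_iters

println(n_iters)
println(bytes_per_iter)
```

This comes out at somewhere between 85 and 90 bytes per iteration, which would be consistent with a small new array being allocated every time through the inner loop.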
Running the functions with the memory tracker gives (bytes allocated on the left, source line on the right):

             - 
             - 
             - 
             - function slow()
         28688     srand(1234)
         80048     points = randn(2, 5000)
             0     n_points::Int = size(points,2)
             0     cum = 0.0
             0     for i in 1:n_points
             0         for j in (i+1):n_points
    1099780000             point_2 = points[:, j]
             0             cum += point_2[1]
             -         end
             -     end
             0     return cum
             - end
             - 
             - 
             - 
             - immutable Point
             -     x::Float64
             -     y::Float64
             - end
             - 
             - function fast()
       2540964     srand(1234)
         80048     points = randn(2, 5000)
             0     n_points = size(points, 2)
             0     cum= 0.0
             - 
             -     # Create array of points
            48     points_vec = Point[]
             0     for i in 1:n_points
        263112         push!(points_vec, Point( points [1,i], points [2,i]))
             -     end
             - 
             0     for i in 1:n_points
             0         for j in (i+1):n_points
             0             point_2 = points_vec[j]
             0             cum += point_2.x
             -         end
             -     end
             0     return cum
             - end
             - 
             - 
             - 
             - @time println(slow())
             - @time println(fast())
             - 
             - 

So what seems to take all the memory is

    point_2 = points[:, j]

Maybe some copying is performed when slicing, but I have tried replacing it with sub and slice etc. (those shouldn't copy?) and it just gets worse. Are there some alignment issues?

I have tried both in 0.3.5 and 0.4 with the same results.

Any help?

Best regards,
Kristoffer Carlsson