I have an n-by-m dense matrix in which each row is a vector representing a time-varying series such as a stock price, and I'd like to find the two vectors with the highest similarity using cor(). To do that, I used a nested for-loop to calculate the similarity between each pair of rows and store it in an n-by-n adjacency matrix.
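
A minimal sketch of that pairwise search, assuming a current Julia (1.x) where `cor` comes from the `Statistics` standard library and `view` is available in Base. The helper name `most_similar_rows` is hypothetical, and the code only tracks the best pair instead of filling the whole adjacency matrix:

```julia
using Statistics  # standard library in Julia 1.x; provides cor

# Hypothetical helper: return (i, j, c) for the pair of distinct rows
# with the highest Pearson correlation c.
function most_similar_rows(A::AbstractMatrix)
    n = size(A, 1)
    n >= 2 || throw(ArgumentError("need at least two rows"))
    best, bi, bj = -Inf, 0, 0
    for i in 1:n-1
        ri = view(A, i, :)              # view of row i, no copy
        for j in i+1:n                  # cor is symmetric, so only j > i
            c = cor(ri, view(A, j, :))
            if c > best                 # NaN (constant row) never wins
                best, bi, bj = c, i, j
            end
        end
    end
    return bi, bj, best
end
```

If the full n-by-n matrix is wanted anyway, `cor(A; dims=2)` should give all row-to-row correlations in a single call, which avoids the explicit loop altogether.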

On Fri Nov 28 2014 at 8:49:51 PM Milan Bouchet-Valat <nalimi...@club.fr> wrote:

> On Friday, 28 November 2014 at 10:21 +0000, SLiZn Liu wrote:
> > I'm doing row-wise/col-wise calculations; isn't it inevitable to create
> > row/col copies after iteratively extracting single elements?
> No, I don't think so, though sometimes you'll want to extract a full
> row/column to pass it to a standard function instead of writing all of
> the computations by hand. That's where array views are very useful.
>
> But can you give more details about the calculation you need to do?
>
> Regards
>
> > I will consider taking a shot at option 1, ArrayViews, if this
> > single-element extraction comes to a dead end. Thanks, Milan!
> >
> > On Fri Nov 28 2014 at 6:00:07 PM Milan Bouchet-Valat
> > <nalimi...@club.fr> wrote:
> > On Friday, 28 November 2014 at 01:45 -0800, Todd Leo wrote:
> > > Hi Fellows,
> > >
> > > Say I have a 1000 x 1000 matrix, and I'm going to do some calculation
> > > in a nested for-loop, with each pair of rows/cols in the matrix. But I
> > > suffered a heavy performance penalty in row/col extraction. Here's my
> > > minimum reproducible example, which I hope explains itself.
> > >
> > > A = rand(0.:0.01:1.,1000,1000)
> > >
> > > function test(x)
> > >     for i in 1:1000, j in 1:1000
> > >         x[:,i]
> > >         x[:,j]
> > >     end
> > > end
> > >
> > > test(A) # warm up
> > > gc()
> > > @time test(A)
> > > ## elapsed time: 13.28547939 seconds (16208000080 bytes allocated, 72.42% gc time)
> > >
> > > It takes 13 seconds, only extracting the rows/cols for the sake of
> > > further calculations. I'm wondering if there is anything I could do to
> > > improve the performance. Thanks in advance.
> > This is because extracting a row/column creates a copy. Depending on
> > what calculation you want to do on them, you can:
> > - use array views (which will become the default when extracting slices
> >   in 0.4): https://github.com/JuliaLang/ArrayViews.jl
> > - manually write loops to go over the row and column so that you only
> >   extract one individual element of the matrix at a time
> >
> > Regards
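
The two options suggested above can be sketched as follows. This assumes a current Julia (1.x), where `view` is part of Base and plain slices such as `x[:, i]` still copy; in 2014 the same idea needed ArrayViews.jl. The function names are hypothetical, and the dot product is only a stand-in for whatever per-pair calculation is actually needed:

```julia
using LinearAlgebra  # standard library; provides dot

# Option 1: views. view(x, :, i) refers to column i without copying it.
function sum_col_dots_views(x::AbstractMatrix)
    total = zero(eltype(x))
    for i in axes(x, 2), j in axes(x, 2)
        total += dot(view(x, :, i), view(x, :, j))
    end
    return total
end

# Option 2: manual loops. Only single elements are read, so no slice
# (and therefore no temporary array) is ever created.
function sum_col_dots_loops(x::AbstractMatrix)
    total = zero(eltype(x))
    for i in axes(x, 2), j in axes(x, 2)
        s = zero(eltype(x))
        for k in axes(x, 1)             # walk down both columns together
            s += x[k, i] * x[k, j]
        end
        total += s
    end
    return total
end
```

Both functions compute the same quantity; running each under `@time` on the 1000 x 1000 matrix from the example is one way to check how much of the original cost came from copying.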