Dear Duncan Thanks for your suggestion, but I really need sparse matrices: I have implemented various graph algorithms based on adjacency matrices. For large graphs, storing all the 0's in an adjacency matrices become "uneconomical", and therefore I thought I would use sparse matrices but the speed of "[i,j]" will slow down the algorithms. However, using RcppEigen it is possible to mimic "[i,j]" with a slowdown of "only" a factor 16 which is much better than what is obtained when using "[i,j]":
> benchmark(lookup(mm,`[`), lookup(MM,`[`), lookup(MM, Xiijj), + columns=c("test", "replications", "elapsed", "relative"), replications=5) test replications elapsed relative 1 lookup(mm, `[`) 5 0.05 1.0 2 lookup(MM, `[`) 5 23.54 470.8 3 lookup(MM, Xiijj) 5 0.84 16.8 The code for producing the result is given below. Best regards, Søren --------------------- library(inline) library(RcppEigen) library(rbenchmark) library(Matrix) src <- ' using namespace Rcpp; typedef Eigen::SparseMatrix<double> MSpMat; const MSpMat X(as<MSpMat>(XX_)); int i = as<int>(ii_)-1; int j = as<int>(jj_)-1; double ans = X.coeff(i,j); return(wrap(ans)); ' Xiijj <- cxxfunction(signature(XX_="matrix", ii_="integer", jj_="integer"), body=src, plugin="RcppEigen") mm <- matrix(c(1,0,0,0,0,0,0,0), nr=100, nc=100) MM <- as(mm, "Matrix") object.size(mm) object.size(MM) lookup <- function(mat, func){ for (i in 1:nrow(mat)){ for (j in 1:ncol(mat)){ v<-func(mat,i,j) } } } benchmark(lookup(mm,`[`), lookup(MM,`[`), lookup(MM, Xiijj), columns=c("test", "replications", "elapsed", "relative"), replications=5) -------------------------------- -----Original Message----- From: Duncan Murdoch [mailto:murdoch.dun...@gmail.com] Sent: 25. juni 2012 11:27 To: Søren Højsgaard Cc: r-help@r-project.org Subject: Re: [R] Indexing matrices from the Matrix package with [i, j] seems to be very slow. Are there "faster alternatives"? On 12-06-24 4:50 PM, Søren Højsgaard wrote: > Dear all, > > Indexing matrices from the Matrix package with [i,j] seems to be very slow. > For example: > > library(rbenchmark) > library(Matrix) > mm<- matrix(c(1,0,0,0,0,0,0,0), nr=20, nc=20) > MM<- as(mm, "Matrix") > lookup<- function(mat){ > for (i in 1:nrow(mat)){ > for (j in 1:ncol(mat)){ > mat[i,j] > } > } > } > > benchmark(lookup(mm), lookup(MM), columns=c("test", "replications", > "elapsed", "relative"), replications=50) > test replications elapsed relative > 1 lookup(mm) 50 0.01 1 > 2 lookup(MM) 50 8.77 877 > > I would have expected a small overhead when indexing a matrix from the Matrix > package, but this result is really surprising... > Does anybody know if there are faster alternatives to [i,j] ? There's also a large overhead when indexing a dataframe, though Matrix appears to be slower. It's designed to work on whole matrices at a time, not single entries. So I'd suggest that if you need to use [i,j] indexing, then try to arrange your code to localize the access, and extract a submatrix as a regular fast matrix first. (Or if it will fit in memory, convert the whole thing to a matrix just for the access. If I just add the line mat <- as.matrix(mat) at the start of your lookup function, it becomes several hundred times faster.) ______________________________________________ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.