On Tue, Sep 4, 2012 at 10:58 AM, Martin Maechler <maech...@stat.math.ethz.ch> wrote:
>>>>>> Jennifer Lyon <jennifer.s.l...@gmail.com>
>>>>>> on Fri, 31 Aug 2012 17:22:57 -0600 writes:
>
> > Hi:
> > I was trying to use apply on a sparse matrix from package Matrix,
> > and I get the error:
>
> > Error in asMethod(object) :
> > Cholmod error 'problem too large' at file ../Core/cholmod_dense.c, line 106
>
> > Is there a way to apply a function to all the rows without bumping
> > into this problem?
>
> > Here is a simplified example:
>
> >> dim(sm)
> > [1] 72913 43052
>
> >> class(sm)
> > [1] "dgCMatrix"
> > attr(,"package")
> > [1] "Matrix"
>
> >> str(sm)
> > Formal class 'dgCMatrix' [package "Matrix"] with 6 slots
> > ..@ i : int [1:6590004] 789 801 802 1231 1236 11739 17817 17943 18148 18676 ...
> > ..@ p : int [1:43053] 0 147 303 450 596 751 908 1053 1188 1347 ...
> > ..@ Dim : int [1:2] 72913 43052
> > ..@ Dimnames:List of 2
> > .. ..$ : NULL
> > .. ..$ : NULL
> > ..@ x : num [1:6590004] 0.601 0.527 0.562 0.641 0.684 ...
> > ..@ factors : list()
>
> >> my.sum<-apply(sm, 1, sum)
> > Error in asMethod(object) :
> > Cholmod error 'problem too large' at file ../Core/cholmod_dense.c, line 106
>
> So, actually it would have worked (though not efficiently) if
> your sm matrix would have been much smaller.
>
> However, we provide rowSums(), rowMeans(), colSums(), colMeans()
> for all of our matrices, including the sparse ones.
>
> So your present problem can be solved using
>
> my.sum <- rowSums(sm)
>
> Best regards,
> Martin Maechler, ETH Zurich
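For anyone wondering why the matrix size matters here: apply() first coerces its argument with as.matrix(), so the sparse matrix has to be expanded to a dense one before any row is touched. The arithmetic below is only a back-of-the-envelope sketch using the dimensions reported by dim(sm); the variable names are made up for illustration, and the exact limit that CHOLMOD checks is an assumption on my part (the figures just show the dense form exceeds the 2^31 - 1 element limit of R 2.15.x vectors).

```r
# Dimensions taken from the dim(sm) output quoted above.
n_row <- 72913
n_col <- 43052

# Number of cells a dense copy would need (computed in double
# arithmetic, so there is no integer overflow here).
n_cells <- n_row * n_col

# Compare against the largest R integer, 2^31 - 1.
too_many_cells <- n_cells > .Machine$integer.max

# Approximate size of a dense double matrix, in GiB (8 bytes per cell).
dense_gib <- n_cells * 8 / 2^30

print(n_cells)
print(too_many_cells)
print(round(dense_gib, 1))
```

The dense copy would need about 3.1 billion cells (roughly 23 GiB of doubles), compared with the 6,590,004 stored nonzeros, which is why anything that avoids the dense conversion is preferable.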
Thank you for letting me know about rowSums(). Two points.

First, sadly, I was unclear in my posting, and using "sum" was just an
example. In the real case I am using my own function on each row. I guess
the answer for this problem is that iteration is my friend. Good to know.

Second, since I'm embarrassed to say I hadn't remembered rowSums(), for
cases when I needed the sum of the rows, I had just been postmultiplying
by a vector of 1's. Just FYI, I thought I should try rowSums(), so did a
small timing trial, and it appears postmultiplying is faster than rowSums.
Run is as follows:

> str(sm)
Formal class 'dgCMatrix' [package "Matrix"] with 6 slots
..@ i : int [1:6590004] 721 926 1275 1791 2370 2755 3393 4638 5363 5566 ...
..@ p : int [1:43053] 0 147 303 450 596 751 908 1053 1188 1347 ...
..@ Dim : int [1:2] 72913 43052
..@ Dimnames:List of 2
.. ..$ : NULL
.. ..$ : NULL
..@ x : num [1:6590004] 0.0735 0.3206 0.1861 0.1604 0.197 ...
..@ factors : list()

> library(rbenchmark)

#Just checking how expensive building a vector of 1's is - not very
#at least for matrix of the size I'm interested in
> benchmark(i1<-rep(1, ncol(sm)))
                    test replications elapsed relative user.self sys.self user.child sys.child
1 i1 <- rep(1, ncol(sm))          100   0.119        1      0.12        0          0         0

#Postmultiplying by 1's timing
> benchmark(la<-sm %*% i1)
             test replications elapsed relative user.self sys.self user.child sys.child
1 la <- sm %*% i1          100   5.993        1     5.993        0          0         0

#rowSums timing
> benchmark(la1<-rowSums(sm))
                 test replications elapsed relative user.self sys.self user.child sys.child
1 la1 <- rowSums(sm)          100  28.117        1    28.114    0.004          0         0

#Make sure the results are the same
> all(la==la1)
[1] TRUE

The Matrix package is awesome, and I appreciate you taking the time to
answer my questions.
Jen

> sessionInfo()
R version 2.15.1 (2012-06-22)
Platform: x86_64-unknown-linux-gnu (64-bit)

locale:
 [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C
 [3] LC_TIME=en_US.UTF-8        LC_COLLATE=en_US.UTF-8
 [5] LC_MONETARY=en_US.UTF-8    LC_MESSAGES=en_US.UTF-8
 [7] LC_PAPER=C                 LC_NAME=C
 [9] LC_ADDRESS=C               LC_TELEPHONE=C
[11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C

attached base packages:
[1] stats graphics grDevices utils datasets methods base

other attached packages:
[1] rbenchmark_0.3.1 Matrix_1.0-6 lattice_0.20-6

loaded via a namespace (and not attached):
[1] grid_2.15.1
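On the "iteration is my friend" point: one way to run an arbitrary function over the rows without ever building the full dense matrix is to densify only a block of rows at a time. The following is a minimal sketch, not anything provided by the Matrix package: apply_rows_chunked, my_fun and sm_small are hypothetical names, the helper assumes the function returns a single number per row, and the small random test matrix (built with rsparsematrix(), which needs a newer Matrix than the 1.0-6 shown above) merely stands in for the real data.

```r
library(Matrix)

# Hypothetical helper: apply f to every row of a sparse matrix m,
# converting only chunk_size rows to dense form at any one time.
# Assumes f returns one numeric value per row.
apply_rows_chunked <- function(m, f, chunk_size = 1000L) {
  n <- nrow(m)
  if (n == 0L) return(numeric(0))
  out <- numeric(n)
  for (start in seq(1L, n, by = chunk_size)) {
    idx <- start:min(start + chunk_size - 1L, n)
    # Only this block of rows is expanded to a dense matrix.
    dense <- as.matrix(m[idx, , drop = FALSE])
    out[idx] <- apply(dense, 1, f)
  }
  out
}

# Small stand-in for the real matrix, just to exercise the helper.
set.seed(1)
sm_small <- rsparsematrix(50, 20, density = 0.1)

# Placeholder for "my own function on each row".
my_fun <- function(x) sum(x^2)

res <- apply_rows_chunked(sm_small, my_fun, chunk_size = 7L)

# Cross-check against a computation that stays sparse throughout.
print(all.equal(res, rowSums(sm_small^2)))
```

With 43052 columns, a block of 1000 rows is roughly 0.3 GB of doubles, so chunk_size is the knob that trades memory for fewer passes. Row slicing of a column-compressed matrix is not especially fast, so this is meant as a workable fallback rather than an optimized solution.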