It's a race! I went ahead and timed everyone's suggestions, and all of the methods (except mine) run at about the same speed. Over a few runs, Gabor's application of melt() tended to be a tad faster than the other two, although that is far from conclusive -- do these methods share code in common? Should I expect one of these to have a smaller memory footprint during processing than the others? Thanks again!
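Before comparing speed or footprint, it helps to confirm that the approaches return the same table. The following is a minimal sketch under stated assumptions, not a benchmark: it uses a small 6 x 10 matrix so it runs instantly, sticks to base R (so melt() is left out), and the names db_grid, db_tab, same_values and sizes are made up for this illustration. object.size() only reports the size of the finished result, so it says nothing about peak memory used during the call.

```r
# Small example matrix; the ids double as dimnames (assumed sizes, for illustration).
id_m <- seq(10, 60, by = 10)
id_n <- seq(100, 1000, by = 100)
my_matrix <- matrix(1:60, nrow = 6, ncol = 10, dimnames = list(id_m, id_n))

# Bill's expand.grid approach: the id columns stay numeric.
db_grid <- cbind(
    expand.grid(id_m = id_m, id_n = id_n),
    mat = as.vector(my_matrix)
)

# Jorge's as.data.frame.table approach: the ids come back as factors
# in columns Var1 and Var2, and the values in Freq.
db_tab <- as.data.frame.table(my_matrix)

# Both walk the matrix in column-major order, so the rows should line up
# once the factor labels are converted back to numbers.
same_values <- all(
    db_grid$id_m == as.numeric(as.character(db_tab$Var1)),
    db_grid$id_n == as.numeric(as.character(db_tab$Var2)),
    db_grid$mat == db_tab$Freq
)

# Size in bytes of each finished result (not peak memory during the call).
sizes <- c(
    grid = as.numeric(object.size(db_grid)),
    table = as.numeric(object.size(db_tab))
)
print(same_values)
print(sizes)
```

The factor-versus-numeric difference in the id columns is worth keeping in mind when the result is loaded into a database, since the column types will differ between the two approaches.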
--j

# My terrible approach:
my_matrix=matrix(c(1:60),nrow=600,ncol=100)
id_m=seq(10,6000,by=10)
id_n=seq(100,10000,by=100)
system.time(
for (a in 1:length(id_m))
{
    for (b in 1:length(id_n))
    {
        if ((a==1) && (b==1))
        {
            my_database=c(id_m[a],id_n[b],my_matrix[a,b])
        } else
        {
            my_database=rbind(my_database,c(id_m[a],id_n[b],my_matrix[a,b]))
        }
    }
}
)
   user  system elapsed
173.601  10.288 202.433

# Gabor's method with reshape
library(reshape)
my_matrix = matrix(c(1:60),nrow=600,ncol=100,dimnames=list(seq(10,6000,by=10),seq(100,10000,by=100)))
system.time(
my_database <- melt(my_matrix)
)
   user  system elapsed
  0.006   0.006   0.014

# Jorge's method with as.data.frame.table
my_matrix = matrix(c(1:60),nrow=600,ncol=100,dimnames=list(seq(10,6000,by=10),seq(100,10000,by=100)))
system.time(
my_database <- as.data.frame.table(my_matrix)
)
   user  system elapsed
  0.027   0.005   0.036

# Bill's method with expand.grid
my_matrix=matrix(c(1:60),nrow=600,ncol=100)
id_m=seq(10,6000,by=10)
id_n=seq(100,10000,by=100)
system.time(
my_database <- cbind(
    expand.grid(id_m = id_m, id_n = id_n),
    mat = as.vector(my_matrix)
)
)
   user  system elapsed
  0.007   0.006   0.020

On Mon, Jun 7, 2010 at 9:30 PM, <bill.venab...@csiro.au> wrote:
> I think what you are groping for is something like this
>
> my_matrix <- matrix(1:60, nrow = 6)
> id_a <- seq(10,60,by=10)
> id_b <- seq(100,1000,by=100)
> my_database <- cbind(
>     expand.grid(id_a = id_a, id_b = id_b),
>     mat = as.vector(my_matrix)
> )
>
>
> -----Original Message-----
> From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On Behalf Of Jonathan Greenberg
> Sent: Tuesday, 8 June 2010 12:34 PM
> To: r-help
> Subject: [R] Matrix to "database" -- best practices/efficiency?
>
> I have a matrix of, say, M and N dimensions:
>
> my_matrix=matrix(c(1:60),nrow=6,ncol=10)
>
> I have two "id" vectors corresponding to the rows and columns, e.g.:
>
> id_m=seq(10,60,by=10)
> id_n=seq(100,1000,by=100)
>
> I would like to create a "proper" database (let's say a data.frame for
> this example -- i'm going to be loading these into an SQLite database,
> but we'll leave that complication out of this discussion for now) of m
> x n rows, and 3 columns, where the 3 columns relate to the values from
> m, n, and my_matrix respectively, e.g. a single row follows the form:
>
> c(id_m[a],id_n[b],my_matrix[a,b])
>
> I can, of course, for-loop this thing with an if-then, e.g.:
>
> ***
>
> for (a in 1:length(id_m))
> {
>     for (b in 1:length(id_n))
>     {
>         if ((a==1) && (b==1))
>         {
>             my_database=c(id_m[a],id_n[b],my_matrix[a,b])
>         } else
>         {
>             my_database=rbind(my_database,c(id_m[a],id_n[b],my_matrix[a,b]))
>         }
>     }
> }
>
> ***
>
> But my gut is telling me this is an incredibly inefficient way of
> doing this -- is there a faster approach to doing this same process?
> Thanks!
>
> --j
>
> ______________________________________________
> R-help@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.