On Thu, May 14, 2009 at 6:21 PM, Ping-Hsun Hsieh <hsi...@ohsu.edu> wrote:
> Hi All,
>
> I have a 1000x1000000 matrix.
> The calculation I would like to do is actually very simple: for each row,
> calculate the frequency of a given pattern. For example, a toy dataset is as
> follows.
>
> Col1 Col2 Col3 Col4
> 01   02   02   00   => Freq of “02” is 0.5
> 02   02   02   01   => Freq of “02” is 0.75
> 00   02   01   01   …
>
> My code is quite simple as the following to find the pattern “02”.
>
> OccurrenceRate_Fun<-function(dataMatrix)
> {
>   tmp<-NULL
>   tmpMatrix<-apply(dataMatrix,1,match,"02")
>   for ( i in 1: ncol(tmpMatrix))
>   {
>     tmpRate<-table(tmpMatrix[,i])[[1]]/ nrow(tmpMatrix)
>     tmp<-c(tmp,tmpHET)
>   }
>   rm(tmpMatrix)
>   rm(tmpRate)
>   return(tmp)
>   gc()
> }
>
> The problem is the memory usage grows very fast and hard to be handled on
> machines with less RAM.
> Could anyone please give me some comments on how to reduce the space
> complexity in this calculation?

rowMeans(dataMatrix == "02") ?

Hadley

-- 
http://had.co.nz/

______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
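
For anyone reading this thread later, here is a minimal sketch of the suggestion above, run on the three rows of the toy dataset from the question and assuming the data is a character matrix. Two things in the quoted function stand out: it appends `tmpHET`, which is never defined (presumably `tmpRate` was meant), and the `gc()` after `return()` is never reached. The one-liner avoids the transposed copy made by `apply()` and the repeated `c()` growth, but `dataMatrix == "02"` still allocates a logical matrix the size of the input, which at 1000 x 1000000 would be roughly 4 GB since R stores logicals in 4 bytes. The `pattern_freq_chunked` helper and its `chunk_rows` argument are hypothetical names, not from the thread. They show one possible way to bound that temporary by working on a block of rows at a time.

```r
# Toy data from the question, stored as a character matrix (an assumption;
# the thread does not say how the real data is stored).
dataMatrix <- matrix(
  c("01", "02", "02", "00",
    "02", "02", "02", "01",
    "00", "02", "01", "01"),
  nrow = 3, byrow = TRUE,
  dimnames = list(NULL, paste0("Col", 1:4))
)

# dataMatrix == "02" is a logical matrix; the mean of each row is the
# proportion of cells in that row equal to "02".
freq <- rowMeans(dataMatrix == "02")
print(freq)  # 0.50 0.75 0.25

# Hypothetical helper (not from the thread): compare one block of rows at a
# time, so the temporary logical matrix has at most chunk_rows rows.
pattern_freq_chunked <- function(m, pattern, chunk_rows = 100L) {
  if (nrow(m) == 0L) return(numeric(0))
  out <- numeric(nrow(m))
  for (start in seq(1L, nrow(m), by = chunk_rows)) {
    idx <- start:min(start + chunk_rows - 1L, nrow(m))
    out[idx] <- rowMeans(m[idx, , drop = FALSE] == pattern)
  }
  out
}

freq_chunked <- pattern_freq_chunked(dataMatrix, "02", chunk_rows = 2L)
```

With `chunk_rows = 100` and one million columns, each temporary would be about 400 MB instead of about 4 GB. Whether that trade-off is worthwhile depends on the RAM available, and the input matrix itself still has to fit in memory.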