On Thu, May 14, 2009 at 6:21 PM, Ping-Hsun Hsieh <hsi...@ohsu.edu> wrote:
> Hi All,
>
> I have a 1000x1000000 matrix.
> The calculation I would like to do is actually very simple: for each row, 
> calculate the frequency of a given pattern. For example, a toy dataset is as 
> follows.
>
> Col1    Col2    Col3    Col4
> 01      02      02      00              => Freq of “02” is 0.5
> 02      02      02      01              => Freq of “02” is 0.75
> 00      02      01      01              …
>
> My code to find the pattern “02” is quite simple, as follows.
>
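> OccurrenceRate_Fun<-function(dataMatrix)
> {
>  tmp<-NULL
>  # apply() over rows returns the transpose: column i holds the matches for row i
>  tmpMatrix<-apply(dataMatrix,1,match,"02")
>  for ( i in 1: ncol(tmpMatrix))
>  {
>    # match() gives 1 for "02" and NA otherwise, so count the non-NA entries
>    tmpRate<-sum(!is.na(tmpMatrix[,i]))/ nrow(tmpMatrix)
>    tmp<-c(tmp,tmpRate)
>  }
>  rm(tmpMatrix)
>  rm(tmpRate)
>  gc()
>  return(tmp)
> }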
>
> The problem is that memory usage grows very quickly and becomes hard to handle on
> machines with less RAM.
> Could anyone please give me some comments on how to reduce the space
> complexity of this calculation?

rowMeans(dataMatrix == "02")  ?
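
As a rough illustration of the one-liner above, here is a minimal sketch on the toy dataset from the question. It assumes dataMatrix is a character matrix with one sample per row. The rowFreqChunked helper and its chunkSize argument are hypothetical names, not something from this thread; they show one possible way to keep the temporary logical matrix small if a comparison over the full matrix does not fit in RAM.

```r
# Toy data from the question, stored as a character matrix (one sample per row).
dataMatrix <- matrix(
  c("01", "02", "02", "00",
    "02", "02", "02", "01",
    "00", "02", "01", "01"),
  nrow = 3, byrow = TRUE
)

# Comparing against "02" yields a logical matrix; the mean of TRUE/FALSE
# values in a row is the fraction of entries in that row equal to "02".
freq <- rowMeans(dataMatrix == "02")

# Hypothetical helper (an assumption, not from the thread): process the rows
# in blocks so that the temporary copies cover only chunkSize rows at a time.
rowFreqChunked <- function(m, pattern, chunkSize = 100L) {
  out <- numeric(nrow(m))
  if (nrow(m) == 0L) return(out)
  for (start in seq(1L, nrow(m), by = chunkSize)) {
    idx <- start:min(start + chunkSize - 1L, nrow(m))
    out[idx] <- rowMeans(m[idx, , drop = FALSE] == pattern)
  }
  out
}

freqChunked <- rowFreqChunked(dataMatrix, "02", chunkSize = 2L)
```

On the toy data both freq and freqChunked come out as 0.5, 0.75 and 0.25 for the three rows. A suitable chunkSize depends on the available memory, so the default here is only a placeholder.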

Hadley


-- 
http://had.co.nz/

______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
