>> I have a large matrix (dim(data) is 3000  18000). Each element is
>> one of the following character strings: "0/0", "1/1", "1/2", "2/2". I
>> wanted to replace "0/0" with NA and the other three with 0,1,2
>> respectively. To accomplish just the first of these four steps I did
>> this:
>> data[data=="0/0"] <- NA
>> which is still running after 13 hours. I have 18 GB of RAM and am running
>> 64-bit R. What is a more efficient way to accomplish this (I've already
>> done it using sed in UNIX - but want to know how to do so in R)?
>> Thanks in advance.
> Well I just did
>       gorp <- c("0/0","1/1","1/2","2/2")
>       mung <- matrix(sample(gorp,54e6,TRUE),3000,18000)
>       mung[mung=="0/0"] <- NA
> and the whole schmear ran in under half a minute of real time.
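
For the plain character-matrix case, all four replacements can go in one
pass rather than four. A minimal sketch, assuming the only strings present
are the four listed; `lev` and `recoded` are names made up here, and the
matrix is shrunk for illustration:

```r
# Small stand-in for the 3000 x 18000 character matrix (size is illustrative).
set.seed(1)
gorp <- c("0/0", "1/1", "1/2", "2/2")
mung <- matrix(sample(gorp, 20, TRUE), 4, 5)

# Positions in lev are 1, 2, 3, so subtracting 1 gives 0, 1, 2.
# "0/0" is deliberately absent from lev, so match() returns NA for it.
lev <- c("1/1", "1/2", "2/2")
recoded <- match(mung, lev) - 1L

# match() returns a plain vector; restore the matrix shape.
dim(recoded) <- dim(mung)
```

Note that any string not in `lev` also comes out as NA, so this is only
safe if the data really contain nothing but those four values.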


I'll lay odds that Matthew's 'matrix' is actually a data.frame, and I'll 
not be surprised if the columns are factors. In which case

        # assumes mung is a data.frame whose columns are factors
        mung2 <- as.data.frame(lapply( mung,
                        function(x) {
                                levels(x)[ levels(x)=='0/0' ] <- NA
                                x } ))

will be faster, but still not as fast as what you show with a matrix.
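
One way to test that guess, and to get back to the fast matrix case, is
sketched below. It is only an illustration: `dat` is a hypothetical
stand-in for the poster's object, built here as a data.frame of factor
columns, and `m`, `lev` and `recoded` are likewise invented names.

```r
# Hypothetical stand-in: a small data.frame whose columns are factors.
set.seed(2)
gorp <- c("0/0", "1/1", "1/2", "2/2")
dat <- as.data.frame(matrix(sample(gorp, 20, TRUE), 4, 5),
                     stringsAsFactors = TRUE)

# Check what the object really is before blaming the machine.
is.data.frame(dat)      # TRUE here; a true matrix would give FALSE
sapply(dat, class)      # "factor" for every column in this stand-in

# Convert once to a character matrix, then recode as for a matrix:
# "0/0" is not in lev, so it becomes NA; the others become 0, 1, 2.
m <- as.matrix(dat)
lev <- c("1/1", "1/2", "2/2")
recoded <- match(m, lev) - 1L
dim(recoded) <- dim(m)
```

If `is.data.frame()` returns TRUE on the real object, the one-off
`as.matrix()` conversion is likely the cheapest fix, at the price of
holding a second copy of the data in memory while it runs.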



> > sessionInfo()
> R version 2.6.2 (2008-02-08)
> i386-apple-darwin8.10.1
> locale:
> C
> attached base packages:
> [1] stats     graphics  grDevices utils     datasets  methods   base
> other attached packages:
> [1] misc_0.0-2
> loaded via a namespace (and not attached):
> [1] rcompgen_0.1-17
> I would say that something is seriously snarled up in your system.
>       cheers,
>               Rolf Turner
