Perhaps you could process this with the Unix/Linux utility awk before reading 
the file into R.
-Sohail

________________________________________
From: r-help-boun...@r-project.org [r-help-boun...@r-project.org] On Behalf Of 
peter dalgaard [pda...@gmail.com]
Sent: Friday, March 08, 2013 5:08 AM
To: Yao He
Cc: R help
Subject: Re: [R] How to transpose it in a fast way?

On Mar 7, 2013, at 01:18 , Yao He wrote:

> Dear all:
>
> I have a big data file of 60000 columns and 60000 rows like this:
>
> AA AC AA AA .......AT
> CC CC CT CT.......TC
> ..........................
> .........................
>
> I want to transpose it so that the output is a new file like this:
> AA CC ............
> AC CC............
> AA CT.............
> AA CT.........
> ....................
> ....................
> AT TC.............
>
> The key point is that I can't read it into R with read.table() because the
> data is too large, so I tried this:
> c<-file("silygenotype.txt","r")
> geno_t<-list()
> repeat{
>  line<-readLines(c,n=1)
>  if (length(line)==0)break  #end of file
>  line<-unlist(strsplit(line,"\t"))
> geno_t<-cbind(geno_t,line)
> }
> write.table(geno_t,"xxx.txt")
>
> It works, but it is too slow. How can I optimize it?


As others have pointed out, that's a lot of data!

You seem to have the right idea: If you read the columns line by line there is 
nothing to transpose. A couple of points, though:

- The cbind() is a potential performance hit since it copies the list every 
time around. Instead, preallocate with geno_t <- vector("list", 60000) and then 
assign geno_t[[i]] <- <etc> inside the loop.

- You might use scan() instead of readLines() followed by strsplit().

- Perhaps consider the data type, since you seem to be reading strings with 
only 16 possible values (I suspect that R already optimizes string storage to 
make this point moot, though).
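
The first two points above could be sketched roughly as follows. This is only a minimal sketch, not something timed on a 60000 x 60000 file: the function name transpose_by_line and its arguments infile, outfile and n_rows are hypothetical, the input is assumed to be tab-separated as in the original post, and the whole table is assumed to fit in memory as a character matrix (which may well not hold for data of this size).

```r
# Hypothetical helper: read a tab-separated file one line at a time and
# write out its transpose. All names here are illustrative assumptions.
transpose_by_line <- function(infile, outfile, n_rows, sep = "\t") {
  con <- file(infile, "r")
  on.exit(close(con))

  # Preallocate the list once instead of growing it with cbind().
  geno_t <- vector("list", n_rows)

  for (i in seq_len(n_rows)) {
    # scan() reads one line and splits it into fields in a single call.
    fields <- scan(con, what = "", sep = sep, nlines = 1, quiet = TRUE)
    if (length(fields) == 0) {
      # Fewer lines than expected: drop the unused slots and stop.
      geno_t <- geno_t[seq_len(i - 1)]
      break
    }
    geno_t[[i]] <- fields
  }

  # Each input line becomes one column, so the result is already transposed.
  out <- do.call(cbind, geno_t)
  write.table(out, outfile, sep = sep, quote = FALSE,
              row.names = FALSE, col.names = FALSE)
  invisible(out)
}
```

With the file from the original post this might be called as transpose_by_line("silygenotype.txt", "xxx.txt", n_rows = 60000). Note that scan() treats the string "NA" as missing by default, which may or may not be what you want for genotype codes.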

--
Peter Dalgaard, Professor
Center for Statistics, Copenhagen Business School
Solbjerg Plads 3, 2000 Frederiksberg, Denmark
Phone: (+45)38153501
Email: pd....@cbs.dk  Priv: pda...@gmail.com

______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


