Hi,
This seems like a good solution. I was concerned about the time taken
up by reading one line at a time. If a chunk can be read in each time,
then that should be the way for me to handle the problem.
Thanks,
Walt
________________________
Walter R. Paczkowski, Ph.D.
Data Analytics Corp.
44 Hamilton Lane
Plainsboro, NJ 08536
________________________
(V) 609-936-8999
(F) 609-936-3733
w...@dataanalyticscorp.com
www.dataanalyticscorp.com
_____________________________________________________
On 8/15/2010 1:06 PM, jim holtman wrote:
For efficiency of processing, look at reading in several hundred or
several thousand lines at a time. Reading and writing one line at a time
will probably spend most of its time in the system calls that do the
I/O, and will take a long time. So do something like this:
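con <- file('yourInputFile', 'r')
outfile <- file('yourOutputFile', 'w')
# readLines returns a zero-length vector once the end of the file is reached
while (length(input <- readLines(con, n = 1000)) > 0) {
    output <- character(length(input))
    for (i in seq_along(input)) {
        # ......your one line at a time processing, result stored in output[i]
        output[i] <- input[i]  # placeholder: copies the line unchanged
    }
    writeLines(output, con = outfile)
}
close(con)
close(outfile)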
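
The loop above can also be written without the inner for loop when the
string manipulation is vectorized. The following is only a minimal
sketch under stated assumptions: process_lines(), process_file(), the
chunk size, and the temporary files are hypothetical names and values
chosen for illustration, and toupper()/trimws() merely stand in for the
real string manipulation.

```r
# Hypothetical per-line manipulation; toupper() and trimws() are stand-ins
# for whatever string processing the real project needs.
process_lines <- function(x) toupper(trimws(x))

# Read in_path in chunks of chunk_size lines, transform each chunk,
# and append the result to out_path. Returns the number of lines handled.
process_file <- function(in_path, out_path, chunk_size = 1000L) {
  con <- file(in_path, "r")
  on.exit(close(con), add = TRUE)
  outfile <- file(out_path, "w")
  on.exit(close(outfile), add = TRUE)

  total <- 0L
  # readLines() returns a zero-length vector at end of file, ending the loop
  while (length(input <- readLines(con, n = chunk_size)) > 0) {
    writeLines(process_lines(input), con = outfile)
    total <- total + length(input)
  }
  total
}

# Small demonstration on temporary files (illustrative data only)
in_path <- tempfile(fileext = ".txt")
out_path <- tempfile(fileext = ".txt")
writeLines(sprintf("  line %d  ", 1:2500), in_path)
n_done <- process_file(in_path, out_path, chunk_size = 1000L)
result <- readLines(out_path)
```

Because the connection stays open between calls, each readLines() call
picks up where the previous one stopped, so only one chunk is held in
memory at a time.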
On Sun, Aug 15, 2010 at 7:58 AM, Data Analytics Corp.
<w...@dataanalyticscorp.com> wrote:
Hi,
I have an upcoming project that will involve a large text file. I want to
1. read the file into R one line at a time
2. do some string manipulations on the line
3. write the line to another text file.
I can handle the last two parts. Scan and read.table seem to read the whole
file in at once. Since this is a very large file (several hundred thousand
lines), this is not practical. Hence the idea of reading one line at a
time. The question is, can R read one line at a time? If so, how? Any
suggestions are appreciated.
Thanks,
Walt
______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.