Hi,
I want to extract information from a number of text files in a folder. The 
files are named like 82534.txt, 82555.txt, 8282787.txt, etc.

Below is a sample of the kind of information in each text file:
########
#(a lot of preceding text)
2008-10-01      06:30:12                2 of 3
page

#(some lines of text - varies from file to file)
sekvens    890
# lines of text
sNo     start            stop            direction        value
1        70                85                up                60.2
3        60                90                down            71.5
#########

In each file that I select, I first want to go to the appropriate page. The 
page header is the first line of the sample above, and the page number there 
is 2 (from "2 of 3"). The date and time preceding the page number vary from 
file to file, but the next line always contains the word "page".
After that, I am interested in the number following the word "sekvens", and 
also in the table underneath.
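
One possible way to locate these pieces, once a file has been read into a 
character vector, is sketched below. It is only a sketch under assumptions: the 
sample lines are made up to mimic the layout above, find_page() is a 
hypothetical helper name, and a page header is assumed to be a line ending in 
"<n> of <m>" that is immediately followed by a line containing only "page".

```r
# Sample lines mimicking the layout shown above (the filler text is made up).
lines <- c(
  "some preceding text",
  "2008-10-01      06:30:12                1 of 3",
  "page",
  "sekvens    111",
  "2008-10-01      06:30:12                2 of 3",
  "page",
  "",
  "some lines of text",
  "sekvens    890",
  "more text",
  "sNo     start            stop            direction        value",
  "1        70                85                up                60.2",
  "3        60                90                down            71.5"
)

# Hypothetical helper: index of the header line of the requested page.
find_page <- function(lines, page) {
  is_header <- grepl(paste0("(^|[[:space:]])", page, " of [0-9]+[[:space:]]*$"),
                     lines)
  # TRUE at position i when line i + 1 consists of the word "page" only.
  next_is_page <- c(grepl("^[[:space:]]*page[[:space:]]*$", lines[-1]), FALSE)
  which(is_header & next_is_page)[1]
}

start <- find_page(lines, 2)
after <- lines[start:length(lines)]   # everything from the page header onwards

# First "sekvens <number>" line on or after the requested page.
sek_line <- grep("^[[:space:]]*sekvens[[:space:]]+[0-9]+", after, value = TRUE)[1]
sekvens  <- as.integer(sub("^[[:space:]]*sekvens[[:space:]]+([0-9]+).*$", "\\1",
                           sek_line))

# Table rows: the lines after the "sNo ..." header that start with a digit,
# up to the first line that does not.
hdr    <- grep("^[[:space:]]*sNo[[:space:]]", after)[1]
rest   <- after[-seq_len(hdr)]
is_row <- grepl("^[[:space:]]*[0-9]", rest)
n_rows <- if (all(is_row)) length(rest) else which(!is_row)[1] - 1
rows   <- rest[seq_len(n_rows)]
```

Searching only in the lines from the page header onwards is what keeps a 
"sekvens" line on an earlier page from being picked up by mistake.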

Finally, I want to collect all the data in a data frame with the following 
structure:

fileno   sekvens  sNo  start  stop  direction  value
82534    890      1    70     85    up         60.2
82534    890      3    60     90    down       71.5
82555    ..       ..   ..     ..    ..         ..

There are several topics involved here with which I have almost no familiarity. 
First, using regular expressions to specify the files that I want from a 
folder. Next, how do I locate a particular section (or page) in a text file 
from the description above? Should the files be read in their entirety first, 
or is it possible to go directly to the section with the relevant text? 
Finally, how do I extract the data in the form that I want? 
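
For the file-selection part, a minimal sketch is given here. It assumes that 
the wanted files are exactly those whose names consist of digits followed by 
".txt"; the scratch folder and the extra file names are made up so that the 
example runs on its own.

```r
# Create a scratch folder with a few empty files (names partly made up).
dir <- file.path(tempdir(), "sekvens_demo")
dir.create(dir, showWarnings = FALSE)
invisible(file.create(file.path(
  dir, c("82534.txt", "82555.txt", "8282787.txt", "notes.txt", "82534.txt.bak")
)))

# 'pattern' is a regular expression, not a shell wildcard:
# keep only names made of digits followed by ".txt".
files <- list.files(dir, pattern = "^[0-9]+\\.txt$", full.names = TRUE)

# The file number is the base name without its extension.
fileno <- sub("\\.txt$", "", basename(files))
```

As for reading, readLines() works through a file sequentially, so for text 
files of modest size it is probably simplest to read each file completely and 
then search the resulting character vector with grep().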

I have identified the following functions that would be useful here: 
list.files(), readLines(), strsplit().
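
As a rough illustration of where strsplit() could fit in, the sketch below 
turns table rows into the data frame layout described above. The helper name 
rows_to_df() and the 'parsed' list are hypothetical, and every row is assumed 
to have exactly five whitespace-separated fields.

```r
# Hypothetical helper: convert the table rows of one file into a data frame.
rows_to_df <- function(rows, fileno, sekvens) {
  # Split each row on runs of whitespace; trimws() avoids an empty first field.
  fields <- strsplit(trimws(rows), "[[:space:]]+")
  m <- do.call(rbind, fields)   # character matrix, one row per table line
  data.frame(
    fileno    = fileno,
    sekvens   = sekvens,
    sNo       = as.integer(m[, 1]),
    start     = as.numeric(m[, 2]),
    stop      = as.numeric(m[, 3]),
    direction = m[, 4],
    value     = as.numeric(m[, 5]),
    stringsAsFactors = FALSE
  )
}

# One list entry per file; only the file shown in the sample is filled in.
parsed <- list(
  list(fileno = "82534", sekvens = 890L,
       rows = c("1        70                85                up                60.2",
                "3        60                90                down            71.5"))
)

# Stack the per-file data frames into a single one.
result <- do.call(rbind, lapply(parsed, function(p) {
  rows_to_df(p$rows, p$fileno, p$sekvens)
}))
print(result)
```

With more files, each one would contribute another entry to 'parsed', and the 
same do.call(rbind, ...) step would stack them all.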
I would appreciate some help in getting started here. I would certainly benefit 
from a few hints. I would also appreciate it if I could get some links to 
references with examples showing how similar problems are tackled.
Thanking you,
Ravi

______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
