On Mon, 15 Dec 2008, megh wrote:


Hi all,

In my C: drive I have about 1,000 Notepad files with the .txt extension. They
are named after the dates on which they were saved, i.e. the 1st file is named
"Volume_4-18-2008", the 2nd "Volume_4-21-2008", the 3rd
"Volume_4-22-2008", and so on.

Also, the contents of each file are in the same format, like this:

******** content of 1st file *************
section : 1
-----       ---------      ----------    -----------
-----       ---------      ----------    -----------
-----       ---------      ----------    -----------
-----       ---------      ----------    -----------
section : 2
-----       ---------      ----------    -----------
-----       ---------      ----------    -----------
-----       ---------      ----------    -----------
-----       ---------      ----------    -----------
section : 3
-----       ---------      ----------    -----------
-----       ---------      ----------    -----------
-----       ---------      ----------    -----------
-----       ---------      ----------    -----------
section : 4
-----       ---------      ----------    -----------
-----       ---------      ----------    -----------
-----       ---------      ----------    -----------
-----       ---------      ----------    -----------

Here all files have 4 sections, just as shown here, but the contents within
each section (i.e. the dashed lines here) differ from file to file.

What I have to do is fetch the contents of "section : 2" from each
file and then save them to an R object, a matrix or list, for further analysis.

Can you please tell me how to do that?

Here is the outline:

        *) use list.files() or Sys.glob() to get a list of the files

        *) write a function that takes the file name as its arg, uses
           readLines() to swallow the text and uses grep() to find the
           'section' lines. Then put the 'dashes' in between two section
           lines into a separate object (say, dash.lines). Then use

                as.matrix( read.table( con <- textConnection( dash.lines ) ) )
                close(con)

          to get the numeric values or maybe

                sapply( strsplit(dash.lines, "[ ]+"), as.numeric)

        *) debug this on one file


        *) use lapply() to step through the list of file names.
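
The outline above could be sketched roughly as follows. This is only a minimal
sketch under assumptions not stated in the original post: that each header line
looks exactly like "section : N", that the rows under it are
whitespace-separated numbers, and that the files match "Volume_*.txt". The
helper name read_section and the demo files are made up for illustration.

```r
# Hypothetical helper: return the rows of one section of one file as a matrix.
read_section <- function(file, section = 2) {
  txt <- readLines(file)
  # Positions of every "section : N" header line.
  hdr <- grep("^[[:space:]]*section[[:space:]]*:", txt)
  # Position of the header for the requested section.
  pat <- sprintf("^[[:space:]]*section[[:space:]]*:[[:space:]]*%d[[:space:]]*$",
                 section)
  start <- grep(pat, txt)
  if (length(start) != 1L)
    stop("expected exactly one header for section ", section, " in ", file)
  # The section runs up to the next header, or to the end of the file.
  nxt <- hdr[hdr > start]
  end <- if (length(nxt)) nxt[1L] - 1L else length(txt)
  dash.lines <- txt[start + seq_len(end - start)]
  # Drop blank lines before parsing.
  dash.lines <- dash.lines[grepl("[^[:space:]]", dash.lines)]
  con <- textConnection(dash.lines)
  on.exit(close(con))
  as.matrix(read.table(con))
}

# Debug on made-up files first (assumed layout, not the poster's real data).
dir <- tempfile("volumes")
dir.create(dir)
writeLines(c("section : 1", "1 2 3 4",
             "section : 2", "5 6 7 8", "9 10 11 12",
             "section : 3", "0 0 0 0"),
           file.path(dir, "Volume_4-18-2008.txt"))
writeLines(c("section : 1", "4 3 2 1",
             "section : 2", "8 7 6 5", "12 11 10 9",
             "section : 3", "1 1 1 1"),
           file.path(dir, "Volume_4-21-2008.txt"))

# Then step through all files; the result is a list with one matrix per file.
files <- list.files(dir, pattern = "^Volume_.*\\.txt$", full.names = TRUE)
sec2 <- lapply(files, read_section, section = 2)
names(sec2) <- basename(files)
```

For the real data, dir would be replaced by the folder holding the files. If
the rows are not purely numeric, read.table() will return character or factor
columns, and the as.matrix() step would need adjusting.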

See

        ?list.files
        ?Sys.glob
        ?readLines
        ?grep
        ?textConnection
        ?strsplit
        ?sapply

HTH,

Chuck



Thanks and regards,
--
View this message in context: 
http://www.nabble.com/How-to-fetch-specific-part-from-a-number-of-Text-files--tp21011017p21011017.html
Sent from the R help mailing list archive at Nabble.com.

______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Charles C. Berry                            (858) 534-2098
                                            Dept of Family/Preventive Medicine
E mailto:cbe...@tajo.ucsd.edu               UC San Diego
http://famprevmed.ucsd.edu/faculty/cberry/  La Jolla, San Diego 92093-0901

