Greetings Zaki,

You should really post this question on the R-help forum so that others might benefit from any responses. It's been a while since I've done this, but if memory serves, the basic process was to download xpdf and add it to the Windows path, thus making it accessible from within R. Two methods follow:

Method One (easiest) - using the awesome ?system function:

(1) Download xpdf (whichever is the latest version):
    ftp://ftp.foolabs.com/pub/xpdf/xpdf-3.02pl4-win32.zip
(2) Unzip it

Then call pdftotext.exe from R. Both paths contain spaces, so each one is wrapped in double quotes inside the single-quoted R string. With no output file given, pdftotext writes r-intro.txt next to the pdf.

# system(paste("[app]", "[pdf file]"), wait = FALSE)
> system(paste('"C:/Program Files/xpdf/pdftotext.exe"',
+              '"C:/Documents and Settings/tony/Desktop/test/r-intro.pdf"'),
+        wait = FALSE)

Method Two - if you want to use the tm package like I did last year, ?readPDF requires the following (not documented anywhere that I know of, but this is what you do):

(1) Download xpdf (whichever is the latest version):
    ftp://ftp.foolabs.com/pub/xpdf/xpdf-3.02pl4-win32.zip
(2) Unzip it
(3) Download the Redmond utility for adding files to your Windows path (free version button is in the top left of the page):
    http://redmondlab.googlepages.com/path
(4) Unzip it
(5) Open the 'Redmond Path' application.
(6) Click on the green plus '+' in the top left hand corner.
(7) Navigate to the folder which contains the files: C:/../xpdf-3.02pl4-win32
(8) Add it and click Ok.

Then you can do something like:

> library(tm)
> my.path <- 'C:\\Documents and Settings\\tony\\Desktop\\pdfs\\' # put your pdfs in here
> Corpus(DirSource(my.path), readerControl = list(reader = readPDF))

There are some limitations to how well the conversions work depending on the pdf file, but it was so long ago now that I'm afraid I don't remember the details.

HTH.
Tony Breyal

2009/12/22 <zeusu...@lmu.edu>:
> Hi:
>
> I am very new to R. I just read through your 2008 posts on converting PDF
> files to text. I have exactly the same goal.
>
> Has the procedure been standardized in any tutorial? I was able to follow
> only part of the discussion. Any way to get a set of step by step
> instructions?
>
> Thanks.
> Zaki Eusufzai
>

--
Tony Breyal

______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.