Hello:
Apart from readPDF in the tm package, you can use the PDF-to-text converter
command on Linux, "pdftotext". If your file is "file.pdf", from R you'd use:
system("pdftotext -layout file.pdf")
This invokes the pdftotext command from within R and creates a file called
"file.txt". The -layout option keeps the original physical layout of the
text as closely as possible.
I need to do text mining on PDF files. I understand there is a readPDF
command in tm that can be used. Have read the 2008 posts on converting
PDF files to text by Tony Breyal and others.
Wondering if the procedure has been standardized in any tutorial or
otherwise? Being new to R, I was ab
Thank you, Tony.
Even in 2012, I still found your post useful.
--
View this message in context:
http://r.789695.n4.nabble.com/Reading-PDF-files-tp977248p4650374.html
Copied/pasted from my earlier reply:
It's been a while since I've done this, but if memory serves, the basic
process was to download xpdf and add it to the Windows path, thus making
it accessible from
within R. Two methods follow:
Method One (easiest) - using the awesome ?system command:
(1) Do
Hi:
I need to do text mining on PDF files. I understand there is a readPDF
command in tm that can be used. Have read the 2008 posts on converting
PDF files to text by Tony Breyal and others.
Wondering if the procedure has been standardized in any tutorial or
otherwise? Being new to R, I wa
Greetings Zaki,
You should really post this question on the R-help forum so that
others might benefit from any responses. It's been a while since I've
done this, but if memory serves, the basic process was to download
xpdf and add it to the Windows path, thus making it accessible from
within R. Tw