On Wednesday 01 February 2006 12:01, Ian Kilgore wrote:
> Owen Berry wrote:
> | Anyone know of a command line utility for extracting text from a pdf
> | file, other than the one included in xpdf (pdftotext)? pdftotext does
> | exactly what I want, but I would like to avoid pulling in the rest of
> | xpdf, if possible, as this is for a server.
> |
> | BTW, I'm using it combined with the perlfect search engine, so the text
> | does not need to be formatted nicely or anything.
> |
> | Thanks,
> | Owen
>
> You can pipe pdf2ps | ps2ascii

Or, according to the ps2ascii manpage (and some quick experimentation), you can
just run "ps2ascii pdffile.pdf > pdffile.txt".

(When I just tried pdf2ps | ps2ascii, it gave me blank output, while running
the PDF straight through ps2ascii seems to work.)
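
If you want to script that, here is a minimal sketch of a wrapper around the
direct ps2ascii call. It assumes Ghostscript's ps2ascii is installed and on
your PATH; the function name pdf_to_text and the default output naming
(foo.pdf becomes foo.txt) are made up for illustration, not anything from the
manpage.

```shell
#!/bin/sh
# Hypothetical helper: extract text from a PDF with Ghostscript's ps2ascii.
# Assumption: ps2ascii is installed and on PATH.
pdf_to_text() {
    in="$1"

    if [ -z "$in" ]; then
        echo "usage: pdf_to_text input.pdf [output.txt]" >&2
        return 2
    fi
    if [ ! -r "$in" ]; then
        echo "pdf_to_text: cannot read $in" >&2
        return 1
    fi
    if ! command -v ps2ascii >/dev/null 2>&1; then
        echo "pdf_to_text: ps2ascii not found" >&2
        return 127
    fi

    # Default output name (an assumption): replace a trailing .pdf with .txt.
    out="${2:-${in%.pdf}.txt}"

    # ps2ascii reads the PDF directly and writes plain text to stdout.
    ps2ascii "$in" > "$out"
}
```

After sourcing it, "pdf_to_text pdffile.pdf" should leave the text in
pdffile.txt, which ought to be good enough for feeding an indexer since the
layout does not matter there.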

CJK
-- 
TriLUG mailing list        : http://www.trilug.org/mailman/listinfo/trilug
TriLUG Organizational FAQ  : http://trilug.org/faq/
TriLUG Member Services FAQ : http://members.trilug.org/services_faq/
