Re: [fw-general] Getting Text from a PDF

2009-02-07 Thread Shaun Farrell
I used XPDF (http://www.foolabs.com/xpdf/) for indexing PDFs with Zend_Search_Lucene: http://www.kapustabrothers.com/2008/01/20/indexing-pdf-documents-with-zend_search_lucene/ Shaun On Fri, Feb 6, 2009 at 9:08 AM, Matthias Buesing <matthias.bues...@mediaraum.com> wrote: > Hi Jonathan, > I found pdftohtml whic

Re: [fw-general] Getting Text from a PDF

2009-02-06 Thread Matthias Buesing
Hi Jonathan, I found pdftohtml, which is exactly what I've been searching for. Thank you very much. Matthias Jonathan Maron wrote: > Hello Matthias > > If you are running Linux, have you considered 'pdftotext'? > > http://linux.die.net/man/1/pdftotext > > It would be trivial to shell out us

Re: [fw-general] Getting Text from a PDF

2009-02-06 Thread Jonathan Maron
Hello Matthias, If you are running Linux, have you considered 'pdftotext'? http://linux.die.net/man/1/pdftotext It would be trivial to shell out using exec() and convert the text that way. If you choose this route, it is very important to ensure all parameters being sent to exec() have not been
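
The advice above about shelling out safely can be sketched roughly as follows. This is a minimal illustration in Python rather than PHP's exec(), and it assumes a `pdftotext` binary is available on the PATH. The helper names `build_pdftotext_command` and `extract_text` are hypothetical, not part of any library. The idea is to pass the file name as a single argument-list element so that no shell ever interprets it, and to make the path absolute so it cannot be mistaken for an option.

```python
import subprocess
from pathlib import Path


def build_pdftotext_command(pdf_path):
    """Build an argument list for pdftotext (hypothetical helper).

    No shell command string is assembled, so characters such as ';'
    or spaces in the file name are never interpreted by a shell.
    """
    path = Path(pdf_path)
    if path.suffix.lower() != ".pdf":
        raise ValueError(f"not a .pdf path: {pdf_path!r}")
    # An absolute path never starts with "-", so it cannot be read as a flag.
    absolute = str(path.resolve())
    # "-enc UTF-8" selects the output encoding; a text-file of "-" means stdout.
    return ["pdftotext", "-enc", "UTF-8", absolute, "-"]


def extract_text(pdf_path):
    """Run pdftotext and return the extracted text (assumes it is installed)."""
    result = subprocess.run(
        build_pdftotext_command(pdf_path),
        capture_output=True,  # collect stdout and stderr instead of printing
        check=True,           # raise CalledProcessError on a non-zero exit code
    )
    return result.stdout.decode("utf-8", errors="replace")
```

Calling `extract_text("report.pdf")` would return the document's text as a string, or raise an exception if pdftotext is missing or fails. In PHP, the closest equivalent precaution would be wrapping each parameter with `escapeshellarg()` before passing the command to `exec()`.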

[fw-general] Getting Text from a PDF

2009-02-06 Thread Matthias Buesing
Hello, is there any way to get the text from inside a PDF with Zend_Pdf? Or does anybody know a _free_ tool to do this? Greetings, Matthias