Jack Sleight wrote:
Wouldn't $pdf->render() of the Zend_Pdf library do the trick? It says it
returns the PDF in a string format...
No, the string that $pdf->render() returns is the binary data of the PDF document, which can then be stored, as a file for example. It doesn't return the text in the PDF. If you can find some way to extract the text from your PDF/DOC files, then that text can easily be indexed, but extracting the text is the problem.

P.S. If you find a way, let me know; I also need to do this with some PDFs :)
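
To illustrate why the binary string from $pdf->render() cannot simply be indexed as text, here is a minimal, self-contained sketch. It does not call Zend_Pdf at all: the "PDF-like" byte string, the sample text, and all variable names are made up for illustration, and it assumes PHP's zlib extension is available. It relies only on the fact that PDF files commonly store page content in Flate-compressed streams, so the readable text usually does not appear literally in the raw bytes.

```php
<?php
// Hypothetical sample text and content stream; NOT real Zend_Pdf output.
$pageText = 'Hello from page one';
$contentStream = "BT /F1 12 Tf 72 720 Td ($pageText) Tj ET";

// PDF content streams are commonly Flate-compressed (zlib format).
$compressed = gzcompress($contentStream);

// Simplified, incomplete "PDF-like" byte string, for illustration only.
$rawBytes = "%PDF-1.4\n1 0 obj\n<< /Length " . strlen($compressed)
    . " /Filter /FlateDecode >>\nstream\n"
    . $compressed
    . "\nendstream\nendobj\n";

// Searching the raw binary data for the text...
$foundInRaw = strpos($rawBytes, $pageText) !== false;

// ...versus searching after the stream has been decoded.
$foundInDecoded = strpos(gzuncompress($compressed), $pageText) !== false;

var_dump($foundInRaw, $foundInDecoded);
```

A real extractor has to do far more than this (parse the object structure, decode each stream, and map font encodings back to characters), which is why a dedicated text-extraction utility is usually the practical route.
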
Just FYI, there are quite a few comments on the PDF functions page of the PHP manual with code snippets and links to utilities to extract text from PDF documents.

http://www.php.net/manual/en/ref.pdf.php

HTH,
Bryce Lohr
