Yeah.
Zend_Search_Lucene needs the text extracted from a PDF document in order to index it. A text extraction feature has been planned since the first versions of Zend_Pdf and was estimated as "easy to implement", but it has not been done so far. The problem lies in some special cases that increase implementation complexity: compressed or encrypted text streams and some encoding issues.

I am not sure which is preferable: to have an implementation that doesn't work correctly in all cases, or not to have one at all (given the existing PDF-to-text conversion solutions).

With best regards,
Alexander Veremyev.

_____

From: Jack Sleight [mailto:[EMAIL PROTECTED]
Sent: Thursday, November 01, 2007 2:11 PM
To: peoplesoft
Cc: fw-general@lists.zend.com
Subject: Re: [fw-general] Zend_Lucene_Search for PDFs

I could be wrong, but unfortunately I don't THINK it's currently possible. Search_Lucene doesn't have built-in support for PDF files. If you can find another PHP class that can successfully extract all the text from a PDF file (something Zend_Pdf unfortunately can't do), then indexing that with Search_Lucene is fairly straightforward. It's just getting that text out that's the problem: PDF files are encoded, so just indexing the source of a PDF file won't work.

Of course I could be wrong; hopefully someone from the MFS team can confirm/correct this?

peoplesoft wrote:
Please help.... it's very urgent and I had high hopes of having PDF search with Zend Lucene. :-((

peoplesoft wrote:

--
Jack
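
To illustrate the point about compressed text streams, here is a minimal sketch in Python rather than PHP. It assumes the stream is stored with Flate (zlib/deflate) compression, and the sample text and variable names are hypothetical. It shows only why searching the stored bytes misses a term that is present in the decoded text; it is not how Zend_Pdf or Zend_Search_Lucene work internally.

```python
import zlib

# Hypothetical page text standing in for the contents of a PDF text stream.
page_text = b"PDF search with Zend_Search_Lucene needs extracted text. " * 4

# Assumption: the stream is stored with Flate (zlib/deflate) compression,
# which PDF writers commonly apply to content streams.
stored_stream = zlib.compress(page_text)

term = b"Zend_Search_Lucene"

# Searching the stored (compressed) bytes directly does not find the term.
found_in_raw = term in stored_stream

# Searching the decompressed bytes does find it.
found_in_decoded = term in zlib.decompress(stored_stream)

print(found_in_raw, found_in_decoded)
```

Decompression is only the first step. The encrypted streams and encoding issues mentioned above need further handling before the text is fit for indexing.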