As far as I remember, the PDFBox release includes some existing code to index PDFs with Lucene, based on the demo created for Lucene 1.3. In fact, I think the code only works with Lucene 1.3; it has something to do with a change from arrays to vectors in Lucene 1.4. I may be wrong though.
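In case it helps to see the general shape, here is a rough sketch of the "extract the text first, then index it" idea. It deliberately uses no real PDFBox or Lucene calls: `TextExtractor` and `TinyIndex` are hypothetical stand-ins I made up for illustration, so treat it as an outline under those assumptions rather than the code that ships with PDFBox.

```java
import java.nio.charset.StandardCharsets;
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

public class ExtractThenIndex {

    /** Hypothetical hook: turns raw file bytes into plain text (a PDF or Word parser would plug in here). */
    interface TextExtractor {
        String extract(byte[] raw);
    }

    /** Toy inverted index standing in for Lucene: term -> names of the documents containing it. */
    static class TinyIndex {
        private final Map<String, Set<String>> postings = new HashMap<>();

        /** Splits the text into lowercase terms and records the document under each term. */
        void add(String docName, String text) {
            for (String term : text.toLowerCase(Locale.ROOT).split("[^a-z0-9]+")) {
                if (!term.isEmpty()) {
                    postings.computeIfAbsent(term, k -> new TreeSet<>()).add(docName);
                }
            }
        }

        /** Returns the sorted names of documents containing the term (case-insensitive). */
        Set<String> search(String term) {
            return postings.getOrDefault(term.toLowerCase(Locale.ROOT), new TreeSet<>());
        }
    }

    public static void main(String[] args) {
        // Stand-in extractor: treats the bytes as UTF-8 text. A real one would parse the file format.
        TextExtractor plain = raw -> new String(raw, StandardCharsets.UTF_8);

        TinyIndex index = new TinyIndex();
        index.add("report.pdf", plain.extract("Quarterly sales report".getBytes(StandardCharsets.UTF_8)));
        index.add("notes.doc", plain.extract("Meeting notes about sales".getBytes(StandardCharsets.UTF_8)));

        System.out.println(index.search("sales"));     // prints [notes.doc, report.pdf]
        System.out.println(index.search("quarterly")); // prints [report.pdf]
    }
}
```

The point is only that the indexer never sees the file format. Whatever produces plain text (a PDF parser, a Word parser, or a query that reads rows from a database) feeds the same `add` step.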
http://www.csh.rit.edu/~ben/projects/pdfbox/javadoc/org/pdfbox/searchengine/lucene/package-summary.html

> thanks everybody,
>
> but i didnt got any code or any real help in this links
> any body has performed previously this search?if yes then please send me the
> code, or tell me the what code I have to add to my present lucene
> ----- Original Message -----
> From: "David Townsend" <[EMAIL PROTECTED]>
> To: "Lucene Users List" <[EMAIL PROTECTED]>
> Sent: Thursday, August 19, 2004 4:17 PM
> Subject: RE: searchhelp
>
>
> JGURU FAQ
> http://www.jguru.com/faq/Lucene
>
> OFFICIAL FAQ
> http://lucene.sourceforge.net/cgi-bin/faq/faqmanager.cgi
>
> MAIL ARCHIVE
> http://www.mail-archive.com/[EMAIL PROTECTED]/
>
> hope this helps.
>
>
> -----Original Message-----
> From: Santosh [mailto:[EMAIL PROTECTED]
> Sent: 19 August 2004 11:25
> To: Lucene Users List
> Subject: Re: searchhelp
>
>
> I am recently joined into list, I didnt gone through any previous mails, if
> you have any mails or related code please forward it to me
> ----- Original Message -----
> From: "Chandan Tamrakar" <[EMAIL PROTECTED]>
> To: "Lucene Users List" <[EMAIL PROTECTED]>
> Sent: Thursday, August 19, 2004 3:47 PM
> Subject: Re: searchhelp
>
>
> > For PDF you need to extract a text from pdf files using pdfbox library and
> > for word documents u can use apache POI api's . There are messages
> > posted on the lucene list related to your queries. About database ,i guess
> > someone must have done it . :)
> >
> > ----- Original Message -----
> > From: "Santosh" <[EMAIL PROTECTED]>
> > To: <[EMAIL PROTECTED]>
> > Sent: Thursday, August 19, 2004 3:58 PM
> > Subject: searchhelp
> >
> >
> > Hi,
> >
> > I am using lucene search engine for my application.
> >
> > i am able to search through the text files and htmls as specified by lucene
> >
> > can you please clarify my doubts
> >
> > 1.can lucene search through pdfs and word documents? if yes then how?
> >
> > 2.can lucene search through database ?
> > if yes then how?
> >
> > thankyou
> >
> > santosh

---------------------------------------------------------------------
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]