Hi,

Note that Lucene only provides an API for building a search engine; you can use it however you want. You can pass data to the indexer in two forms:

1. java.lang.String
2. java.io.Reader
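As a rough illustration, the two forms listed above can be prepared with nothing but the JDK. This is only a sketch: the class IndexInput and its methods are hypothetical names made up for this mail, not part of Lucene, and UTF-8 is an assumed encoding. The actual hand-off to Lucene is left out on purpose.

```java
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical helper, not part of Lucene: prepares text in the two
// forms mentioned above (java.lang.String and java.io.Reader).
final class IndexInput {

    // Form 1: the whole text held in memory as a String (UTF-8 assumed).
    static String asString(Path textFile) {
        try {
            return new String(Files.readAllBytes(textFile), StandardCharsets.UTF_8);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Form 2: a Reader over a text file, so the content is streamed
    // instead of being loaded all at once.
    static Reader asReader(Path textFile) {
        try {
            return Files.newBufferedReader(textFile, StandardCharsets.UTF_8);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Text that is already in memory can be wrapped as a Reader too.
    static Reader asReader(String text) {
        return new StringReader(text);
    }

    // Drains a Reader into a String; used here only to check the result.
    static String readAll(Reader reader) {
        StringBuilder sb = new StringBuilder();
        char[] buf = new char[4096];
        try (Reader r = reader) {
            int n;
            while ((n = r.read(buf)) != -1) {
                sb.append(buf, 0, n);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        String text = "text extracted from a document";
        // Round-trip the text through a Reader and compare with the original.
        System.out.println(readAll(asReader(text)).equals(text));
    }
}
```

Whichever form you end up with is what you then hand to Lucene when you build the document to index.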
What Lucene receives is either of the two objects above. For non-text documents, you need to extract the text from the documents and then either write it to a text file and open that as a Reader, or create a String object (for small files).

For indexing database contents, you need to write your own code to get the data from the database (using JDBC/EJB etc.), convert the data to a String object, and pass it to Lucene for indexing. Again, Lucene is not responsible for getting the data from your application. It only indexes the data you give it.

Also, for extracting contents from pdf & doc files (generally known as straining), I know of 2 more tools:

wvWare -> for Word documents
pdftotext (xpdf) -> for PDF documents

Google around and you will get a lot of links.

Hope this helps.

Thanks,
George

--- Santosh <[EMAIL PROTECTED]> wrote:
> I am recently joined into list, I didnt gone through any previous mails, if
> you have any mails or related code please forward it to me
> ----- Original Message -----
> From: "Chandan Tamrakar" <[EMAIL PROTECTED]>
> To: "Lucene Users List" <[EMAIL PROTECTED]>
> Sent: Thursday, August 19, 2004 3:47 PM
> Subject: Re: searchhelp
>
>
> > For PDF you need to extract a text from pdf files using pdfbox library and
> > for word documents u can use apache POI api's . There are messages
> > posted on the lucene list related to your queries. About database ,i guess
> > someone must have done it . :)
> >
> > ----- Original Message -----
> > From: "Santosh" <[EMAIL PROTECTED]>
> > To: <[EMAIL PROTECTED]>
> > Sent: Thursday, August 19, 2004 3:58 PM
> > Subject: searchhelp
> >
> >
> > Hi,
> >
> > I am using lucene search engine for my application.
> >
> > i am able to search through the text files and htmls as specified by lucene
> >
> > can you please clarify my doubts
> >
> > 1.can lucene search through pdfs and word documents? if yes then how?
> >
> > 2.can lucene search through database ? if yes then how?
> >
> > thankyou
> >
> > santosh
> >
> > ---------------------------------------------------------------------
> > To unsubscribe, e-mail: [EMAIL PROTECTED]
> > For additional commands, e-mail: [EMAIL PROTECTED]