Hi,

OpenOffice documents are getting indexed, but when I search for words from those documents I am not seeing the correct results.

regards,
ganesh

Uwe Schindler wrote:
>
> For converting full text to plain text for indexing, look at Apache Tika,
> which has a converter for OpenDocument: http://lucene.apache.org/tika/
>
> This mailing list is *about* the development of Lucene, not about questions
> on *how* to develop your own code that uses Lucene.
>
> -----
> Uwe Schindler
> H.-H.-Meier-Allee 63, D-28213 Bremen
> http://www.thetaphi.de
> eMail: [EMAIL PROTECTED]
>
>> -----Original Message-----
>> From: ganesh H D [mailto:[EMAIL PROTECTED]
>> Sent: Friday, November 21, 2008 1:50 PM
>> To: [email protected]
>> Subject: Indexing Open office documents
>>
>> Hi,
>>
>> I have been working with Apache Lucene for the past 3 days. I tried to
>> deploy the sample application that ships with the Lucene distribution, and
>> it works fine. It indexes all types of files, such as .pdf, .xml, .java,
>> and .txt, and it also indexes OpenOffice documents. But when I search for
>> words from the OpenOffice documents, it does not show the correct results.
>> Later I learned that OpenOffice documents are ZIP archives that contain XML
>> files: we need to uncompress the file using Java's ZIP support, then parse
>> meta.xml to get the title etc. and content.xml to get the document's
>> content. But I couldn't find much information about this. Please help me
>> solve this issue.
>>
>> regards,
>> ganesh
>>
>> --
>> View this message in context: http://www.nabble.com/Indexing-Open-office-documents-tp20620421p20620421.html
>> Sent from the Lucene - Java Developer mailing list archive at Nabble.com.
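
For reference, the manual approach described in the quoted message (unzip the file, then read meta.xml for the title and content.xml for the body) could look roughly like the sketch below. It uses only the JDK (Java 9+ for readAllBytes). The class name OdfTextExtractor and the map keys "title" and "content" are made up for illustration and are not part of Lucene or Tika. Indexing the raw file instead feeds compressed ZIP bytes to the analyzer, which is likely why the words are not found.

```java
import java.io.ByteArrayInputStream;
import java.io.InputStream;
import java.util.HashMap;
import java.util.Map;
import java.util.zip.ZipEntry;
import java.util.zip.ZipInputStream;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;

/** Hypothetical helper (not a Lucene or Tika class): extracts plain text from an OpenDocument file. */
final class OdfTextExtractor {
    private static final String TEXT_NS = "urn:oasis:names:tc:opendocument:xmlns:text:1.0";
    private static final String DC_NS = "http://purl.org/dc/elements/1.1/";

    /** Reads meta.xml and content.xml from the ZIP stream; returns the keys "title" and "content" when found. */
    static Map<String, String> extract(InputStream odf) throws Exception {
        Map<String, String> fields = new HashMap<>();
        try (ZipInputStream zip = new ZipInputStream(odf)) {
            ZipEntry entry;
            while ((entry = zip.getNextEntry()) != null) {
                if (entry.getName().equals("meta.xml")) {
                    // readAllBytes() stops at the end of the current ZIP entry.
                    Document doc = parse(zip.readAllBytes());
                    NodeList titles = doc.getElementsByTagNameNS(DC_NS, "title");
                    if (titles.getLength() > 0) {
                        fields.put("title", titles.item(0).getTextContent().trim());
                    }
                } else if (entry.getName().equals("content.xml")) {
                    StringBuilder text = new StringBuilder();
                    collectText(parse(zip.readAllBytes()), text);
                    fields.put("content", text.toString().trim());
                }
            }
        }
        return fields;
    }

    /** Parses from a byte array so the XML parser cannot close the underlying ZIP stream. */
    private static Document parse(byte[] xml) throws Exception {
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        factory.setNamespaceAware(true);
        // Reject DOCTYPE declarations; ODF XML does not need them.
        factory.setFeature("http://apache.org/xml/features/disallow-doctype-decl", true);
        return factory.newDocumentBuilder().parse(new ByteArrayInputStream(xml));
    }

    /** Appends all character data under node, ending each text:p and text:h with a newline. */
    private static void collectText(Node node, StringBuilder out) {
        for (Node child = node.getFirstChild(); child != null; child = child.getNextSibling()) {
            short type = child.getNodeType();
            if (type == Node.TEXT_NODE || type == Node.CDATA_SECTION_NODE) {
                out.append(child.getNodeValue());
            } else if (type == Node.ELEMENT_NODE) {
                boolean inTextNs = TEXT_NS.equals(child.getNamespaceURI());
                String local = child.getLocalName();
                // text:s, text:tab and text:line-break stand for whitespace.
                if (inTextNs && (local.equals("s") || local.equals("tab") || local.equals("line-break"))) {
                    out.append(' ');
                }
                collectText(child, out);
                if (inTextNs && (local.equals("p") || local.equals("h"))) {
                    out.append('\n');
                }
            }
        }
    }
}
```

Calling OdfTextExtractor.extract(new FileInputStream(file)) on an .odt file would give two strings that can then be added to the Lucene document as the title and contents fields in place of the raw file reader. Apache Tika, as suggested in the reply above, covers this and many other formats, so the sketch is mainly useful for seeing what happens underneath.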

--
View this message in context: http://www.nabble.com/Indexing-Open-office-documents-tp20620421p20658947.html
Sent from the Lucene - Java Developer mailing list archive at Nabble.com.
