Indexing Open office documents

ganesh H D Fri, 21 Nov 2008 04:50:21 -0800

Hi,

I have been working with Apache Lucene for the past 3 days. I tried to deploy
the sample application that comes with the Lucene distribution, and it is
working absolutely fine. It indexes all types of files, such as .pdf, .xml,
.java and .txt, and it also indexes OpenOffice documents. But when I search
for words that appear in the OpenOffice documents, it does not return the
expected results. Later I came to know that OpenOffice documents are ZIP
archives that contain XML files: we need to uncompress the file using Java's
ZIP support, then parse meta.xml to get the title etc. and content.xml to get
the document's content. But I couldn't find much information on how to do
this. Please help me solve this issue.
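
The approach described above (open the document as a ZIP archive, then parse
meta.xml or content.xml) could be sketched roughly as follows. This is only a
minimal sketch using the JDK's own ZIP and DOM classes; the class name
OdfTextExtractor and the method extractEntryText are hypothetical names chosen
for illustration and are not part of Lucene or OpenOffice. It assumes the
entry is well-formed XML and simply collects all text nodes, ignoring the
document structure.

```java
import java.io.File;
import java.io.IOException;
import java.io.InputStream;
import java.util.zip.ZipEntry;
import java.util.zip.ZipFile;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;

// Hypothetical helper: pulls plain text out of one XML entry of an ODF file.
public class OdfTextExtractor {

    // Returns the text found in the given entry, e.g. "content.xml" or "meta.xml".
    public static String extractEntryText(File odfFile, String entryName) throws Exception {
        try (ZipFile zip = new ZipFile(odfFile)) {
            ZipEntry entry = zip.getEntry(entryName);
            if (entry == null) {
                throw new IOException("No " + entryName + " in " + odfFile);
            }
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true);
            DocumentBuilder builder = factory.newDocumentBuilder();
            try (InputStream in = zip.getInputStream(entry)) {
                Document doc = builder.parse(in);
                StringBuilder text = new StringBuilder();
                collectText(doc.getDocumentElement(), text);
                return text.toString().trim();
            }
        }
    }

    // Walks the DOM tree and appends every non-blank text node, separated by spaces.
    // Note: this also puts a space between adjacent inline spans of the same word.
    private static void collectText(Node node, StringBuilder out) {
        short type = node.getNodeType();
        if (type == Node.TEXT_NODE || type == Node.CDATA_SECTION_NODE) {
            String value = node.getNodeValue().trim();
            if (!value.isEmpty()) {
                out.append(value).append(' ');
            }
            return;
        }
        for (Node child = node.getFirstChild(); child != null; child = child.getNextSibling()) {
            collectText(child, out);
        }
    }

    public static void main(String[] args) throws Exception {
        if (args.length < 1) {
            System.out.println("usage: java OdfTextExtractor <file.odt>");
            return;
        }
        File file = new File(args[0]);
        System.out.println(extractEntryText(file, "content.xml"));
    }
}
```

The returned string could then be added to the Lucene document as the
searchable contents field in place of the raw file bytes; the exact Field
constructor to use depends on the Lucene version, so that part is left out
here.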


regards,
ganesh

-- 
View this message in context: 
http://www.nabble.com/Indexing-Open-office-documents-tp20620421p20620421.html
Sent from the Lucene - Java Developer mailing list archive at Nabble.com.


