Gascoigne Thomas wrote:

> I need to turn MS Word documents into plain text so that they can be
> indexed for searching purposes. Would the Open Office UNO API provide
> a relatively straightforward and painless way to do this? I basically
> need to read Word docs in and get a Java String representation of the
> doc out. Any advice greatly appreciated, Thomas

You can use the OOo API to load the Word document into an OOo text
document. The API of this document lets you iterate over all of its
text paragraphs, and each paragraph can be asked to return its whole
text content as a single string.

Other text content inside the document (e.g. text in shapes) can be
accessed as well.
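
A minimal sketch of the paragraph walk described above, in Java. It
assumes the OOo Java UNO jars are on the classpath and that a local
office installation can be started with Bootstrap.bootstrap(). The
class name WordToText and the helpers toFileUrl and extractText are
made up for illustration and are not part of the OOo API. Only body
paragraphs are collected; tables and text in shapes are skipped here.

```java
import com.sun.star.beans.PropertyValue;
import com.sun.star.comp.helper.Bootstrap;
import com.sun.star.container.XEnumeration;
import com.sun.star.container.XEnumerationAccess;
import com.sun.star.frame.XComponentLoader;
import com.sun.star.lang.XComponent;
import com.sun.star.lang.XMultiComponentFactory;
import com.sun.star.lang.XServiceInfo;
import com.sun.star.text.XTextDocument;
import com.sun.star.text.XTextRange;
import com.sun.star.uno.UnoRuntime;
import com.sun.star.uno.XComponentContext;
import java.io.File;

public class WordToText {

    // Hypothetical helper: turn a local path into a file:/// URL for the loader.
    static String toFileUrl(File file) {
        String uri = file.getAbsoluteFile().toURI().toString();
        return uri.startsWith("file:///") ? uri : uri.replaceFirst("^file:/", "file:///");
    }

    // Hypothetical helper: load the document hidden and join its paragraph strings.
    static String extractText(XComponentLoader loader, String url) throws Exception {
        PropertyValue hidden = new PropertyValue();
        hidden.Name = "Hidden";
        hidden.Value = Boolean.TRUE;

        XComponent component =
            loader.loadComponentFromURL(url, "_blank", 0, new PropertyValue[] { hidden });
        if (component == null) {
            throw new java.io.IOException("Could not load " + url);
        }
        try {
            // The casts are redundant with newer UNO jars but needed with older ones.
            XTextDocument doc =
                (XTextDocument) UnoRuntime.queryInterface(XTextDocument.class, component);
            if (doc == null) {
                throw new java.io.IOException("Not a text document: " + url);
            }
            XEnumerationAccess access = (XEnumerationAccess)
                UnoRuntime.queryInterface(XEnumerationAccess.class, doc.getText());
            XEnumeration elements = access.createEnumeration();

            StringBuilder text = new StringBuilder();
            while (elements.hasMoreElements()) {
                Object element = elements.nextElement();
                XServiceInfo info =
                    (XServiceInfo) UnoRuntime.queryInterface(XServiceInfo.class, element);
                // The enumeration also yields tables; keep paragraphs only.
                if (info != null && info.supportsService("com.sun.star.text.Paragraph")) {
                    XTextRange range =
                        (XTextRange) UnoRuntime.queryInterface(XTextRange.class, element);
                    text.append(range.getString()).append('\n');
                }
            }
            return text.toString();
        } finally {
            component.dispose();
        }
    }

    public static void main(String[] args) throws Exception {
        // Starts (or connects to) a local office instance.
        XComponentContext context = Bootstrap.bootstrap();
        XMultiComponentFactory factory = context.getServiceManager();
        Object desktop =
            factory.createInstanceWithContext("com.sun.star.frame.Desktop", context);
        XComponentLoader loader =
            (XComponentLoader) UnoRuntime.queryInterface(XComponentLoader.class, desktop);

        System.out.println(extractText(loader, toFileUrl(new File(args[0]))));
        // Exit explicitly in case the UNO bridge keeps the JVM alive.
        System.exit(0);
    }
}
```

Run it with the path of a Word file as the only argument. For an
indexer you would call extractText once per document and reuse the
same loader instead of bootstrapping the office each time.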

Best regards,
Mathias

-- 
Mathias Bauer - OpenOffice.org Application Framework Project Lead
Please reply to the list only, [EMAIL PROTECTED] is a spam sink.
