Re: Filters for Openoffice File Indexing available (Java)

2004-11-10 Thread Joachim Arrasz
Hi Daniel,
I don't know of any existing solutions, but it's not so difficult to write 
one: Extract the ZIP file using Java's built-in ZIP classes and parse 
content.xml and meta.xml. I'm not sure if whitespace issues might become 
tricky, e.g. two paragraphs could be in the file as 
<p>one</p><p>two</p>, but for indexing a whitespace needs to be inserted 
between them (<p> was just an example, I don't know what OpenOffice.org 
actually uses).
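
A minimal sketch of the approach described above, using only classes from the JDK. The class name OdfTextExtractor and its method names are made up for illustration and are not part of any existing library. It assumes the document is a ZIP archive containing content.xml, and it handles the whitespace issue by appending a space after every closing tag. That is a simplification: formatting elements in the middle of a word would split the word, so a real filter would probably add whitespace only after paragraph-like elements.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.StringReader;
import java.nio.charset.StandardCharsets;
import java.util.zip.ZipEntry;
import java.util.zip.ZipFile;
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical helper class, written for this example only.
class OdfTextExtractor {

    /** Collects character data and appends a space after every closing tag. */
    static class TextHandler extends DefaultHandler {
        final StringBuilder text = new StringBuilder();

        @Override
        public void characters(char[] ch, int start, int length) {
            text.append(ch, start, length);
        }

        @Override
        public void endElement(String uri, String localName, String qName) {
            // Without this, <p>one</p><p>two</p> would be indexed as "onetwo".
            text.append(' ');
        }

        @Override
        public InputSource resolveEntity(String publicId, String systemId) {
            // Assumption: the XML may declare an external DTD that is not
            // available locally, so every external entity resolves to nothing.
            return new InputSource(new StringReader(""));
        }
    }

    /** Parses an XML stream and returns its text with whitespace normalized. */
    static String extractText(InputStream xml)
            throws IOException, SAXException, ParserConfigurationException {
        TextHandler handler = new TextHandler();
        SAXParserFactory.newInstance().newSAXParser().parse(xml, handler);
        return handler.text.toString().replaceAll("\\s+", " ").trim();
    }

    /** Opens the document as a ZIP file and extracts the text of one entry. */
    static String extractEntry(String path, String entryName)
            throws IOException, SAXException, ParserConfigurationException {
        try (ZipFile zip = new ZipFile(path)) {
            ZipEntry entry = zip.getEntry(entryName);
            if (entry == null) {
                return ""; // entry not present in this archive
            }
            try (InputStream in = zip.getInputStream(entry)) {
                return extractText(in);
            }
        }
    }

    public static void main(String[] args) throws Exception {
        if (args.length > 0) {
            // Usage: java OdfTextExtractor document.sxw
            System.out.println(extractEntry(args[0], "content.xml"));
        } else {
            // Small in-memory demonstration of the whitespace handling.
            String xml = "<doc><p>one</p><p>two</p></doc>";
            InputStream in =
                new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8));
            System.out.println(extractText(in)); // prints "one two"
        }
    }
}
```

The returned string could then be added to a Lucene document as the body field. Calling extractEntry(path, "meta.xml") in the same way should yield the document metadata (such as title and keywords), which may be worth indexing as well.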
 

That doesn't seem so hard, but I have never developed something like 
that, so I think I need a tutorial for doing this. Why should I parse 
meta.xml? I thought content.xml should be enough.

Thanks a lot
Bye Achim
-
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]


Filters for Openoffice File Indexing available (Java)

2004-11-08 Thread Joachim Arrasz
Hello list,
we have written an application that integrates OpenOffice into an 
open-source CMS (OpenCms).

For this CMS there is a Lucene integration available on SourceForge.
So now we are looking for search and index filters for Lucene that 
are able to include our OpenOffice files in the search results.

Is there any project or code available for doing this, or must we write 
everything ourselves? Does anybody know good beginner tutorials for doing 
things like this?

Best Regards
Joachim Arrasz