Indexing the latest MS Office documents

2010-01-03 Thread Roland Villemoes
Hi all, Does anyone know how to index the latest MS Office documents, such as .docx and .xlsx? From searching, it seems that Tika only supports the earlier formats, .doc and .xls. Best regards, Roland Villemoes Tel: (+45) 22 69 59 62 E-Mail: r...@alpha-solutions.dk

Re: Indexing the latest MS Office documents

2010-01-03 Thread Mattmann, Chris A (388J)
Hi Roland, You probably want to send your email to tika-u...@lucene.apache.org. Best of luck! Cheers, Chris

Re: Indexing the latest MS Office documents

2010-01-04 Thread Peter Wolanin
You must have been searching old documentation. I think Tika 0.3+ supports the new MS formats, but don't take my word for it: why not build Tika and try it? -Peter

Re: Indexing the latest MS Office documents

2010-01-05 Thread Jay Hill
The version of Tika in the 1.4 release definitely parses the most current Office formats (.docx, .pptx, etc.) and they index as expected. -Jay