Have a look at Aperture: http://aperture.sourceforge.net/
It provides components for crawling and for text and metadata extraction. It is still in the alpha stage, though the development code in CVS has improved considerably since the last official alpha release.

Chris
--

James liu wrote:
I want to find a framework that can index XML, Word, Excel, PDF, and so on, not just one format.


2006/9/6, Doron Cohen <[EMAIL PROTECTED]>:

Lucene FAQ - http://wiki.apache.org/jakarta-lucene/LuceneFAQ - has a few
entries just for this:

  How can I index HTML documents?
  How can I index XML documents?
  How can I index OpenOffice.org files?
  How can I index MS-Word documents?
  How can I index MS-Excel documents?
  How can I index MS-Powerpoint documents?
  How can I index Email (from MS-Exchange or another IMAP server) ?
  How can I index RTF documents?
  How can I index PDF documents?
  How can I index JSP files?


"James liu" <[EMAIL PROTECTED]> wrote on 05/09/2006 19:14:24:

> I have run into many problems with LIUS, so I want to give it up and find something new.
>
> Can anyone recommend something?


---------------------------------------------------------------------
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]

Kind regards,

Christiaan Fluit
--
Aduna - Guided Exploration
www.aduna-software.com

Prinses Julianaplein 14-b
3817 CS Amersfoort
The Netherlands
+31-33-4659987 (office)
