I think Nutch can crawl and search all 3 of these types. I'm not sure Nutch provides the kind of framework you seem to be looking for, but it may be worth a look: http://lucene.apache.org/nutch/
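If you end up rolling your own on top of Lucene, the usual shape is one text extractor per file type plus a small dispatcher that picks the extractor from the file extension. Below is a rough sketch of that idea only. The class name ExtractorRegistry and its methods are made up for illustration; they are not part of Lucene or Nutch. The XML handling is deliberately naive (a regex that drops tags), and the Word/Excel/PDF extractors are left out because they depend on whichever parser library you choose.

```java
import java.nio.charset.StandardCharsets;
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;
import java.util.function.Function;

/** Hypothetical sketch: routes a file to a text extractor chosen by its extension. */
public class ExtractorRegistry {
    // Maps a lower-case extension ("xml", "txt", ...) to a function
    // that turns the raw file bytes into plain text.
    private final Map<String, Function<byte[], String>> extractors = new HashMap<>();

    public void register(String extension, Function<byte[], String> extractor) {
        extractors.put(extension.toLowerCase(Locale.ROOT), extractor);
    }

    /** Returns the extracted text, or null when no extractor is registered for the file type. */
    public String extract(String fileName, byte[] content) {
        int dot = fileName.lastIndexOf('.');
        if (dot < 0) {
            return null; // no extension, so the type cannot be guessed
        }
        String ext = fileName.substring(dot + 1).toLowerCase(Locale.ROOT);
        Function<byte[], String> extractor = extractors.get(ext);
        return extractor == null ? null : extractor.apply(content);
    }

    /** Naive XML handling for illustration only: drops tags and collapses whitespace. */
    public static String stripXmlTags(byte[] content) {
        String xml = new String(content, StandardCharsets.UTF_8);
        return xml.replaceAll("<[^>]*>", " ").replaceAll("\\s+", " ").trim();
    }

    public static void main(String[] args) {
        ExtractorRegistry registry = new ExtractorRegistry();
        registry.register("xml", ExtractorRegistry::stripXmlTags);
        registry.register("txt", bytes -> new String(bytes, StandardCharsets.UTF_8));
        // Word, Excel and PDF extractors would be registered the same way,
        // each backed by a parser library of your choice (not shown here).

        byte[] sample = "<doc><title>Hello</title><body>Lucene</body></doc>"
                .getBytes(StandardCharsets.UTF_8);
        System.out.println(registry.extract("sample.xml", sample)); // prints "Hello Lucene"
    }
}
```

The string returned by extract() is what you would then put into a field of a Lucene document and hand to the index writer; that part is omitted here to keep the sketch free of dependencies.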
"James liu" <[EMAIL PROTECTED]> wrote on 05/09/2006 23:10:16:

> i wanna find frame which can index xml,word,excel,pdf,,,not one.
>
> i just wanna know who know the frame like what i wanna.
>
>
> 2006/9/6, yueyu lin <[EMAIL PROTECTED]>:
> >
> > First, Lucene is just a index toolkit, you have to USE it to implement
> > your application.
> >
> > If you want to index something, you must have knowledge how to extract
> > information from them and what kind of keys they need to be set.
> >
> > Then you can do what you want to.
> >
> > On 9/5/06, James liu <[EMAIL PROTECTED]> wrote:
> > >
> > > i wanna find frame which can index xml,word,excel,pdf,,,not one.
> > >
> > >
> > > 2006/9/6, Doron Cohen <[EMAIL PROTECTED]>:
> > > >
> > > > Lucene FAQ - http://wiki.apache.org/jakarta-lucene/LuceneFAQ - has a
> > > > few entries just for this:
> > > >
> > > > How can I index HTML documents?
> > > > How can I index XML documents?
> > > > How can I index OpenOffice.org files?
> > > > How can I index MS-Word documents?
> > > > How can I index MS-Excel documents?
> > > > How can I index MS-Powerpoint documents?
> > > > How can I index Email (from MS-Exchange or another IMAP server) ?
> > > > How can I index RTF documents?
> > > > How can I index PDF documents?
> > > > How can I index JSP files?
> > > >
> > > >
> > > > "James liu" <[EMAIL PROTECTED]> wrote on 05/09/2006 19:14:24:
> > > >
> > > > > i find lius many question ,,,,so i wanna give up and find new.
> > > > >
> > > > > who recommend ?