Re: IBM OmniFind Yahoo! Edition

Andreas Neumann Wed, 13 Dec 2006 21:48:29 -0800

Thanks for the congratulations, Doug!

The credit for the Lucene side of the work really goes to Michael, and
to the entire Lucene group - this community sometimes came up with
patches faster than we could ask for them.


To answer your question: How is Lucene used in this product?
- Needless to say, we use Lucene to index and search documents.
- The documents are gathered by web and file system crawlers that we
 took from OmniFind Enterprise Edition, improved, and adapted to the
 small footprint of Yahoo! Edition.
- For analysis, we use IBM's LanguageWare text analytics packaged into
 the UIMA framework - no "vanilla" Lucene analyzers used. This part
 was a little tricky because UIMA's document processing model (analyze
 the entire document at once) differs from Lucene's, which analyzes
 each field separately.
- For search, we extended QueryParser for LanguageWare-specific handling
 of base forms, stopwords, and synonyms. Oh, and we tuned the scoring a
 little.
- A lot of the work actually went into the infrastructure that puts it
 all together - configuration, administration, APIs, etc.
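
To illustrate the analysis mismatch mentioned above, here is a minimal
sketch of one way to bridge a whole-document analyzer and a per-field
consumer. It is not the product's code: every name in it is hypothetical,
the "analyzer" is a trivial stand-in, and no UIMA, LanguageWare, or Lucene
API is used. The idea is to concatenate the fields, analyze the document
once, and then hand each field the tokens that fall inside its offsets.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;

// Hypothetical sketch; none of these names come from UIMA or Lucene.
class WholeDocumentAnalysisSketch {

    // A token plus the offset where it starts in the concatenated text.
    static final class Token {
        final String text;
        final int start;
        Token(String text, int start) { this.text = text; this.start = start; }
    }

    // Stand-in for a whole-document analyzer: lowercases the text and
    // splits it on anything that is not a letter or digit.
    static List<Token> analyzeWholeDocument(String text) {
        List<Token> tokens = new ArrayList<>();
        int i = 0;
        while (i < text.length()) {
            if (!Character.isLetterOrDigit(text.charAt(i))) { i++; continue; }
            int start = i;
            while (i < text.length() && Character.isLetterOrDigit(text.charAt(i))) i++;
            tokens.add(new Token(text.substring(start, i).toLowerCase(Locale.ROOT), start));
        }
        return tokens;
    }

    // Analyze all fields in one pass, then route tokens back to their
    // field by comparing token offsets with each field's [start, end) range.
    static Map<String, List<String>> tokensPerField(Map<String, String> fields) {
        StringBuilder doc = new StringBuilder();
        Map<String, int[]> ranges = new LinkedHashMap<>();
        for (Map.Entry<String, String> e : fields.entrySet()) {
            int start = doc.length();
            doc.append(e.getValue());
            ranges.put(e.getKey(), new int[] {start, doc.length()});
            doc.append('\n'); // separator so no token spans two fields
        }
        List<Token> tokens = analyzeWholeDocument(doc.toString());
        Map<String, List<String>> result = new LinkedHashMap<>();
        for (Map.Entry<String, int[]> r : ranges.entrySet()) {
            List<String> fieldTokens = new ArrayList<>();
            for (Token t : tokens) {
                if (t.start >= r.getValue()[0] && t.start < r.getValue()[1]) {
                    fieldTokens.add(t.text);
                }
            }
            result.put(r.getKey(), fieldTokens);
        }
        return result;
    }

    public static void main(String[] args) {
        Map<String, String> fields = new LinkedHashMap<>();
        fields.put("title", "OmniFind Yahoo! Edition");
        fields.put("body", "Built on Lucene.");
        System.out.println(tokensPerField(fields));
    }
}
```

In a real integration the per-field token lists would be fed to the
indexer through whatever token-stream interface it expects; that wiring is
left out here on purpose.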

All in all, it was a thrill to work with Lucene; it made a lot of things
a whole lot easier.

- Andreas.

