Erik Hatcher <[EMAIL PROTECTED]> writes:

> By all means, if you have other suggestions for our site, let us know
> at [EMAIL PROTECTED]

One thing I would like to see, which isn't on the Lucene site, in the documentation, or in "Lucene in Action", is a complete description of how the retrieval algorithm works. That is, how the HitCollector, Scorers, Similarity, etc. all fit together.

I'm involved in a project that is, to some degree, looking at poking deeply into this part of the Lucene code. We have a nice (non-Lucene) framework for working with a wider range of similarity functions (beyond tf-idf), which should also be extensible to include query expansion, relevance feedback, and the like. I used to think that integrating it would be as simple as hacking on Similarity, but I'm beginning to think it might need broader changes.

I could obviously hook in our whole retrieval setup by just grabbing an IndexReader and doing it all by hand, but then I would have to redo the incremental search and possibly the rich query structure, which would be a loss.

So anyway, I got LIA hoping for a good explanation (not a good Explanation) of this bit, but it wasn't there. There are some hints on the Lucene site, but nothing complete. If I muddle it out before anything gets contributed, I'll try to write something up, but don't expect anything too soon...

Ian