On Wed, Mar 17, 2010 at 02:56:15PM +0100, Nick Wellnhofer wrote:
> How does Kinosearch compute the final term weights? I had a
> look at Search/Similarity.c and it seems to be Sim_tf * Sim_idf *
> Sim_length_norm.

The Lucene folks have put a good amount of effort into documenting the Lucene
scoring model, which KS currently implements.

    http://lucene.apache.org/java/3_0_1/api/core/org/apache/lucene/search/Similarity.html

As for where that stuff gets applied... it's kind of all over the place.

> On a side note, how can I interface with Kinosearch or Lucy directly on
> the C level? Is there any documentation yet?

There's not a C API yet. To get things up and running, the first thing we
would need to do is put all the C header files somewhere. That's kind of
tricky on Unixen because the CPAN module installation process doesn't normally
touch /usr/local/include, and on Windows, it's trickier still.

The easy way to handle things would be the same way we're going with
ProximityScorer, glomming onto the KinoSearch build directly. But we could
also try jamming stuff into /usr/local/include and worry about permissions
issues and portability later.

Marvin Humphrey
