Matt,

Erik and I have some code for this in Lucene in Action, but David Spencer has done this since the book was published:
http://www.lucenebook.com/blog/announcements/more_like_this.html

Otis

--- Matt Chaput <[EMAIL PROTECTED]> wrote:

> Is there a simple, efficient way to compute similarity of documents
> indexed with Lucene?
>
> My first, naive idea is to use the entire contents of one document as a
> query to the second document, and use the score as a similarity
> measurement. But I think I'm probably way off base with that.
>
> Can any IR pros set me straight? Thanks very much.
>
> Matt
>
> --
> Matt Chaput
> Word Monkey
> Side Effects Software Inc.
>
> "A goddamned ray of sunshine all the goddamned time"
> -- Sparkle Hayter
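
For the similarity question quoted above, here is a minimal sketch of one common way to compare two texts: treat each as a vector of term counts and take the cosine of the angle between them. It is plain Java and does not use the Lucene API. The class TermVectorSimilarity and its methods are hypothetical names made up for illustration, and the tokenizer is a crude stand-in for a real analyzer. It is not the code from the book or from the link above.

```java
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;

// Hypothetical helper, not part of Lucene: compares two texts by the cosine
// of the angle between their raw term-frequency vectors.
class TermVectorSimilarity {

    // Lower-cases the text, splits on anything that is not a-z or 0-9,
    // and counts how often each term occurs.
    static Map<String, Integer> termFrequencies(String text) {
        Map<String, Integer> counts = new HashMap<>();
        for (String token : text.toLowerCase(Locale.ROOT).split("[^a-z0-9]+")) {
            if (!token.isEmpty()) {
                counts.merge(token, 1, Integer::sum);
            }
        }
        return counts;
    }

    // Euclidean length of a term-frequency vector.
    static double norm(Map<String, Integer> vector) {
        double sumOfSquares = 0.0;
        for (int count : vector.values()) {
            sumOfSquares += (double) count * count;
        }
        return Math.sqrt(sumOfSquares);
    }

    // Cosine similarity: dot product divided by the product of the lengths.
    // Returns 0.0 when either vector is empty.
    static double cosine(Map<String, Integer> a, Map<String, Integer> b) {
        double dot = 0.0;
        for (Map.Entry<String, Integer> entry : a.entrySet()) {
            Integer other = b.get(entry.getKey());
            if (other != null) {
                dot += (double) entry.getValue() * other;
            }
        }
        double lengths = norm(a) * norm(b);
        return lengths == 0.0 ? 0.0 : dot / lengths;
    }

    // Convenience wrapper: tokenize both texts and compare them.
    static double similarity(String first, String second) {
        return cosine(termFrequencies(first), termFrequencies(second));
    }
}
```

Calling TermVectorSimilarity.similarity(textA, textB) gives a value between 0.0 (no shared terms) and 1.0 (identical term distributions). Raw counts ignore how rare a term is across the index, so treat this as a starting point for experiments only.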