Matt,
Erik and I have some code for this in Lucene in Action, but David
Spencer did this since the book was published:
http://www.lucenebook.com/blog/announcements/more_like_this.html
Otis
Awesome awesome awesome! Thanks very much.
--
Matt Chaput
Word Monkey
Side Effects Software Inc.
"A goddamne
Otis Gospodnetic wrote:
> Matt,
> Erik and I have some code for this in Lucene in Action, but David
> Spencer did this since the book was published:
> http://www.lucenebook.com/blog/announcements/more_like_this.html
If you want an informal way of doing it, you're right: just feed the
words of the source
> My first, naive idea is to use the entire contents of one document as
> a query to the second document,
Sorry, I meant use the entire contents of one document as a query *on
the rest of the corpus*.
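
For illustration, here is a minimal sketch of that idea in plain Python. It
is not Lucene's API and not the MoreLikeThis code linked above, just a toy
in-memory version: the words of one document become a weighted query, and
every other document is scored against it by cosine similarity. The names
tokenize, tfidf_vectors, cosine, more_like_this and the max_query_terms
parameter are all made up for this example, and the tf-idf weighting is one
common choice among several.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase and split on non-letters; a stand-in for a real analyzer."""
    return re.findall(r"[a-z]+", text.lower())


def tfidf_vectors(docs):
    """Map each document to a {term: tf-idf weight} dictionary."""
    tokenized = [tokenize(d) for d in docs]
    n = len(docs)
    # Document frequency: the number of documents each term occurs in.
    df = Counter(term for tokens in tokenized for term in set(tokens))
    vectors = []
    for tokens in tokenized:
        tf = Counter(tokens)
        # Rare terms get a higher weight than terms found everywhere.
        vectors.append({t: c * (1.0 + math.log(n / df[t])) for t, c in tf.items()})
    return vectors


def cosine(a, b):
    """Cosine similarity between two sparse {term: weight} vectors."""
    dot = sum(w * b[t] for t, w in a.items() if t in b)
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def more_like_this(docs, source_index, max_query_terms=25):
    """Rank all other documents by similarity to docs[source_index].

    Only the highest-weighted terms of the source are used as the query
    (max_query_terms is a hypothetical knob for this sketch), so a long
    document does not turn into an enormous query.
    """
    vectors = tfidf_vectors(docs)
    source = vectors[source_index]
    top = sorted(source.items(), key=lambda kv: (-kv[1], kv[0]))[:max_query_terms]
    query = dict(top)
    scored = [(i, cosine(query, v)) for i, v in enumerate(vectors) if i != source_index]
    # Best match first; ties broken by document position.
    return sorted(scored, key=lambda pair: (-pair[1], pair[0]))
```

With three documents such as "lucene index search query", "search the
lucene index with a query" and "baking bread with yeast and flour",
more_like_this(docs, 0) ranks the second document first and gives the third
a score of 0, since it shares no terms with the source.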
--
Matt Chaput
Word Monkey
Side Effects Software Inc.
"A goddamned ray of sunshine all the goddamne
Is there a simple, efficient way to compute similarity of documents
indexed with Lucene?
My first, naive idea is to use the entire contents of one document as a
query to the second document, and use the score as a similarity
measurement. But I think I'm probably way off base with that.
Can any