Re: MoreLikeThis for multiple documents

Grant Ingersoll Thu, 26 Jul 2007 08:23:39 -0700

I have some sample code for doing relevance feedback across multiple documents at http://www.cnlp.org/apachecon2005

It could be modified to provide more of the MoreLikeThis functionality (i.e. determining important terms via tf/idf); for now it just takes the top X terms.
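
For illustration only, here is a minimal sketch of what ranking terms by tf/idf over a set of documents could look like. It is not the sample code at the URL above and it does not use Lucene: it is plain Python, it assumes the documents are already tokenized, and the function name top_terms, its parameters, and the smoothed idf formula are all assumptions chosen for the example.

```python
import math
from collections import Counter


def top_terms(doc_set, corpus, k=10):
    """Rank the terms of doc_set by summed tf * idf (hypothetical helper).

    doc_set and corpus are lists of token lists. doc_set is the set of
    documents to characterize; corpus is the whole collection used for
    document frequencies (doc_set is normally a subset of it).
    """
    n_docs = len(corpus)

    # Document frequency: in how many corpus documents each term occurs.
    df = Counter()
    for tokens in corpus:
        df.update(set(tokens))

    # Term frequency, summed over every document in the set.
    tf = Counter()
    for tokens in doc_set:
        tf.update(tokens)

    # Smoothed idf (an assumption): log(N / (df + 1)) + 1, always positive.
    scores = {
        term: freq * (math.log(n_docs / (df[term] + 1)) + 1.0)
        for term, freq in tf.items()
    }

    # Highest score first; ties are broken alphabetically for stable output.
    return sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))[:k]


corpus = [
    ["lucene", "search", "index"],
    ["lucene", "index", "term"],
    ["cooking", "recipe", "search"],
    ["cooking", "pasta", "recipe"],
]
for term, score in top_terms(corpus[:2], corpus, k=3):
    print(term, round(score, 3))
```

With an index at hand, the token lists and document frequencies would presumably come from stored term vectors and the index's own statistics rather than from re-tokenizing each document.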


-Grant

On Jul 25, 2007, at 3:04 PM, Jens Grivolla wrote:

Hello,
I'm looking to extract significant terms characterizing a set of documents (which in turn relate to a topic).
This basically comes down to functionality similar to determining the terms with the greatest offer weight (as used for blind relevance feedback), or maximizing tf.idf (as is done in MoreLikeThis).
Is there anything like this already implemented, or do I need to iterate through all documents in the set "manually", re-tokenize each one (or maybe use TermVectors), and then calculate the weight for each term?
Thanks,
   Jens

---------------------------------------------------------------------
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]




