
Re: More Like This similarity tuning

2015-02-04 Thread Ali Nazemian

Dear Markus,

Would you please explain more about the maxqt parameter and the methodology for choosing the best number of terms for this value?

Best regards.

On Wed, Feb 4, 2015 at 2:46 PM, Markus Jelsma wrote:
> Well, maxqt is easy, it is just the number of terms that compose your
> query. MinTF is a s

RE: More Like This similarity tuning

2015-02-04 Thread Markus Jelsma

Well, maxqt is easy: it is just the number of terms that compose your query. MinTF is a strange parameter: rare terms have a low DF and usually do not have a high TF, so I would keep it at 1. MinDF is more useful; it depends entirely on the size of your corpus. If you have a lot of user-generated
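
For illustration only, here is a minimal Python sketch of how the three parameters discussed above might be passed to Solr's MoreLikeThis component as the request parameters mlt.maxqt, mlt.mintf and mlt.mindf. The helper name build_mlt_params, the document id, the field names, and the values chosen for maxqt and mindf are all assumptions made up for the example, not recommendations from this thread; only mintf=1 follows the advice given above.

```python
from urllib.parse import urlencode


def build_mlt_params(doc_id, fields, maxqt=25, mintf=1, mindf=5):
    """Build request parameters for a Solr MoreLikeThis query.

    build_mlt_params is a hypothetical helper for this example.
    maxqt and mindf are placeholder values; tune them for your corpus.
    """
    return {
        "q": f"id:{doc_id}",          # source document to find neighbours for
        "mlt.fl": ",".join(fields),   # fields used to extract interesting terms
        "mlt.maxqt": maxqt,           # max number of terms in the generated query
        "mlt.mintf": mintf,           # min term frequency in the source document
        "mlt.mindf": mindf,           # min number of documents containing the term
        "wt": "json",
    }


# Hypothetical document id and field names.
params = build_mlt_params("doc42", ["title", "content"])
query_string = urlencode(params)
print(query_string)
```

The resulting query string would be appended to whatever MoreLikeThis handler URL your Solr installation exposes; that URL is deliberately left out here because it depends on your configuration.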