Hi,

That's really not what is supposed to happen. With the "distinct" parameter you should get only unique entries in the n-best list. (Maybe fewer than 50 if n-best-factor is set to a low value, but there shouldn't be any duplicates.)
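If it helps with the investigation, here is a rough sketch for checking an n-best file for duplicates per sentence, roughly what "sort | uniq | wc -l" does but grouped by sentence id. It assumes the usual Moses n-best layout of "id ||| hypothesis ||| feature scores ||| total score" and compares surface strings only. The function name count_distinct is just something I made up for illustration; it is not part of Moses.

```python
from collections import defaultdict


def count_distinct(lines):
    """Return {sentence_id: (total, distinct)} for n-best lines.

    Assumption: each line looks like
        id ||| hypothesis ||| feature scores ||| total score
    Only the hypothesis surface string (second field) is compared.
    """
    totals = defaultdict(int)
    unique = defaultdict(set)
    for line in lines:
        fields = line.split("|||")
        if len(fields) < 2:
            # Skip blank or malformed lines instead of failing.
            continue
        sent_id = fields[0].strip()
        # Normalise whitespace so spacing differences don't count as distinct.
        hypothesis = " ".join(fields[1].split())
        totals[sent_id] += 1
        unique[sent_id].add(hypothesis)
    return {sid: (totals[sid], len(unique[sid])) for sid in totals}
```

You could call it as count_distinct(open("nbest.txt", encoding="utf-8")), where nbest.txt is a placeholder for your own n-best file. If "distinct" works as intended, the two numbers should be equal for every sentence.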
I cannot find any reason why the "distinct" parameter wouldn't do what it's supposed to do, but maybe I'm missing something. The relevant method should be Manager::CalcNBest() (in moses/Manager.cpp). As far as I can tell, there have been no recent modifications to it in Moses master. Please try to investigate what's going on (if you have the time).

Also note that n-best-factor takes effect only if distinct is active. There's no point in setting it if distinct is inactive or malfunctioning. It would potentially help you to fill up your n-best list if you got fewer than n (=50) entries with the distinct parameter.

Cheers,
Matthias

On Tue, 2015-02-24 at 21:08 +0200, Erinç Dikici wrote:
> (Apparently the Gmane web interface turned my reply into garbled text,
> sorry for the double posting)
>
> Thanks again for your quick answers.
>
> Yes, 32 and 2 are the counts after "sort | uniq | wc -l". The total
> number of hypotheses returned for both cases was 50.
>
> I removed the "distinct"s from (my local copy of)
> scripts/training/mert-moses.pl (lines 1261 and 1263), and that solved
> the problem! Now I can get 32 unique hypotheses with v3.0, too.
>
> In fact, I am pretty sure I was able to get 50 unique hypotheses (out
> of a 50-best list) with the same configuration back in version 0.x. I
> hope the new -n-best-factor will do the trick.
>
> Best,
>
> ED
>
> _______________________________________________
> Moses-support mailing list
> Moses-support@mit.edu
> http://mailman.mit.edu/mailman/listinfo/moses-support

--
The University of Edinburgh is a charitable body, registered in
Scotland, with registration number SC005336.