[Sorry for the garbled message before.]

You are right. The idea is pretty obvious. It roughly corresponds to 'histogram pruning' in this paper:

Zens, R., Stanton, D., Xu, P. (2012). A Systematic Comparison of Phrase Table Pruning Techniques. In Proceedings of the 2012 Joint Conference on Empirical Methods in Natural Language Processing and Computational Natural Language Learning (EMNLP-CoNLL), pp. 972-983.

The idea had been described in the literature before that (for instance, Johnson et al. (2007) only use the top 30 phrase pairs per source phrase), and may have been used in practice for even longer.

If you read the paper above, you will find that histogram pruning does not improve translation quality on a state-of-the-art SMT system, and that it performs poorly compared to more advanced pruning techniques.

On 19.06.2015 17:49, Read, James C. wrote:
> So, all I did was filter out the less likely phrase pairs and the BLEU score
> shot up. Was that such a stroke of genius? Was that not blindingly obvious?
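
For anyone skimming the thread, the top-N filtering discussed above (keeping only the N best phrase pairs per source phrase) can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the procedure from either paper: the function name `histogram_prune`, the in-memory `(source, target, score)` tuples, and the choice of a single score to rank by are all hypothetical. A real phrase table would need to be parsed first, and which score to rank on is a separate decision.

```python
import heapq
from collections import defaultdict


def histogram_prune(phrase_pairs, n=30):
    """Keep the n highest-scoring target phrases for each source phrase.

    phrase_pairs: iterable of (source, target, score) tuples, where a
    higher score means a more likely translation (assumed ranking score).
    """
    # Group all candidate translations by their source phrase.
    by_source = defaultdict(list)
    for source, target, score in phrase_pairs:
        by_source[source].append((target, score))

    # For each source phrase, keep only the n best candidates.
    pruned = []
    for source, candidates in by_source.items():
        best = heapq.nlargest(n, candidates, key=lambda c: c[1])
        for target, score in best:
            pruned.append((source, target, score))
    return pruned


if __name__ == "__main__":
    # Toy data, made up purely for illustration.
    pairs = [
        ("das haus", "the house", 0.6),
        ("das haus", "the home", 0.3),
        ("das haus", "house", 0.1),
        ("ja", "yes", 0.9),
    ]
    for entry in histogram_prune(pairs, n=2):
        print(entry)
```

With `n=2`, the toy example drops the lowest-scoring translation of "das haus" and leaves "ja" untouched, since it has only one candidate.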