Hi Rico, while you are at it, could you give some pointers to the more advanced pruning techniques that do perform better, please? :)

On 19.06.2015 19:25, Rico Sennrich wrote:
> [sorry for the garbled message before]
>
> you are right. The idea is pretty obvious. It roughly corresponds to
> 'Histogram pruning' in this paper:
>
> Zens, R., Stanton, D., Xu, P. (2012). A Systematic Comparison of Phrase
> Table Pruning Techniques. In Proceedings of the 2012 Joint Conference on
> Empirical Methods in Natural Language Processing and Computational
> Natural Language Learning (EMNLP-CoNLL), pp. 972-983.
>
> The idea has been described in the literature before that (for instance,
> Johnson et al. (2007) only use the top 30 phrase pairs per source
> phrase), and may have been used in practice for even longer. If you read
> the paper above, you will find that histogram pruning does not improve
> translation quality on a state-of-the-art SMT system, and performs
> poorly compared to more advanced pruning techniques.
>
> On 19.06.2015 17:49, Read, James C. wrote:
>> So, all I did was filter out the less likely phrase pairs and the BLEU score
>> shot up. Was that such a stroke of genius? Was that not blindingly obvious?
>>
>>
> _______________________________________________
> Moses-support mailing list
> Moses-support@mit.edu
> http://mailman.mit.edu/mailman/listinfo/moses-support
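
For readers following the thread, the histogram pruning mentioned in the quoted message (keeping only the top N phrase pairs per source phrase) can be sketched roughly as below. This is only an illustration, not the Moses implementation: the function name `histogram_prune`, the tuple layout, and the toy entries are all made up here, and ranking by a single score per phrase pair is an assumption, since the thread does not say which score is used.

```python
import heapq
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple

# Hypothetical in-memory layout: (source phrase, target phrase, score).
# Which score to rank by is an assumption; the thread does not specify it.
PhrasePair = Tuple[str, str, float]


def histogram_prune(pairs: Iterable[PhrasePair], limit: int = 30) -> List[PhrasePair]:
    """Keep at most `limit` highest-scoring phrase pairs per source phrase."""
    by_source: Dict[str, List[PhrasePair]] = defaultdict(list)
    for pair in pairs:
        by_source[pair[0]].append(pair)

    pruned: List[PhrasePair] = []
    # Source phrases are visited in order of first appearance.
    for candidates in by_source.values():
        # nlargest returns the selected pairs sorted by descending score.
        pruned.extend(heapq.nlargest(limit, candidates, key=lambda p: p[2]))
    return pruned


# Toy phrase table, invented purely for illustration.
table = [
    ("haus", "house", 0.7),
    ("haus", "home", 0.2),
    ("haus", "building", 0.1),
    ("katze", "cat", 0.9),
]
pruned = histogram_prune(table, limit=2)
```

With `limit=2`, the lowest-scoring entry for "haus" is dropped and everything else is kept. The default of 30 mirrors the Johnson et al. (2007) figure cited in the quoted message.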