[sorry for the garbled message before]

You are right. The idea is pretty obvious. It roughly corresponds to 
'histogram pruning' in this paper:

Zens, R., Stanton, D., Xu, P. (2012). A Systematic Comparison of Phrase 
Table Pruning Techniques. In Proceedings of the 2012 Joint Conference on 
Empirical Methods in Natural Language Processing and Computational 
Natural Language Learning (EMNLP-CoNLL), pp. 972-983.

The idea has been described in the literature before that (for instance, 
Johnson et al. (2007) only use the top 30 phrase pairs per source 
phrase), and may have been used in practice for even longer. If you read 
the paper above, you will find that histogram pruning does not improve 
translation quality on a state-of-the-art SMT system, and performs 
poorly compared to more advanced pruning techniques.
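
To make the idea concrete, here is a minimal sketch of histogram pruning 
in Python: keep only the n best-scoring phrase pairs for each source 
phrase. It is an illustration only, not how Moses or the paper implements 
it. The function name histogram_prune, the (source, target, score) 
triples, and the use of a single score per phrase pair are assumptions 
made for this example; the default of 30 follows the Johnson et al. 
(2007) setting mentioned above.

```python
from collections import defaultdict
import heapq


def histogram_prune(phrase_pairs, n=30):
    """Keep the n highest-scoring target phrases per source phrase.

    phrase_pairs: iterable of (source, target, score) triples, where a
    higher score means a more likely pair (hypothetical format; a real
    phrase table carries several scores per entry).
    Returns a dict mapping each source phrase to a list of
    (target, score) tuples, best first.
    """
    # Group candidate translations by their source phrase.
    by_source = defaultdict(list)
    for source, target, score in phrase_pairs:
        by_source[source].append((target, score))

    # For each source phrase, keep only the n best candidates by score.
    pruned = {}
    for source, candidates in by_source.items():
        pruned[source] = heapq.nlargest(n, candidates, key=lambda c: c[1])
    return pruned
```

Calling histogram_prune(pairs, n=2) on a toy table would keep at most 
two translations per source phrase and drop the rest, regardless of how 
the dropped pairs score relative to pairs for other source phrases.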

On 19.06.2015 17:49, Read, James C. wrote:
> So, all I did was filter out the less likely phrase pairs and the BLEU score 
> shot up. Was that such a stroke of genius? Was that not blindingly obvious?
>
>

_______________________________________________
Moses-support mailing list
Moses-support@mit.edu
http://mailman.mit.edu/mailman/listinfo/moses-support
