Ah OK, I misunderstood. I thought you were talking about more advanced 
pruning techniques compared to the significance method from Johnson et 
al., whereas you were only referring to the 30-best variant.
Cheers,
Marcin

On 19.06.2015 19:35, Rico Sennrich wrote:
> Marcin Junczys-Dowmunt <junczys@...> writes:
>
>> Hi Rico,
>> since you are at it, some pointers to the more advanced pruning
>> techniques that do perform better, please :)
>>
>> On 19.06.2015 19:25, Rico Sennrich wrote:
>>> [sorry for the garbled message before]
>>>
>>> you are right. The idea is pretty obvious. It roughly corresponds to
>>> 'Histogram pruning' in this paper:
>>>
>>> Zens, R., Stanton, D., Xu, P. (2012). A Systematic Comparison of Phrase
>>> Table Pruning Techniques. In Proceedings of the 2012 Joint Conference on
>>> Empirical Methods in Natural Language Processing and Computational
>>> Natural Language Learning (EMNLP-CoNLL), pp. 972-983.
>>>
>>> The idea has been described in the literature before that (for instance,
>>> Johnson et al. (2007) only use the top 30 phrase pairs per source
>>> phrase), and may have been used in practice for even longer. If you read
>>> the paper above, you will find that histogram pruning does not improve
>>> translation quality on a state-of-the-art SMT system, and performs
>>> poorly compared to more advanced pruning techniques.
>
> the Zens et al. (2012) paper has a nice overview. significance
> pruning and relative entropy pruning are both effective - you are not
> guaranteed improvements over the unpruned system (although Johnson et al. (2007)
> does report improvements), but both allow you to reduce the size of your
> models substantially with little loss in quality.
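
[For concreteness, the histogram pruning (top-k) idea discussed above can be
sketched as follows. This is an illustrative sketch only, not Moses code: the
function name, the (source, target, score) tuple layout, and the use of
p(target|source) as the ranking score are all assumptions for the example.]

```python
from collections import defaultdict

def histogram_prune(phrase_table, k=30):
    """Keep only the k highest-scoring target phrases per source phrase.

    phrase_table: iterable of (source, target, score) tuples, where score
    is assumed to be something like p(target|source). Hypothetical layout,
    not the actual Moses phrase table format.
    """
    # Group all candidate translations by their source phrase.
    by_source = defaultdict(list)
    for source, target, score in phrase_table:
        by_source[source].append((target, score))

    # For each source phrase, sort candidates by score and keep the top k.
    pruned = []
    for source, candidates in by_source.items():
        candidates.sort(key=lambda ts: ts[1], reverse=True)
        for target, score in candidates[:k]:
            pruned.append((source, target, score))
    return pruned
```

With k=30 this corresponds to the Johnson et al. (2007) setting mentioned
earlier; the pruning decision here is purely local per source phrase, which
is why it is cheap but, per Zens et al. (2012), weaker than significance or
relative entropy pruning.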
>
> _______________________________________________
> Moses-support mailing list
> Moses-support@mit.edu
> http://mailman.mit.edu/mailman/listinfo/moses-support
