Marcin Junczys-Dowmunt <junczys@...> writes:

> 
> Hi Rico,
> since you are at it, some pointers to the more advanced pruning 
> techniques that do perform better, please :)
> 
> On 19.06.2015 19:25, Rico Sennrich wrote:
> > [sorry for the garbled message before]
> >
> > you are right. The idea is pretty obvious. It roughly corresponds to
> > 'Histogram pruning' in this paper:
> >
> > Zens, R., Stanton, D., Xu, P. (2012). A Systematic Comparison of Phrase
> > Table Pruning Techniques. In Proceedings of the 2012 Joint Conference on
> > Empirical Methods in Natural Language Processing and Computational
> > Natural Language Learning (EMNLP-CoNLL), pp. 972-983.
> >
> > The idea has been described in the literature before that (for instance,
> > Johnson et al. (2007) only use the top 30 phrase pairs per source
> > phrase), and may have been used in practice for even longer. If you read
> > the paper above, you will find that histogram pruning does not improve
> > translation quality on a state-of-the-art SMT system, and performs
> > poorly compared to more advanced pruning techniques.


The Zens et al. (2012) paper has a nice overview. Significance
pruning and relative entropy pruning are both effective. You are not
guaranteed improvements over the unpruned system (although Johnson et
al. (2007) do report improvements), but both allow you to reduce the size
of your models substantially with little loss in quality.
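For reference, the histogram pruning discussed above (keeping only the top-N
phrase pairs per source phrase, as in Johnson et al.'s top-30 cutoff) can be
sketched in a few lines. This is a simplified illustration, not Moses's actual
phrase-table format or pruning code; the tuple layout and scoring are assumptions:

```python
from collections import defaultdict

def histogram_prune(phrase_table, k=30):
    """Keep only the top-k target phrases per source phrase,
    ranked by score (e.g. the direct translation probability)."""
    by_source = defaultdict(list)
    for source, target, score in phrase_table:
        by_source[source].append((score, target))
    pruned = []
    for source, candidates in by_source.items():
        candidates.sort(reverse=True)  # highest-scoring candidates first
        for score, target in candidates[:k]:
            pruned.append((source, target, score))
    return pruned

# Toy phrase table: (source phrase, target phrase, score)
table = [
    ("das haus", "the house", 0.7),
    ("das haus", "the home", 0.2),
    ("das haus", "house", 0.1),
]
print(len(histogram_prune(table, k=2)))  # 2: only the top 2 entries survive
```

As the paper notes, this cutoff shrinks the model but ignores how much each
discarded phrase pair actually contributes, which is why significance and
relative entropy pruning tend to do better at the same model size.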

_______________________________________________
Moses-support mailing list
Moses-support@mit.edu
http://mailman.mit.edu/mailman/listinfo/moses-support
