Hi James,

Quite apart from the fact that you need to tune the weights of the
log-linear model, let me provide more references to show how well
established simple pruning techniques are in our field as well as in
related fields (namely, automatic speech recognition).

This list of references might not be what you are looking for, but
other readers may benefit from it.


V. Steinbiss, B. Tran, H. Ney. Improvements in beam search. In Proc.
of the Int. Conf. on Spoken Language Processing (ICSLP’94), pages
2143-2146, Yokohama, Japan, Sept. 1994.
http://www.steinbiss.de/vst94d.pdf

R. Zens, F. J. Och, and H. Ney. Phrase-Based Statistical Machine
Translation. In German Conf. on Artificial Intelligence (KI), pages
18-32, Aachen, Germany, Sept. 2002.
https://www-i6.informatik.rwth-aachen.de/publications/download/434/Zens-KI-2002.pdf

Philipp Koehn. Pharaoh: a beam search decoder for phrase-based
statistical machine translation models. In Proc. of the AMTA, pages
115-124, Washington, DC, USA, Sept./Oct. 2004.
http://homepages.inf.ed.ac.uk/pkoehn/publications/pharaoh-amta2004.pdf

Robert C. Moore and Chris Quirk. Faster Beam-Search Decoding for Phrasal
Statistical Machine Translation. In Proc. of MT Summit XI, European
Association for Machine Translation, Sept. 2007.
http://research.microsoft.com/pubs/68097/mtsummit2007_beamsearch.pdf

Richard Zens and Hermann Ney. Improvements in Dynamic Programming Beam
Search for Phrase-based Statistical Machine Translation. In Proc. of the
International Workshop on Spoken Language Translation (IWSLT), Honolulu,
HI, USA, Oct. 2008.
http://www.mt-archive.info/05/IWSLT-2008-Zens.pdf
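
For readers who want to see what "simple pruning" amounts to in practice,
here is a minimal sketch of histogram pruning applied to a phrase table:
for each source phrase, keep only the N best-scoring translation options.
It is an illustration only, not code from Moses or from any of the papers
above. The function name histogram_prune, the (source, target, score)
tuple layout, and the use of a single score per phrase pair are all
assumptions made for the example; the default of 30 is the figure quoted
further down in this thread for Johnson et al. (2007).

```python
import heapq
from collections import defaultdict


def histogram_prune(phrase_pairs, top_n=30):
    """Keep the top_n highest-scoring target phrases per source phrase.

    phrase_pairs: iterable of (source, target, score) tuples, where a
    higher score means a more likely translation (an assumption of this
    sketch; a real system combines several weighted feature scores).
    Returns a dict mapping each source phrase to a list of
    (target, score) tuples sorted from best to worst.
    """
    by_source = defaultdict(list)
    for source, target, score in phrase_pairs:
        by_source[source].append((target, score))

    pruned = {}
    for source, options in by_source.items():
        # nlargest returns at most top_n items, best first.
        pruned[source] = heapq.nlargest(top_n, options, key=lambda opt: opt[1])
    return pruned


if __name__ == "__main__":
    # Toy phrase table with made-up scores.
    table = [
        ("das haus", "the house", 0.6),
        ("das haus", "the home", 0.3),
        ("das haus", "house", 0.1),
        ("ja", "yes", 0.9),
    ]
    for src, opts in histogram_prune(table, top_n=2).items():
        print(src, opts)
```

With top_n=1 this reduces to keeping only the single most likely
translation of each source phrase. Which score is used for ranking is the
important design choice, and in a tuned log-linear model it depends on the
feature weights.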


Cheers,
Matthias



On Wed, 2015-06-24 at 13:11 +0000, Read, James C wrote:
> Thank you for reading very carefully the draft paper I provided a link
> to and noticing that the Johnson paper is duly cited there. Given that
> you had already noticed this I shall not proceed to explain the
> blindingly obvious differences between my very simple filter and their
> filter based on Fisher's exact test.
> 
> Other than that it seems painfully clear that the point I meant to
> make has not been understood entirely. If the default behaviour
> produces BLEU scores considerably lower than merely selecting the most
> likely translation of each phrase then evidently there is something
> very wrong with the default behaviour. If we cannot agree on something
> as obvious as that then I really can't see this discussion making any
> productive progress.
> 
> James
> 
> ________________________________________
> From: moses-support-boun...@mit.edu <moses-support-boun...@mit.edu> on behalf of Rico Sennrich <rico.sennr...@gmx.ch>
> Sent: Friday, June 19, 2015 8:25 PM
> To: moses-support@mit.edu
> Subject: Re: [Moses-support] Major bug found in Moses
> 
> [sorry for the garbled message before]
> 
> you are right. The idea is pretty obvious. It roughly corresponds to
> 'Histogram pruning' in this paper:
> 
> Zens, R., Stanton, D., Xu, P. (2012). A Systematic Comparison of Phrase
> Table Pruning Techniques. In Proceedings of the 2012 Joint Conference on
> Empirical Methods in Natural Language Processing and Computational
> Natural Language Learning (EMNLP-CoNLL), pp. 972-983.
> 
> The idea has been described in the literature before that (for instance,
> Johnson et al. (2007) only use the top 30 phrase pairs per source
> phrase), and may have been used in practice for even longer. If you read
> the paper above, you will find that histogram pruning does not improve
> translation quality on a state-of-the-art SMT system, and performs
> poorly compared to more advanced pruning techniques.
> 
> On 19.06.2015 17:49, Read, James C. wrote:
> > So, all I did was filter out the less likely phrase pairs and the BLEU 
> > score shot up. Was that such a stroke of genius? Was that not blindingly 
> > obvious?
> >
> >
> 
> _______________________________________________
> Moses-support mailing list
> Moses-support@mit.edu
> http://mailman.mit.edu/mailman/listinfo/moses-support



-- 
The University of Edinburgh is a charitable body, registered in
Scotland, with registration number SC005336.

