Hi Lane, Well, you can find excellent descriptions of phrase-based decoding algorithms in the literature, though possibly not all details of this specific implementation.
I like this description: R. Zens, and H. Ney. Improvements in Dynamic Programming Beam Search for Phrase-based Statistical Machine Translation. In International Workshop on Spoken Language Translation (IWSLT), pages 195-205, Honolulu, HI, USA, October 2008. http://www.hltpr.rwth-aachen.de/publications/download/618/Zens-IWSLT-2008.pdf It's what's implemented in Jane, RWTH's open source statistical machine translation toolkit. J. Wuebker, M. Huck, S. Peitz, M. Nuhn, M. Freitag, J. Peter, S. Mansour, and H. Ney. Jane 2: Open Source Phrase-based and Hierarchical Statistical Machine Translation. In International Conference on Computational Linguistics (COLING), pages 483-491, Mumbai, India, December 2012. http://www.hltpr.rwth-aachen.de/publications/download/830/Wuebker-COLING-2012.pdf However, I believe that the distinction of coverage hypotheses and lexical hypotheses is a unique property of the RWTH systems. The formalization in the Zens & Ney paper is very nicely done. With hard distortion limits or coverage-based reordering constraints, you may need a few more steps in the algorithm. E.g., if you have a hard distortion limit, you will probably want to avoid leaving a gap and then extending your sequence in a way that puts your current position further away from the gap than your maximum jump width. Other people should know more about how exactly Moses' phrase-based decoder is dealing with this. I can recommend Richard Zens' PhD thesis as well. http://www.hltpr.rwth-aachen.de/publications/download/562/Zens--2008.pdf I also remember that the following publication from Microsoft Research is pretty helpful: Robert C. Moore and Chris Quirk, Faster Beam-Search Decoding for Phrasal Statistical Machine Translation, in Proceedings of MT Summit XI, European Association for Machine Translation, September 2007. http://research.microsoft.com/pubs/68097/mtsummit2007_beamsearch.pdf Cheers, Matthias On Tue, 2015-12-15 at 22:33 +0000, Hieu Hoang wrote: > I've been looking at this and it is surprisingly complicated. I think > the code is designed to predetermine if extending a hypothesis will > lead it down a path that won't ever be completed. > > > Don't know any slide that explains the reasoning, Philipp Koehn > explained it to me once and it seems pretty reasonable. > > > > I wouldn't mind seeing this code cleaned up a bit and abstracted and > formalised. I've made a start with the cleanup in my new decoder > > https://github.com/moses-smt/mosesdecoder/blob/perf_moses2/contrib/other-builds/moses2/Search/Search.cpp#L36 > Search::CanExtend() > > > There was an Aachen paper from years ago comparing different > distortion limit heuristics - can't remember the authors or title. > Maybe someone know more > > > > > > Hieu Hoang > http://www.hoang.co.uk/hieu > > > On 15 December 2015 at 20:59, Lane Schwartz <dowob...@gmail.com> > wrote: > Hey all, > > > So the SearchNormal::ProcessOneHypothesis() method in > SearchNormal.cpp is responsible for taking an existing > hypothesis, creating all legal new extension hypotheses, and > adding those new hypotheses to the appropriate decoder > stacks. > > > First off, the method is actually reasonably well commented, > so kudos to whoever did that. :) > > > That said, does anyone happen to have any slides that actually > walk through this process, specifically slides that take into > account the interaction with the distortion limit? That > interaction is where most of the complexity of this method > comes from. I don't know about others, but even having a > pretty good notion of what's going on here, the discussion of > "the closest thing to the left" is still a bit opaque. > > > Anyway, if anyone knows of a good set of slides, or even a > good description in a paper, of what's going on here, I'd > appreciate any pointers. > > > Thanks, > Lane > > > > _______________________________________________ > Moses-support mailing list > Moses-support@mit.edu > http://mailman.mit.edu/mailman/listinfo/moses-support > > > > _______________________________________________ > Moses-support mailing list > Moses-support@mit.edu > http://mailman.mit.edu/mailman/listinfo/moses-support -- The University of Edinburgh is a charitable body, registered in Scotland, with registration number SC005336. _______________________________________________ Moses-support mailing list Moses-support@mit.edu http://mailman.mit.edu/mailman/listinfo/moses-support