[Moses-support] stateless or stateful?!
Hi everyone,

I'm going to add a feature function to the Hiero model in Moses, and I have some questions about the Moses code. Any response would be much appreciated :)

To assign a score to each hypothesis, my feature function needs:

  - the whole source sentence
  - the source words that have been translated so far
  - the target words that have been produced so far

1. Should my feature function be stateless or stateful? Can I implement it as a stateless function and extract the information I need from this vector?

   std::vector m_prevHypos;

2. Which of these evaluate functions should I implement?

   EvaluateWhenApplied(
     const ChartHypothesis& /* cur_hypo */,
     int /* featureID - used to index the state in the previous hypotheses */,
     ScoreComponentCollection* accumulator) const

   or

   void SkeletonStatefulFF::EvaluateWithSourceContext(const InputType &input
     , const InputPath &inputPath
     , const TargetPhrase &targetPhrase
     , const StackVec *stackVec
     , ScoreComponentCollection &scoreBreakdown
     , ScoreComponentCollection *estimatedFutureScore) const

3. In hierarchical translation, are there any gaps between the produced target phrases? How can I get the produced target phrases in the code?

Thanks in advance,
Amir

___
Moses-support mailing list
Moses-support@mit.edu
http://mailman.mit.edu/mailman/listinfo/moses-support
[Moses-support] extract.py command for moses decoder
Hi everyone,

I have a corpus written in XML, and I need to convert the documents to plain-text format. I found this command:

  ./extract.py Lang1 lang2

I need to change the format for translation. I know that Moses accepts XML input, but I have several steps to do before sending the corpus to the alignment phase with GIZA++.

You can extract any parallel files with the command

  python ./extract.py L1 L2 L3

where L1, L2 and L3, etc. can be any/all of the 7 languages

  ar de en es fr ru zh

The output is stored in the text/ directory. Each file contains plain text extracted from one XML document, one sentence per line. Paragraph boundaries are preserved by empty lines. Parallel files have the same file prefix, whereas the suffixes correspond to the languages.

I installed Python under Ubuntu, but I did not know how to proceed. Could you please help me, or point me to any other tool that converts XML documents to plain text?

Thank you for your help.
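For readers in the same situation, the output format described above (one sentence per line, an empty line between paragraphs) can be produced with Python's standard library alone. The following is a minimal sketch and not the corpus's own extract.py: the element names `p` (paragraph) and `s` (sentence) and the sample document are assumptions made for illustration, so adjust them to whatever tags your XML files actually use.

```python
import xml.etree.ElementTree as ET


def xml_to_text(xml_string, paragraph_tag="p", sentence_tag="s"):
    """Return plain text: one sentence per line, blank line between paragraphs.

    paragraph_tag and sentence_tag are hypothetical defaults; change them
    to match the markup of the corpus being converted.
    """
    root = ET.fromstring(xml_string)
    paragraphs = []
    for para in root.iter(paragraph_tag):
        sentences = []
        for sent in para.iter(sentence_tag):
            # Join all nested text and collapse runs of whitespace.
            text = " ".join("".join(sent.itertext()).split())
            if text:
                sentences.append(text)
        if sentences:
            paragraphs.append("\n".join(sentences))
    return "\n\n".join(paragraphs) + "\n" if paragraphs else ""


# Hypothetical sample document, only to show the shape of the output.
sample = """<doc><body>
<p><s>First sentence.</s><s>Second   sentence.</s></p>
<p><s>New paragraph.</s></p>
</body></doc>"""

print(xml_to_text(sample), end="")
```

To convert a file rather than a string, read it first (for example with `open(path, encoding="utf-8").read()`) and write the returned text to a file that shares the prefix of the XML file and ends in the language code.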
Re: [Moses-support] KenLM's query command and LM's order
Hi,

No, because this is usually a bad idea. Kneser-Ney smoothing conditions lower-order probabilities on having backed off, so a 12-gram model used as a 5-gram will provide you with a properly normalized model, but it's not the distribution you want.

For more on this rant and how to fix it, see my paper: http://kheafield.com/professional/edinburgh/rest_paper.pdf

Kenneth

On 10/12/14 12:03, Tom Hoar wrote:
> I've built some LMs with an order of 12. Is it possible to use the query
> command to restrict the scoring to a lower order, for example 5, or do I
> need to rebuild the LMs with an order of 5?
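To make the point about Kneser-Ney lower-order probabilities concrete, here is a toy sketch in Python. It is not KenLM code and it leaves out discounting and interpolation entirely. It only compares the relative-frequency unigram estimate (roughly what a standalone lower-order model starts from) with the Kneser-Ney continuation estimate (what the lower order of a higher-order model stores). The corpus and function names are made up for illustration.

```python
from collections import Counter


def unigram_mle(tokens):
    """Relative-frequency unigram estimate: count(w) / number of tokens."""
    counts = Counter(tokens)
    total = len(tokens)
    return {w: c / total for w, c in counts.items()}


def kn_continuation(tokens):
    """Kneser-Ney style continuation estimate for unigrams.

    For each word, count the distinct words that precede it, then
    normalize by the number of distinct bigram types. This is the bare
    continuation count, without discounting or interpolation.
    """
    bigram_types = set(zip(tokens, tokens[1:]))
    left_contexts = Counter(w for _, w in bigram_types)
    total = len(bigram_types)
    return {w: c / total for w, c in left_contexts.items()}


# Made-up corpus: "francisco" is frequent but only ever follows "san".
tokens = ("san francisco " * 4 + "the city a city old city").split()

mle = unigram_mle(tokens)
cont = kn_continuation(tokens)

for word in ("francisco", "city"):
    print(f"{word:10s} mle={mle[word]:.3f} continuation={cont[word]:.3f}")
```

In this corpus "francisco" has the higher relative frequency (4/14 against 3/14 for "city"), but the lower continuation probability (1/8 against 3/8), because it appears after only one distinct word. Both estimates sum to one, so each is a normalized distribution, yet they answer different questions. The lower orders of a Kneser-Ney model are estimated for use after backing off, which is why truncating a 12-gram model does not give the same distribution as training a 5-gram model directly.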
[Moses-support] KenLM's query command and LM's order
I've built some LMs with an order of 12. Is it possible to use the query command to restrict the scoring to a lower order, for example 5, or do I need to rebuild the LMs with an order of 5?