[Moses-support] stateless or stateful?!

2014-10-12 Thread amir haghighi
Hi everyone

I'm going to add a feature function to the Hiero model in Moses, and I have
some questions about the Moses code. Any response would be much appreciated :)

To implement my feature function, which assigns a score to each hypothesis, I
need:
  the whole source sentence
  the source words that have been translated so far
  the target words that have been produced so far

1. Should my feature function be stateless or stateful? Can I implement it
as a stateless function and extract the needed information from this vector?
std::vector m_prevHypos;

2. Which of these evaluate functions should be implemented?
EvaluateWhenApplied(const ChartHypothesis& /* cur_hypo */,
    int /* featureID - used to index the state in the previous hypotheses */,
    ScoreComponentCollection* accumulator) const
or
void SkeletonStatefulFF::EvaluateWithSourceContext(const InputType &input,
    const InputPath &inputPath, const TargetPhrase &targetPhrase,
    const StackVec *stackVec, ScoreComponentCollection &scoreBreakdown,
    ScoreComponentCollection *estimatedFutureScore) const

3. In hierarchical translation, are there any gaps between the produced
target phrases? How can I get the produced target phrases in the code?


Thanks in advance
Amir
___
Moses-support mailing list
Moses-support@mit.edu
http://mailman.mit.edu/mailman/listinfo/moses-support


[Moses-support] extract.py command for Moses decoder

2014-10-12 Thread emna hkiri
Hi everyone,
I have a corpus written in XML, and I need to convert the documents to
plain-text format.
I found this command:
./extract.py Lang1 lang2
I need to change the format for translation. I know that Moses accepts XML
input, but I have several steps to do before sending the corpus to the
alignment phase with GIZA++.

You can extract any parallel files with the command
python ./extract.py L1 L2 L3
where L1, L2, L3, etc. can be any or all of the 7 languages: ar de en es fr
ru zh. The output is stored in the text/ directory. Each file contains plain
text extracted from one XML document, one sentence per line. Paragraph
boundaries are preserved by empty lines. Parallel files have the same file
prefix, whereas the suffixes correspond to the languages.

I installed Python under Ubuntu, but I don't know how to proceed. Would you
help me, please?
Or, if you know of any other tool that converts XML documents to text, please
let me know. Thank you for your help.
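
For the fallback case of converting XML to text without extract.py, the
following is a minimal Python sketch. It assumes, hypothetically, that each
document wraps paragraphs in <p> elements and sentences in <s> elements; the
function name xml_to_text and both tag names are illustrative and are not
taken from extract.py or from the corpus documentation. It writes one
sentence per line and separates paragraphs with an empty line, matching the
output format described above.

```python
import xml.etree.ElementTree as ET


def xml_to_text(xml_string, paragraph_tag="p", sentence_tag="s"):
    """Return plain text: one sentence per line, blank line between paragraphs.

    The tag names are assumptions about the corpus markup; adjust them to
    whatever the actual XML files use.
    """
    root = ET.fromstring(xml_string)
    lines = []
    for para in root.iter(paragraph_tag):
        # Collect the text of every sentence element, collapsing whitespace.
        sentences = [
            " ".join("".join(sent.itertext()).split())
            for sent in para.iter(sentence_tag)
        ]
        sentences = [s for s in sentences if s]
        if not sentences:
            continue  # skip paragraphs without any sentence text
        if lines:
            lines.append("")  # empty line marks a paragraph boundary
        lines.extend(sentences)
    return "\n".join(lines) + "\n" if lines else ""


if __name__ == "__main__":
    import sys

    # Usage (hypothetical file names): python xml2txt.py input.xml > output.txt
    with open(sys.argv[1], encoding="utf-8") as handle:
        sys.stdout.write(xml_to_text(handle.read()))
```

Running the same function over the files for each language should keep the
line counts aligned only if the XML files themselves are sentence-aligned,
so it is worth checking the line counts before passing the output to GIZA++.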


Re: [Moses-support] KenLM's query command and LM's order

2014-10-12 Thread Kenneth Heafield
Hi,

No, because this is usually a bad idea.  Kneser-Ney smoothing
conditions lower-order probabilities on having backed off, so a 12-gram
model used as a 5-gram will provide you with a properly normalized
model, but it's not the distribution you want.

For more on this rant and how to fix it, see my paper:

http://kheafield.com/professional/edinburgh/rest_paper.pdf

Kenneth

On 10/12/14 12:03, Tom Hoar wrote:
> I've built some LMs with an order of 12. Is it possible to use the query 
> command to restrict the scoring to a lower order, for example 5, or do I 
> need to rebuild the LMs with an order of 5?


[Moses-support] KenLM's query command and LM's order

2014-10-12 Thread Tom Hoar
I've built some LMs with an order of 12. Is it possible to use the query 
command to restrict the scoring to a lower order, for example 5, or do I 
need to rebuild the LMs with an order of 5?