Re: [Moses-support] Usage of query command with RDLM

Madori Ikeda Fri, 08 Jul 2016 21:53:34 -0700

Rico Sennrich <rico.sennrich@...> writes:

> 
> 
>     Hello Madori,
>       the query command is specific to n-gram LMs in the ARPA format (or
>       a compiled format of KenLM).
>       Here is how you can measure log probabilities with RDLM (or NPLM
>       in general):
>       1. extract the n-grams (for NPLM) or syntactic n-grams (for RDLM)
>       from the test set, with the same settings that you used for
>       training. For RDLM, the relevant script is in
>       mosesdecoder/scripts/training/rdlm/extract_syntactic_ngrams.py
>       2. use the testNeuralNetwork binary from NPLM:
>       nplm/src/testNeuralNetwork --test_file [your_extracted_ngrams]
>       --model_file [your_model]
>       Note that with RDLM, there are two models, and you'll need to
>       extract a test set for each (with the '--mode' argument): one for
>       predicting words given a dependency label, one for predicting
>       dependency labels.
>       best wishes,
>       Rico
>       On 07.07.2016 02:34, IKEDA Madori wrote:
> 
>     
>       Hello,
>         
>         I'm trying to evaluate the fluency of text based on RDLM,
>         and I think the query command can do that, as described in
>           Moses manual Sec. 2.3.4.
>         
>         The question is: how can I use the query command with RDLM?
>         RDLM is built as two separate files (head /
>           label mode) in Moses (see the manual, Sec. 5.13.11).
>         I don't know how to pass these files to the query command.
>         
>           Could anyone please tell me the usage?
>         
>         
>         I now have both the head and label mode files of RDLM,
>         and text files containing the sentences whose fluency
>           I want to evaluate.
>         
>         Thank you.
> 
>         
>         Best,
>         Madori
>       
>       _______________________________________________
> Moses-support mailing list
> Moses-support@mit.edu
> http://mailman.mit.edu/mailman/listinfo/moses-support


Thank you Rico,

I had misunderstood the query command.
I still have some questions, though.

Following your guidance, I obtained some output.
However, I don't think I have the log probability of the text yet,
because a single text yields two outputs (head and label modes).
How can I get the log probability of the text?

Here is my procedure. Is it correct?

I prepared a TXT file containing one sentence:
this is an english sentence.

I processed it through the following steps:
TXT -> CoNLL -> Moses XML format -> syntactic n-gram format.
The Moses XML was obtained
with mosesdecoder/scripts/training/wrapper/conll2mosesxml.py,
and the syntactic n-grams were obtained
with mosesdecoder/scripts/training/rdlm/extract_syntactic_ngrams.py.
In addition, vocab files were generated
with mosesdecoder/scripts/training/rdlm/extract_vocab.py
and passed to extract_syntactic_ngrams.py.
As expected, I got two syntactic n-gram files: one for head mode and one for label mode.

Finally, I got two outputs, one in head mode and one in label mode.

First,
$ nplm/src/testNeuralNetwork \
    --test_file [syntactic_head_ngrams] --model_file [rdlm_head_model] \
    --mode head --debug 2
gave me the following output:
10 10 10  6  6  6  9 11  5 44 43 51 -7.14251
10 10 51  6  6 43  9 11  5 44 43 52 -3.62643
10 51 52  6 43 43  9 11  5 44 43 53 -7.13742
51 52 53 43 43 43  9 11  5 44 43 54 -7.13615
52 53 54 43 43 43  9 11  5 44 43 50 -4.03913
53 54 50 43 43 43  9 11  5 44 43 47 -7.14106
Test log-likelihood: -36.2227

Second,
$ nplm/src/testNeuralNetwork \
    --test_file [syntactic_label_ngrams] --model_file [rdlm_label_model] \
    --mode label --debug 2
produced the following output:
10 10 10  6  6  6  9  9  5  5 44 -7.74169
10 10 10  6  6  6  9 11  5 44 43 -7.74169
10 10 10  6  6  6 11 51 44 43  4 -2.09066
10 10  8  6  6  4 11 51 44 43  7 -7.74169
10 10 51  6  6 43  9 11  5 44 43 -7.74169
10 10 10  6  6  6 11 52 44 43  4 -2.09066
10 10  8  6  6  4 11 52 44 43  7 -7.74169
10 51 52  6 43 43  9 11  5 44 43 -7.74169
10 10 10  6  6  6 11 53 44 43  4 -2.09066
10 10  8  6  6  4 11 53 44 43  7 -7.74169
51 52 53 43 43 43  9 11  5 44 43 -7.74169
10 10 10  6  6  6 11 54 44 43  4 -2.09066
10 10  8  6  6  4 11 54 44 43  7 -7.74169
52 53 54 43 43 43  9 11  5 44 43 -7.74169
10 10 10  6  6  6 11 50 44 43  4 -2.09066
10 10  8  6  6  4 11 50 44 43  7 -7.74169
53 54 50 43 43 43  9 11  5 44 43 -7.74169
10 10 10  6  6  6 11 47 44 43  4 -2.09066
10 10  8  6  6  4 11 47 44 43  7 -7.74169
54 50 47 43 43 43  9 11  5 44  7 -7.74169
Test log-likelihood: -120.928
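
As a sanity check on the two outputs above, the last column of each line can be summed and compared with the reported "Test log-likelihood". The Python sketch below does this with the values copied from this message. The helper name sum_last_column is made up for illustration. The final addition of the two totals is only an assumption (that the head and label models act as factors of one joint probability, so their log probabilities add); nothing in this thread confirms it.

```python
import math

def sum_last_column(debug_output):
    """Sum the last whitespace-separated field of every non-empty line."""
    return sum(float(line.split()[-1])
               for line in debug_output.splitlines() if line.strip())

# Per-n-gram lines copied from the head-mode output in this message.
HEAD_OUTPUT = """\
10 10 10  6  6  6  9 11  5 44 43 51 -7.14251
10 10 51  6  6 43  9 11  5 44 43 52 -3.62643
10 51 52  6 43 43  9 11  5 44 43 53 -7.13742
51 52 53 43 43 43  9 11  5 44 43 54 -7.13615
52 53 54 43 43 43  9 11  5 44 43 50 -4.03913
53 54 50 43 43 43  9 11  5 44 43 47 -7.14106
"""

# Per-n-gram lines copied from the label-mode output in this message.
LABEL_OUTPUT = """\
10 10 10  6  6  6  9  9  5  5 44 -7.74169
10 10 10  6  6  6  9 11  5 44 43 -7.74169
10 10 10  6  6  6 11 51 44 43  4 -2.09066
10 10  8  6  6  4 11 51 44 43  7 -7.74169
10 10 51  6  6 43  9 11  5 44 43 -7.74169
10 10 10  6  6  6 11 52 44 43  4 -2.09066
10 10  8  6  6  4 11 52 44 43  7 -7.74169
10 51 52  6 43 43  9 11  5 44 43 -7.74169
10 10 10  6  6  6 11 53 44 43  4 -2.09066
10 10  8  6  6  4 11 53 44 43  7 -7.74169
51 52 53 43 43 43  9 11  5 44 43 -7.74169
10 10 10  6  6  6 11 54 44 43  4 -2.09066
10 10  8  6  6  4 11 54 44 43  7 -7.74169
52 53 54 43 43 43  9 11  5 44 43 -7.74169
10 10 10  6  6  6 11 50 44 43  4 -2.09066
10 10  8  6  6  4 11 50 44 43  7 -7.74169
53 54 50 43 43 43  9 11  5 44 43 -7.74169
10 10 10  6  6  6 11 47 44 43  4 -2.09066
10 10  8  6  6  4 11 47 44 43  7 -7.74169
54 50 47 43 43 43  9 11  5 44  7 -7.74169
"""

head_total = sum_last_column(HEAD_OUTPUT)
label_total = sum_last_column(LABEL_OUTPUT)

# The per-line values should reproduce the totals printed by the tool
# (the tool rounds to six significant digits, hence the tolerance).
print("head  sum:", round(head_total, 4), "reported: -36.2227")
print("label sum:", round(label_total, 4), "reported: -120.928")

# ASSUMPTION (not confirmed in this thread): if the two models are factors
# of a single joint probability, the sentence log probability is the sum.
combined = head_total + label_total
print("combined (assumed additive):", round(combined, 4))
```

The check only shows that each reported total is the sum of its own per-line values; whether adding the two totals is the right way to score the sentence is exactly the open question below.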

How can I combine the two outputs
to get the RDLM log probability of the input sentence?

Best regards,
Madori
