Re: [Moses-support] Usage of query command with RDLM

2016-07-09 Thread Madori Ikeda
Hello Rico,

Thank you so much for your help.

Yes, the model was trained on a very small corpus.
I will try calculating the log probabilities
with other RDLM models trained on much larger corpora.

Best regards,
Madori

___
Moses-support mailing list
Moses-support@mit.edu
http://mailman.mit.edu/mailman/listinfo/moses-support


Re: [Moses-support] Usage of query command with RDLM

2016-07-09 Thread Rico Sennrich
Hello Madori,

Your procedure looks alright. The final log probability of the sentence 
is the sum of the log probabilities of the label model and the head 
model. (Your log probabilities look very high in magnitude, though; this 
is ok if you have only trained a toy model so far.)
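
As a worked example, the two "Test log-likelihood" totals reported in this thread (-36.2227 for the head model, -120.928 for the label model) can be added as in the minimal Python sketch below. The function and variable names are illustrative only and are not part of Moses or NPLM. The thread does not state the base of the logarithm, so the result stays in whatever units testNeuralNetwork reports.

```python
# Totals copied from the "Test log-likelihood" lines quoted in this thread.
head_log_likelihood = -36.2227    # head model run
label_log_likelihood = -120.928   # label model run


def rdlm_sentence_log_prob(head_ll: float, label_ll: float) -> float:
    """Sum the head-model and label-model log probabilities (illustrative helper)."""
    return head_ll + label_ll


sentence_log_prob = rdlm_sentence_log_prob(head_log_likelihood, label_log_likelihood)
print(f"{sentence_log_prob:.4f}")  # prints -157.1507
```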

best wishes,
Rico


Re: [Moses-support] Usage of query command with RDLM

2016-07-08 Thread Madori Ikeda
Thank you, Rico.

I had misunderstood the query command.
I still have some questions, though.

Following your guidance, I got some outputs.
However, I don't think I have the final log probability yet,
because a single text produces two outputs (head and label modes).
How can I get the log probability of the text?

Here is my procedure. Is it correct?

I prepared a text file containing one sentence:
this is an english sentence.

I processed it through
TXT -> CoNLL -> Moses XML format -> syntactic n-gram format.
The Moses XML was produced
by mosesdecoder/scripts/training/wrapper/conll2mosesxml.py,
and the syntactic n-grams
by mosesdecoder/scripts/training/rdlm/extract_syntactic_ngrams.py.
The vocab files were generated
by mosesdecoder/scripts/training/rdlm/extract_vocab.py
and passed to extract_syntactic_ngrams.py.
This gave me two syntactic n-gram files: one in head mode and one in label mode.

Finally, I got one output for each mode.

First,
$ nplm/src/testNeuralNetwork \
    --test_file [syntactic_head_ngrams] --model_file [rdlm_head_model] \
    --mode head --debug 2
gave me the following output:
10 10 10  6  6  6  9 11  5 44 43 51 -7.14251
10 10 51  6  6 43  9 11  5 44 43 52 -3.62643
10 51 52  6 43 43  9 11  5 44 43 53 -7.13742
51 52 53 43 43 43  9 11  5 44 43 54 -7.13615
52 53 54 43 43 43  9 11  5 44 43 50 -4.03913
53 54 50 43 43 43  9 11  5 44 43 47 -7.14106
Test log-likelihood: -36.2227

Second,
$ nplm/src/testNeuralNetwork \
    --test_file [syntactic_label_ngrams] --model_file [rdlm_label_model] \
    --mode label --debug 2
produced the following output:
10 10 10  6  6  6  9  9  5  5 44 -7.74169
10 10 10  6  6  6  9 11  5 44 43 -7.74169
10 10 10  6  6  6 11 51 44 43  4 -2.09066
10 10  8  6  6  4 11 51 44 43  7 -7.74169
10 10 51  6  6 43  9 11  5 44 43 -7.74169
10 10 10  6  6  6 11 52 44 43  4 -2.09066
10 10  8  6  6  4 11 52 44 43  7 -7.74169
10 51 52  6 43 43  9 11  5 44 43 -7.74169
10 10 10  6  6  6 11 53 44 43  4 -2.09066
10 10  8  6  6  4 11 53 44 43  7 -7.74169
51 52 53 43 43 43  9 11  5 44 43 -7.74169
10 10 10  6  6  6 11 54 44 43  4 -2.09066
10 10  8  6  6  4 11 54 44 43  7 -7.74169
52 53 54 43 43 43  9 11  5 44 43 -7.74169
10 10 10  6  6  6 11 50 44 43  4 -2.09066
10 10  8  6  6  4 11 50 44 43  7 -7.74169
53 54 50 43 43 43  9 11  5 44 43 -7.74169
10 10 10  6  6  6 11 47 44 43  4 -2.09066
10 10  8  6  6  4 11 47 44 43  7 -7.74169
54 50 47 43 43 43  9 11  5 44  7 -7.74169
Test log-likelihood: -120.928
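
In both listings the last column appears to hold the log probability of each n-gram, and those values add up to the reported total. The sketch below checks this for the head-mode listing. It assumes the layout shown above (integer ids followed by one score per line, then a summary line); the function name is made up for illustration and is not part of NPLM.

```python
def sum_ngram_log_probs(debug_output: str) -> float:
    """Add up the last column of every n-gram line in a listing like the ones above."""
    total = 0.0
    for line in debug_output.splitlines():
        fields = line.split()
        # n-gram lines start with an integer id; skip blank lines and the summary line.
        if not fields or not fields[0].isdigit():
            continue
        total += float(fields[-1])
    return total


# Head-mode listing copied from this message.
head_output = """\
10 10 10  6  6  6  9 11  5 44 43 51 -7.14251
10 10 51  6  6 43  9 11  5 44 43 52 -3.62643
10 51 52  6 43 43  9 11  5 44 43 53 -7.13742
51 52 53 43 43 43  9 11  5 44 43 54 -7.13615
52 53 54 43 43 43  9 11  5 44 43 50 -4.03913
53 54 50 43 43 43  9 11  5 44 43 47 -7.14106
Test log-likelihood: -36.2227
"""

head_total = sum_ngram_log_probs(head_output)
print(f"{head_total:.4f}")  # prints -36.2227
```

The same check on the label-mode listing (14 lines at -7.74169 and 6 lines at -2.09066) gives -120.92762, which matches the reported -120.928 after rounding.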

How can I combine the two outputs
to get the RDLM log probability of the input sentence?

Best regards,
Madori


Re: [Moses-support] Usage of query command with RDLM

2016-07-07 Thread Rico Sennrich

Hello Madori,

The query command is specific to n-gram LMs in the ARPA format (or 
one of KenLM's compiled formats).


Here is how you can measure log probabilities with RDLM (or NPLM in 
general):


1. extract the n-grams (for NPLM) or syntactic n-grams (for RDLM) from 
the test set, with the same settings that you used for training. For 
RDLM, the relevant script is in 
mosesdecoder/scripts/training/rdlm/extract_syntactic_ngrams.py


2. use the testNeuralNetwork binary from NPLM:
nplm/src/testNeuralNetwork --test_file [your_extracted_ngrams] \
    --model_file [your_model]


Note that with RDLM, there are two models, and you'll need to extract a 
test set for each (with the '--mode' argument): one for predicting words 
given a dependency label, one for predicting dependency labels.
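
Step 2 can be scripted. The Python sketch below runs testNeuralNetwork on one extracted n-gram file and reads back the total. It uses only the --test_file and --model_file flags given above and assumes the tool ends its output with a line of the form "Test log-likelihood: <value>", as in the output quoted elsewhere in this thread. The function names are hypothetical, and because it is not stated which stream the tool writes to, both stdout and stderr are captured.

```python
import re
import subprocess

# Summary line as it appears in the output quoted in this thread,
# e.g. "Test log-likelihood: -36.2227".
_TOTAL_RE = re.compile(r"Test log-likelihood:\s*(-?\d+(?:\.\d+)?(?:[eE][-+]?\d+)?)")


def parse_test_log_likelihood(output: str) -> float:
    """Return the value on the last 'Test log-likelihood:' line of the output."""
    matches = _TOTAL_RE.findall(output)
    if not matches:
        raise ValueError("no 'Test log-likelihood:' line found")
    return float(matches[-1])


def score_with_nplm(binary: str, test_file: str, model_file: str) -> float:
    """Run testNeuralNetwork on one n-gram file and return the reported total."""
    result = subprocess.run(
        [binary, "--test_file", test_file, "--model_file", model_file],
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,  # assumption: capture both streams to be safe
        text=True,
        check=True,
    )
    return parse_test_log_likelihood(result.stdout)
```

For RDLM, score_with_nplm would be called twice: once with the n-grams extracted for the head model and that model's file, and once with the n-grams extracted for the label model and its file.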


best wishes,
Rico


[Moses-support] Usage of query command with RDLM

2016-07-06 Thread IKEDA Madori
Hello,

I'm trying to evaluate the fluency of text with RDLM,
and I think the query command can do that, as described in Sec. 2.3.4
of the Moses manual.

My question is: how can I use the query command with RDLM?
In Moses, an RDLM is built as two separate files (head / label mode),
as described in Sec. 5.13.11 of the manual.
I don't know how to pass these files to the query command.
Could anyone tell me how to use it?

I now have both the head and label mode files of an RDLM
and text files containing the sentences whose fluency I want to evaluate.

Thank you.

Best,
Madori