Re: [Moses-support] Usage of query command with RDLM
Hello Rico,

Thank you so much for your help. Yes, the model was trained on a very small corpus. I will try calculating the log probabilities with other RDLM models trained on much larger corpora.

Best regards,
Madori

___
Moses-support mailing list
Moses-support@mit.edu
http://mailman.mit.edu/mailman/listinfo/moses-support
Re: [Moses-support] Usage of query command with RDLM
Hello Madori,

your procedure looks alright. The final log probability of the sentence is the sum of the log probabilities of the label model and the head model. [although your log probabilities look very high; this is ok if you only trained a toy model so far].

best wishes,
Rico

On 09/07/16 04:42, Madori Ikeda wrote:
> How can I get the log probability of the text ?
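As a minimal sketch of the sum described in this reply, the two "Test log-likelihood" totals reported in the thread can be read from the testNeuralNetwork output and added together. The function name and the regular expression are illustrative assumptions, not part of NPLM or Moses; only the format of the summary line is taken from the output quoted in the thread.

```python
import re

# Matches the summary line as printed in the thread,
# e.g. "Test log-likelihood: -36.2227".
TOTAL_RE = re.compile(
    r"Test log-likelihood:\s*(-?\d+(?:\.\d+)?(?:[eE][-+]?\d+)?)"
)


def total_log_likelihood(output: str) -> float:
    """Return the 'Test log-likelihood' value found in one run's output."""
    match = TOTAL_RE.search(output)
    if match is None:
        raise ValueError("no 'Test log-likelihood' line found")
    return float(match.group(1))


# Summary lines reported in the thread for the head and label models.
head_output = "Test log-likelihood: -36.2227"
label_output = "Test log-likelihood: -120.928"

# Sentence score = head-model total + label-model total.
sentence_logprob = (
    total_log_likelihood(head_output) + total_log_likelihood(label_output)
)
print(f"{sentence_logprob:.4f}")  # -157.1507
```

With the toy model from the thread this gives -36.2227 + (-120.928) = -157.1507 for the example sentence.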
Re: [Moses-support] Usage of query command with RDLM
Thank you Rico,

I had misunderstood the query command. I still have some questions, though.

Following your guidance, I got some output. However, I don't think I have the final log probabilities yet, because a single text produces two outputs (one in head mode and one in label mode). How can I get the log probability of the text?

Here is my procedure. Is it correct?

I prepared a TXT file containing one sentence:

this is an english sentence.

I processed it as follows:

TXT -> CoNLL -> Moses XML format -> syntactic n-gram format.

The Moses XML was produced by mosesdecoder/scripts/training/wrapper/conll2mosesxml.py, and the syntactic n-grams by mosesdecoder/scripts/training/rdlm/extract_syntactic_ngrams.py. The vocab files were generated by mosesdecoder/scripts/training/rdlm/extract_vocab.py and used for extract_syntactic_ngrams.py. This gave me two syntactic n-gram files, one for head mode and one for label mode.

Finally, I got two outputs, one per mode.

First,

$ nplm/src/testNeuralNetwork
--test_file [syntactic_head_ngrams] --model_file [rdlm_head_model]
--mode head --debug 2

gave me the following output:

10 10 10 6 6 6 9 11 5 44 43 51 -7.14251
10 10 51 6 6 43 9 11 5 44 43 52 -3.62643
10 51 52 6 43 43 9 11 5 44 43 53 -7.13742
51 52 53 43 43 43 9 11 5 44 43 54 -7.13615
52 53 54 43 43 43 9 11 5 44 43 50 -4.03913
53 54 50 43 43 43 9 11 5 44 43 47 -7.14106
Test log-likelihood: -36.2227

Second,

$ nplm/src/testNeuralNetwork
--test_file [syntactic_label_ngrams] --model_file [rdlm_label_model]
--mode label --debug 2

printed the following:

10 10 10 6 6 6 9 9 5 5 44 -7.74169
10 10 10 6 6 6 9 11 5 44 43 -7.74169
10 10 10 6 6 6 11 51 44 43 4 -2.09066
10 10 8 6 6 4 11 51 44 43 7 -7.74169
10 10 51 6 6 43 9 11 5 44 43 -7.74169
10 10 10 6 6 6 11 52 44 43 4 -2.09066
10 10 8 6 6 4 11 52 44 43 7 -7.74169
10 51 52 6 43 43 9 11 5 44 43 -7.74169
10 10 10 6 6 6 11 53 44 43 4 -2.09066
10 10 8 6 6 4 11 53 44 43 7 -7.74169
51 52 53 43 43 43 9 11 5 44 43 -7.74169
10 10 10 6 6 6 11 54 44 43 4 -2.09066
10 10 8 6 6 4 11 54 44 43 7 -7.74169
52 53 54 43 43 43 9 11 5 44 43 -7.74169
10 10 10 6 6 6 11 50 44 43 4 -2.09066
10 10 8 6 6 4 11 50 44 43 7 -7.74169
53 54 50 43 43 43 9 11 5 44 43 -7.74169
10 10 10 6 6 6 11 47 44 43 4 -2.09066
10 10 8 6 6 4 11 47 44 43 7 -7.74169
54 50 47 43 43 43 9 11 5 44 7 -7.74169
Test log-likelihood: -120.928

How can I combine the two outputs to get the RDLM log probability of the input sentence?

Best regards,
Madori
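The per-n-gram scores printed with --debug 2 can be checked against the reported total. In the output above, each line appears to consist of integer ids followed by a score, and those scores add up to the "Test log-likelihood" value. The sketch below does this for the head-mode output; the helper name is hypothetical, and the reading of the columns is inferred from the output shown in this message, not from NPLM documentation.

```python
def sum_ngram_scores(debug_output: str) -> float:
    """Sum the last column of every n-gram line, skipping the summary line."""
    total = 0.0
    for line in debug_output.splitlines():
        stripped = line.strip()
        if not stripped or stripped.startswith("Test log-likelihood"):
            continue
        # Assumption: the final field on each n-gram line is its score.
        total += float(stripped.split()[-1])
    return total


# Head-mode output copied from the message above.
head_debug_output = """\
10 10 10 6 6 6 9 11 5 44 43 51 -7.14251
10 10 51 6 6 43 9 11 5 44 43 52 -3.62643
10 51 52 6 43 43 9 11 5 44 43 53 -7.13742
51 52 53 43 43 43 9 11 5 44 43 54 -7.13615
52 53 54 43 43 43 9 11 5 44 43 50 -4.03913
53 54 50 43 43 43 9 11 5 44 43 47 -7.14106
Test log-likelihood: -36.2227
"""

head_total = sum_ngram_scores(head_debug_output)
print(round(head_total, 4))
```

The six head-mode scores sum to the reported -36.2227 (to four decimals), which suggests that each "Test log-likelihood" line is simply the total over that model's n-grams.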
Re: [Moses-support] Usage of query command with RDLM
Hello Madori,

the query command is specific to n-gram LMs in the ARPA format (or a compiled format of KenLM).

Here is how you can measure log probabilities with RDLM (or NPLM in general):

1. extract the n-grams (for NPLM) or syntactic n-grams (for RDLM) from the test set, with the same settings that you used for training. For RDLM, the relevant script is in mosesdecoder/scripts/training/rdlm/extract_syntactic_ngrams.py

2. use the testNeuralNetwork binary from NPLM:

nplm/src/testNeuralNetwork --test_file [your_extracted_ngrams] --model_file [your_model]

Note that with RDLM, there are two models, and you'll need to extract a test set for each (with the '--mode' argument): one for predicting words given a dependency label, one for predicting dependency labels.

best wishes,
Rico

On 07.07.2016 02:34, IKEDA Madori wrote:
> The question is how can I use the query command with RDLM ?
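The two evaluation runs in step 2 of the message above could be scripted along the following lines. This is only a sketch: the file names are placeholders, the helper function is hypothetical, and the only flags used are the two given in the message (--test_file and --model_file). The commands are built and printed here, not executed.

```python
import shlex
from typing import Dict, List, Tuple

# Binary path as written in the message; adjust to the local NPLM checkout.
NPLM_TEST_BINARY = "nplm/src/testNeuralNetwork"


def build_test_command(ngram_file: str, model_file: str) -> List[str]:
    """Build one testNeuralNetwork invocation as an argument list."""
    return [
        NPLM_TEST_BINARY,
        "--test_file", ngram_file,
        "--model_file", model_file,
    ]


# Placeholder file names (assumptions): one extracted n-gram file and one
# trained model per RDLM component.
runs: Dict[str, Tuple[str, str]] = {
    "head": ("test.head.ngrams", "rdlm_head.model"),
    "label": ("test.label.ngrams", "rdlm_label.model"),
}

commands = {
    mode: build_test_command(ngrams, model)
    for mode, (ngrams, model) in runs.items()
}

for mode, cmd in commands.items():
    print(mode, shlex.join(cmd))
```

Each argument list can be passed to `subprocess.run(cmd, capture_output=True, text=True)` to run the binary and capture its output for further processing.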
[Moses-support] Usage of query command with RDLM
Hello,

I'm trying to evaluate the fluency of text with RDLM, and I think the query command can do that, as described in Sec. 2.3.4 of the Moses manual.

My question is: how can I use the query command with RDLM? In Moses, an RDLM is built as two separate files (head mode and label mode; see Sec. 5.13.11 of the manual), and I don't know how to pass these files to the query command.

Could anyone tell me the usage?

I currently have both the head-mode and label-mode RDLM files, and text files containing the sentences whose fluency I want to evaluate.

Thank you.

Best,
Madori