Re: [Moses-support] Usage of query command with RDLM

2016-07-08 Thread Madori Ikeda
Rico Sennrich  writes:

> 
> 
> Hello Madori,
>
> The query command is specific to n-gram LMs in the ARPA format (or
> a compiled format of KenLM).
>
> Here is how you can measure log probabilities with RDLM (or NPLM
> in general):
>
> 1. Extract the n-grams (for NPLM) or syntactic n-grams (for RDLM)
> from the test set, with the same settings that you used for
> training. For RDLM, the relevant script is
> mosesdecoder/scripts/training/rdlm/extract_syntactic_ngrams.py
>
> 2. Use the testNeuralNetwork binary from NPLM:
> nplm/src/testNeuralNetwork --test_file [your_extracted_ngrams]
> --model_file [your_model]
>
> Note that with RDLM, there are two models, and you'll need to
> extract a test set for each (with the '--mode' argument): one for
> predicting words given a dependency label, one for predicting
> dependency labels.
>
> best wishes,
> Rico
>   On 07.07.2016 02:34, IKEDA Madori wrote:
> 
> 
> Hello,
>
> I'm trying to evaluate the fluency of text based on RDLM,
> and I think the query command can do that, as described in
> the Moses manual, Sec. 2.3.4.
>
> My question is: how can I use the query command with RDLM?
> RDLM is built as two separate files (head and label models)
> in Moses (manual Sec. 5.13.11).
> I don't know how to pass these files to the query command.
> Could anyone please tell me the usage?
>
> I now have both the head and label model files of RDLM,
> and text files containing the sentences whose fluency I want
> to evaluate.
>
> Thank you.
>
> Best,
> Madori

Thank you, Rico.

That was my misunderstanding about the query command.
I still have some questions, though.

I got some output by following your guidance.
However, I don't think I have the log probabilities yet,
because a single text produces two outputs (head and label modes).
How can I get the log probability of the text?

Here is my procedure. Is it correct?

I prepared a TXT file containing one sentence:
this is an english sentence.

I processed it through
TXT -> CoNLL -> Moses XML format -> syntactic n-gram format.
The Moses XML was obtained
with mosesdecoder/scripts/training/wrapper/conll2mosesxml.py,
and the syntactic n-grams
with mosesdecoder/scripts/training/rdlm/extract_syntactic_ngrams.py.
In addition, vocab files were generated
with mosesdecoder/scripts/training/rdlm/extract_vocab.py
and passed to extract_syntactic_ngrams.py.
As expected, I got two types of syntactic n-gram files: head and label modes.

Finally, I got two types of outputs in head and label modes.

First, 
$ nplm/src/testNeuralNetwork
--test_file [syntactic_head_ngrams] --model_file [rdlm_head_model]
--mode head --debug 2
gave me the following output:
10 10 10  6  6  6  9 11  5 44 43 51 -7.14251
10 10 51  6  6 43  9 11  5 44 43 52 -3.62643
10 51 52  6 43 43  9 11  5 44 43 53 -7.13742
51 52 53 43 43 43  9 11  5 44 43 54 -7.13615
52 53 54 43 43 43  9 11  5 44 43 50 -4.03913
53 54 50 43 43 43  9 11  5 44 43 47 -7.14106
Test log-likelihood: -36.2227

Second,
$ nplm/src/testNeuralNetwork
--test_file [syntactic_label_ngrams] --model_file [rdlm_label_model]
--mode label --debug 2
produced the following output:
10 10 10  6  6  6  9  9  5  5 44 -7.74169
10 10 10  6  6  6  9 11  5 44 43 -7.74169
10 10 10  6  6  6 11 51 44 43  4 -2.09066
10 10  8  6  6  4 11 51 44 43  7 -7.74169
10 10 51  6  6 43  9 11  5 44 43 -7.74169
10 10 10  6  6  6 11 52 44 43  4 -2.09066
10 10  8  6  6  4 11 52 44 43  7 -7.74169
10 51 52  6 43 43  9 11  5 44 43 -7.74169
10 10 10  6  6  6 11 53 44 43  4 -2.09066
10 10  8  6  6  4 11 53 44 43  7 -7.74169
51 52 53 43 43 43  9 11  5 44 43 -7.74169
10 10 10  6  6  6 11 54 44 43  4 -2.09066
10 10  8  6  6  4 11 54 44 43  7 -7.74169
52 53 54 43 43 43  9 11  5 44 43 -7.74169
10 10 10  6  6  6 11 50 44 43  4 -2.09066
10 10  8  6  6  4 11 50 44 43  7 -7.74169
53 54 50 43 43 43  9 11  5 44 43 -7.74169
10 10 10  6  6  6 11 47 44 43  4 -2.09066
10 10  8  6  6  4 11 47 44 43  7 -7.74169
54 50 47 43 43 43  9 11  5 44  7 -7.74169
Test log-likelihood: -120.928

How can I combine the two outputs
to get the RDLM log probability of the input sentence?
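For what it's worth, my guess is that if RDLM factorizes as the product
of the head and label models, the sentence log probability is simply the
sum of the two log-likelihoods above. A minimal sketch under that
assumption (the factorization is my assumption, not something confirmed
here):

  #include <iostream>

  int main() {
    // the two "Test log-likelihood" values reported above
    const double headLogLik  = -36.2227;   // head mode
    const double labelLogLik = -120.928;   // label mode

    // under the product-of-models assumption, log probabilities add
    std::cout << headLogLik + labelLogLik << std::endl;  // prints -157.151
    return 0;
  }

Is that right?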

Best regards,
Madori

___
Moses-support mailing list
Moses-support@mit.edu
http://mailman.mit.edu/mailman/listinfo/moses-support


Re: [Moses-support] MMSAPT issue

2016-07-08 Thread Hieu Hoang
The placeholder symbol must be aligned 1-to-1. This is enforced by the
extract program - any extracted rule that breaks this constraint
is discarded.

Since PDBS doesn't use the extract program, I guess it causes the segfault
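The failing check is the one in Manager::GetPlaceholders; a minimal
sketch of the constraint it enforces (the helper name is mine):

  #include <cstddef>
  #include <set>

  // a placeholder at a given source position must be aligned to exactly
  // one target position, otherwise the hypothesis is rejected
  bool placeholderAlignmentOk(const std::set<std::size_t>& targetPos) {
    return targetPos.size() == 1;
  }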

Hieu Hoang
http://www.hoang.co.uk/hieu

On 8 July 2016 at 18:15, Prashant Mathur  wrote:

> Hi All,
>
> I am trying to work with PhraseDictionaryBitextSampling (PDBS) and I
> am running into a strange error.
>
> My partial config:
>
> [feature]
>
> PhraseDictionaryBitextSampling name=TranslationModel0 num-features=4
> path=$workdir/model/phrase-table-mmsapt.1/ input-factor=0 output-factor=0
> L1=en L2=es
>
> [weight]
>
> TranslationModel0= 0.2 0.2 0.2 0.2 0.2 0.2
>
>
> The above one with the default weights works (even though num-features=4
> and the number of weights is 6). Then I optimize the weights with MIRA,
> which works well. The following changes are made to the PDBS's weights:
>
> TranslationModel0= 0.000243188868964612 0.00397991229152541
> -0.0357229544822343 0.111425061122397 -0.0183311421377486
> 0.0682370333070735
>
>
> So, when I run Moses with the default weights on the following input, it
> works fine:
>
> feedback score is $num to  translation="$num" entity="50">$num
>
>
> When I run Moses with the tuned weights, it fails with the following error:
>
> Translating: feedback score is $num|25 to $num|50
> Line 0: Initialize search took 0.001 seconds total
> Line 0: Collecting options took 0.110 seconds at moses/Manager.cpp Line 141
> Line 0: Search took 0.845 seconds
> terminate called after throwing an instance of 'util::Exception'
> what():  moses/Manager.cpp:1810 in std::map<size_t, const Moses::Factor*>
> Moses::Manager::GetPlaceholders(const Moses::Hypothesis&, Moses::FactorType)
> const threw util::Exception because `targetPos.size() != 1'.
> Placeholder should be aligned to 1, and only 1, word
> Aborted
>
> When I change the first weight of TranslationModel0 to a higher value
> there is no such error. That is, if I change the weights to
>
> TranslationModel0= 0.2 0.00397991229152541 -0.0357229544822343
> 0.111425061122397 -0.0183311421377486 0.0682370333070735
>
> 1. So, what is the first feature in PDBS?
> 2. Is this behaviour expected if the weight is low?
> 3. Can placeholders be used with PDBS?
>
> Any clues appreciated.
>
> Thanks,
> Prashant
>
> ___
> Moses-support mailing list
> Moses-support@mit.edu
> http://mailman.mit.edu/mailman/listinfo/moses-support
>
>
___
Moses-support mailing list
Moses-support@mit.edu
http://mailman.mit.edu/mailman/listinfo/moses-support


[Moses-support] MMSAPT issue

2016-07-08 Thread Prashant Mathur
Hi All,

I am trying to work with PhraseDictionaryBitextSampling (PDBS) and I am
running into a strange error.

My partial config:

[feature]

PhraseDictionaryBitextSampling name=TranslationModel0 num-features=4
path=$workdir/model/phrase-table-mmsapt.1/ input-factor=0 output-factor=0
L1=en L2=es

[weight]

TranslationModel0= 0.2 0.2 0.2 0.2 0.2 0.2


The above one with the default weights works (even though num-features=4
and the number of weights is 6). Then I optimize the weights with MIRA, which
works well. The following changes are made to the PDBS's weights:

TranslationModel0= 0.000243188868964612 0.00397991229152541
-0.0357229544822343 0.111425061122397 -0.0183311421377486
0.0682370333070735


So, when I run Moses with the default weights on the following input, it
works fine:

feedback score is $num to $num


When I run Moses with the tuned weights, it fails with the following error:

Translating: feedback score is $num|25 to $num|50
Line 0: Initialize search took 0.001 seconds total
Line 0: Collecting options took 0.110 seconds at moses/Manager.cpp Line 141
Line 0: Search took 0.845 seconds
terminate called after throwing an instance of 'util::Exception'
what():  moses/Manager.cpp:1810 in std::map<size_t, const Moses::Factor*>
Moses::Manager::GetPlaceholders(const Moses::Hypothesis&, Moses::FactorType)
const threw util::Exception because `targetPos.size() != 1'.
Placeholder should be aligned to 1, and only 1, word
Aborted

When I change the first weight of TranslationModel0 to a higher value there
is no such error. That is, if I change the weights to

TranslationModel0= 0.2 0.00397991229152541 -0.0357229544822343
0.111425061122397 -0.0183311421377486 0.0682370333070735

1. So, what is the first feature in PDBS?
2. Is this behaviour expected if the weight is low?
3. Can placeholders be used with PDBS?

Any clues appreciated.

Thanks,
Prashant
___
Moses-support mailing list
Moses-support@mit.edu
http://mailman.mit.edu/mailman/listinfo/moses-support


[Moses-support] Problem in NeuralLMWrapper

2016-07-08 Thread Paco Zamora-Martínez
I have noticed a possible problem with hashCode generation in
NeuralLMWrapper. It seems that this hash is the composition of the hashes
of all n words given to the function getValue(). However, in my
understanding of the decoding process, done this way the decoder will
treat as different states that are identical at the LM level. As far as I
understand, this hash should be computed over the last n-1 words, because
it should represent the next LM state. In n-gram models, this next state is
the history received by the following LM lookup. I have created a pull
request to github.com/moses-smt/mosesdecoder solving this issue. Let me
know whether you think I have correctly identified the problem and whether
you agree with my solution.

The pull request is here: https://github.com/moses-smt/mosesdecoder/pull/161
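To illustrate, here is a minimal sketch of the idea (not the actual
NeuralLMWrapper code; the function name and types are mine):

  #include <cstddef>
  #include <vector>

  // Hash only the last n-1 words of the n-gram: that is the history the
  // next LM lookup receives, so hypotheses with identical histories can
  // recombine.
  std::size_t lmStateHash(const std::vector<int>& ngramWordIds) {
    std::size_t h = 0;
    for (std::size_t i = 1; i < ngramWordIds.size(); ++i) { // skip oldest word
      h = h * 31 + static_cast<std::size_t>(ngramWordIds[i]);
    }
    return h;
  }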

-- 
Pako ZM :)
https://github.com/pakozm/
___
Moses-support mailing list
Moses-support@mit.edu
http://mailman.mit.edu/mailman/listinfo/moses-support


Re: [Moses-support] implementing feature functions in new Moses

2016-07-08 Thread Hieu Hoang
Hieu Hoang
http://www.hoang.co.uk/hieu

On 8 July 2016 at 13:11, arefeh kazemi  wrote:

> Hi all
>
> I've implemented a feature function in the old version of Moses and it
> works fine.
> I'd like to re-implement my code in the new version of Moses, but it gets
> this error:
> ./moses/FF/Aref.h:330:50: error: cannot allocate an object of abstract
> type ‘Moses::ArefState’
> ./moses/FF/Aref.h:271:7: note: because the following virtual functions
> are pure within ‘Moses::ArefState’:
> In file included from ./moses/FF/Aref.h:5:0,
>                  from moses/FF/Factory.cpp:4:
> ./moses/FF/FFState.h:15:18: note: virtual size_t Moses::FFState::hash() const
> ./moses/FF/FFState.h:16:16: note: virtual bool
> Moses::FFState::operator==(const Moses::FFState&) const
>
> I guess I should implement the hash() and operator== functions, right?
> If so, could anyone please tell me what the task of these functions is?
>
Correct. In the old version of Moses, you had to implement operator<. This
has now changed: you have to implement hash() and operator==.

If two hypotheses are 'the same' according to your feature function, then
   1. hash() should return the same number for both hypotheses, and
   2. operator== should return true
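For example, a minimal sketch of what the state class could look like
(here ArefState tracks a single hypothetical word ID; store whatever your
feature actually needs):

  #include "moses/FF/FFState.h"
  #include <cstddef>

  namespace Moses {

  class ArefState : public FFState {
  public:
    explicit ArefState(size_t lastWordId) : m_lastWordId(lastWordId) {}

    // hypotheses that should recombine must get the same hash...
    size_t hash() const {
      return m_lastWordId;
    }

    // ...and must compare equal
    bool operator==(const FFState& other) const {
      const ArefState& o = static_cast<const ArefState&>(other);
      return m_lastWordId == o.m_lastWordId;
    }

  private:
    size_t m_lastWordId;
  };

  } // namespace Moses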

>
> --
> Arefeh Kazemi
>
> ___
> Moses-support mailing list
> Moses-support@mit.edu
> http://mailman.mit.edu/mailman/listinfo/moses-support
>
>
___
Moses-support mailing list
Moses-support@mit.edu
http://mailman.mit.edu/mailman/listinfo/moses-support


[Moses-support] implementing feature functions in new Moses

2016-07-08 Thread arefeh kazemi
Hi all

I've implemented a feature function in the old version of Moses and it
works fine.
I'd like to re-implement my code in the new version of Moses, but it gets
this error:
./moses/FF/Aref.h:330:50: error: cannot allocate an object of abstract type
‘Moses::ArefState’
./moses/FF/Aref.h:271:7: note: because the following virtual functions
are pure within ‘Moses::ArefState’:
In file included from ./moses/FF/Aref.h:5:0,
                 from moses/FF/Factory.cpp:4:
./moses/FF/FFState.h:15:18: note: virtual size_t Moses::FFState::hash() const
./moses/FF/FFState.h:16:16: note: virtual bool
Moses::FFState::operator==(const Moses::FFState&) const

I guess I should implement the hash() and operator== functions, right?
If so, could anyone please tell me what the task of these functions is?

-- 
Arefeh Kazemi
___
Moses-support mailing list
Moses-support@mit.edu
http://mailman.mit.edu/mailman/listinfo/moses-support