Hi Sanjanasri,
1) your corpus is very small, and you may have to use more iterations of
NPLM training and smaller vocabulary sizes. Just to double-check, are
you tuning your systems? MERT (or PRO or MIRA) should normally ensure
that adding a model doesn't make BLEU go down.
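For reference, a standard MERT run with Moses looks roughly like this
(paths and file names are placeholders):

    $MOSES/scripts/training/mert-moses.pl dev.src dev.ref \
        $MOSES/bin/moses model/moses.ini --mertdir $MOSES/bin

The tuned weights end up in mert-work/moses.ini, which you would then
use for decoding the test set.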
2) I'm not sure which perplexity is for which model, but lower
perplexity is better, so this makes sense.
3) a perplexity of 3 is *extremely* low. Do you have overlap between
your test set and your training set? This would be an unrealistic test
setting, and would explain why KenLM does so much better (because
backoff n-gram models are good at memorizing things).
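A quick way to check is to count test sentences that also occur
verbatim in the training data, e.g. (file names are placeholders):

    comm -12 <(sort -u train.hi) <(sort -u test.hi) | wc -l

If that number is a sizable fraction of your test set, the perplexity
and BLEU numbers won't be meaningful.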
best wishes,
Rico
On 05.10.2015 09:27, Sanjanashree Palanivel wrote:
Dear Rico,
I tried using KenLM and NPLM for three language
pairs, and I came across a series of questions. I am listing them one
by one. It would be great if you could guide me.
1) I did testing for NPLM with different vocabulary sizes and training
epochs. But the BLEU score I get from NPLM integrated with KenLM is
lower than the one I get with KenLM alone. In all three language pairs
I see a consistent difference of about three points.
E.g.: English to Hindi (KenLM-17.43, NPLM+KenLM-14.27)
Tamil to Hindi (KenLM-16.66, NPLM+KenLM-13.53)
Marathi to Hindi (KenLM-29.42, NPLM+KenLM-25.76)
The sentence count is 103502 and the unigram count is 89919. I gave
vocabulary sizes of 89000, 89700, and 89850 with validation sizes of
200, 200, and 100 respectively, and tried different learning rates and
numbers of epochs. However, the BLEU score of NPLM+KenLM remains lower.
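For reference, I follow the standard NPLM pipeline, roughly like this
(the values shown are one of the settings I tried; mono.hi is a
placeholder for my monolingual data, and flag names may differ
slightly between NPLM versions, so please correct me if this is not
the intended usage):

    prepareNeuralLM --train_text mono.hi --ngram_size 5 \
        --vocab_size 89000 --validation_size 200 \
        --write_words_file words \
        --train_file train.ngrams --validation_file validation.ngrams
    trainNeuralNetwork --train_file train.ngrams \
        --validation_file validation.ngrams \
        --num_epochs 10 --learning_rate 1 \
        --words_file words --model_prefix hin_out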
2) The model with a perplexity of about 385 has a higher BLEU score
than the one with a perplexity of about 564. Is this the right model?
I mean, the model with lower perplexity seems to give the better BLEU
score. Where am I going wrong?
3) I used the query script on the KenLM model and found a perplexity
of 3.4xx. The BLEU score of KenLM alone in the decoding phase is 16.66
for English to Hindi MT, but when combined with NPLM I get only 13.53.
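For reference, I computed that perplexity with KenLM's query tool,
roughly like this (test.hi is a placeholder for my test file):

    $MOSES/bin/query /home/sanjana/Documents/SMT/LM/Hindi/monolin80k.hi1.bin < test.hi

and read the perplexity from the summary printed at the end.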
On Sun, Sep 20, 2015 at 8:07 PM, Sanjanashree Palanivel
<sanjanash...@gmail.com> wrote:
Dear Rico,
Thanks a lot for your excellent guidance.
On Sat, Sep 19, 2015 at 9:10 PM, Rico Sennrich
<rico.sennr...@gmx.ch> wrote:
Hi Sanjanasri,
We have seen improvements in BLEU from having both KenLM and
NPLM in our system. Things can go wrong during training, though
(e.g. a bad choice of hyperparameters, such as vocabulary size
or number of training epochs). I recommend using a development
set during NPLM training, and comparing the perplexity scores
with those obtained from KenLM.
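For the comparison, score the same held-out text with both models. A
rough sketch (testNeuralNetwork is NPLM's scoring tool and model.10
stands for the model written after the 10th epoch; exact flag names
may differ between versions, so check each tool's --help):

    # NPLM: log-likelihood of the validation n-grams from prepareNeuralLM
    testNeuralNetwork --test_file validation.ngrams --model_file model.10
    # KenLM: perplexity summary over the same raw text
    $MOSES/bin/query lm.bin < validation.txt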
Maybe somebody else can help you with the phrase table
binarization. NPLM doesn't have binarization.
best wishes,
Rico
On 19/09/15 08:11, Sanjanashree Palanivel wrote:
Dear Rico,
I did the necessary changes and trained the language
model successfully. The NPLM language model gives me a lower
BLEU score than KenLM. When I use the two models together, the
score is higher than with NPLM alone but still lower than with
KenLM. I am trying to tune it by changing the parameters. So
far the accuracy keeps improving, but it is not close to
KenLM's accuracy. Is it worth continuing, given that training
takes quite a long time?
I also tried to binarize the phrase table following
http://www.statmt.org/moses/?n=Advanced.RuleTables#ntoc3, and
compilation with Moses completed successfully. But when I run

processPhraseTableMin -threads 3 -in train/model/phrase-table.gz -nscores 4 -out binarised-model/phrase-table

I get a segmentation fault. I don't know what is wrong. Does it have
something to do with the threads?
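For completeness, the full sequence was roughly the following;
creating the output directory first is my own guess at the intended
usage, since I am not sure the tool creates it itself:

    mkdir -p binarised-model
    processPhraseTableMin -threads 3 -in train/model/phrase-table.gz \
        -nscores 4 -out binarised-model/phrase-table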
Also, how do I binarize the NPLM model?
On Fri, Sep 18, 2015 at 11:27 AM, Sanjanashree Palanivel
<sanjanash...@gmail.com> wrote:
Dear Rico,
Thanks a lot. Will do the necessary changes
On Thu, Sep 17, 2015 at 1:54 PM, Rico Sennrich
<rico.sennr...@gmx.ch> wrote:
Hi Sanjanasri,
If you first compiled Moses without the option
'--with-nplm' and then add the option later, the
build system isn't smart enough to know which files
it needs to recompile. If you change one of the
compile options, use the option '-a' to force
recompilation from scratch.
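For example, with the path from your earlier message:

    ./bjam --with-nplm=/home/sanjana/Documents/SMT/NPLM/nplm -a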
best wishes,
Rico
On 16/09/15 06:30, Sanjanashree Palanivel wrote:
Dear Rico,
I did the following steps:
1. Installed NPLM and trained a language model
2. I compiled Moses with NPLM using the command
./bjam --with-nplm=path/to/nplm:

./bjam --with-nplm=/home/sanjana/Documents/SMT/NPLM/nplm
Tip: install tcmalloc for faster threading. See BUILD-INSTRUCTIONS.txt for more information.
warning: No toolsets are configured.
warning: Configuring default toolset "gcc".
warning: If the default is wrong, your build may not work correctly.
warning: Use the "toolset=xxxxx" option to override our guess.
warning: For more configuration options, please consult
warning: http://boost.org/boost-build2/doc/html/bbv2/advanced/configuration.html
NOT BUILDING MOSES SERVER!
Performing configuration checks
- Shared Boost : yes (cached)
- Static Boost : yes (cached)
...patience...
...patience...
...found 4823 targets...
SUCCESS
3. I added the following lines to the moses.ini file:
NeuralLM factor=0 name=LM1 order=5 path=/path/to/nplmmodel
LM1= 0.5
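To be precise, these go under the [feature] and [weight] sections of
moses.ini respectively, which is where I placed them:

    [feature]
    NeuralLM factor=0 name=LM1 order=5 path=/path/to/nplmmodel

    [weight]
    LM1= 0.5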
Then I ran testing and ended up with the same error as before.
On Tue, Sep 15, 2015 at 8:43 PM, Rico Sennrich
<rico.sennr...@gmx.ch> wrote:
Hi Sanjanasri,
This error occurs when Moses was compiled
without the option '--with-nplm'.
best wishes,
Rico
On 15.09.2015 15:08, Sanjanashree Palanivel wrote:
Dear Rico,
I updated Moses, and NPLM has been
compiled successfully with Moses. However, when
I perform decoding I am getting an error:
Defined parameters (per moses.ini or switch):
  config: /home/sanjana/Documents/SMT/ICON15/Health/BL/Ta_H/model/moses.ini
  distortion-limit: 6
  feature: UnknownWordPenalty WordPenalty PhrasePenalty PhraseDictionaryMemory name=TranslationModel0 num-features=4 path=/home/sanjana/Documents/SMT/ICON15/Health/BL/Ta_H/model/phrase-table.gz input-factor=0 output-factor=0 Distortion KENLM lazyken=0 name=LM0 factor=0 path=/home/sanjana/Documents/SMT/LM/Hindi/monolin80k.hi1.bin order=3 NeuralLM factor=0 name=LM1 order=3 path=/home/sanjana/Documents/SMT/LM/Hindi/hin_out.txt
  input-factors: 0
  mapping: 0 T 0
  weight: Distortion0= 0.136328 LM0= 0.135599 LM1= 0.5 WordPenalty0= -0.488892 PhrasePenalty0= 0.0826147 TranslationModel0= 0.0104273 0.0663914 0.0254094 0.0543384 UnknownWordPenalty0= 1
line=UnknownWordPenalty
FeatureFunction: UnknownWordPenalty0 start: 0 end: 0
line=WordPenalty
FeatureFunction: WordPenalty0 start: 1 end: 1
line=PhrasePenalty
FeatureFunction: PhrasePenalty0 start: 2 end: 2
line=PhraseDictionaryMemory name=TranslationModel0 num-features=4 path=/home/sanjana/Documents/SMT/ICON15/Health/BL/Ta_H/model/phrase-table.gz input-factor=0 output-factor=0
FeatureFunction: TranslationModel0 start: 3 end: 6
line=Distortion
FeatureFunction: Distortion0 start: 7 end: 7
line=KENLM lazyken=0 name=LM0 factor=0 path=/home/sanjana/Documents/SMT/LM/Hindi/monolin80k.hi1.bin order=3
FeatureFunction: LM0 start: 8 end: 8
line=NeuralLM factor=0 name=LM1 order=3 path=/home/sanjana/Documents/SMT/LM/Hindi/hin_out.txt
Exception: moses/FF/Factory.cpp:349 in void Moses::FeatureRegistry::Construct(const string&, const string&) threw UnknownFeatureException because `i == registry_.end()'.
Feature name NeuralLM is not registered.
I added the following two lines to my moses.ini file:
NeuralLM factor=0 name=LM1 order=5 path=/path/to/nplmmodel
LM1= 0.5
On Tue, Sep 15, 2015 at 5:06 PM, Sanjanashree Palanivel
<sanjanash...@gmail.com> wrote:
Thank you for your earnest response. I will
update Moses and try again.
On Tue, Sep 15, 2015 at 4:22 PM, Rico Sennrich
<rico.sennr...@gmx.ch> wrote:
Hello Sanjanasri,
This looks like a version mismatch
between Moses and NPLM. Specifically,
you're using an older Moses commit that
is only compatible with nplm 0.2 (or
more specifically, Kenneth's fork at
https://github.com/kpu/nplm).
If you use the latest Moses version from
https://github.com/moses-smt/mosesdecoder
and the latest nplm version from
https://github.com/moses-smt/nplm, it
should work.
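A minimal sketch of the update (build steps abbreviated; follow
each project's own build instructions):

    git clone https://github.com/moses-smt/nplm
    # build nplm as described in its README
    git clone https://github.com/moses-smt/mosesdecoder
    cd mosesdecoder
    ./bjam --with-nplm=../nplm -a

The '-a' forces a full rebuild, which is needed whenever the
compile options change.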
best wishes,
Rico
On 15.09.2015 08:24, Sanjanashree Palanivel wrote:
Dear all,
I tried building a language model using
NPLM. The language model was built
successfully, but when I tried to
compile Moses with NPLM using "./bjam
--with-nplm=path/to/nplm" I got an
error. I am using Boost 1.55. I am
attaching the log file for reference.
I don't know where I went wrong. Any
help would be appreciated.
--
Thanks and regards,
Sanjanasri J.P
_______________________________________________
Moses-support mailing list
Moses-support@mit.edu
http://mailman.mit.edu/mailman/listinfo/moses-support