Hi Sanjanasri,

1) your corpus is very small, and you may have to use more iterations of NPLM training and smaller vocabulary sizes. Just to double-check, are you tuning your systems? MERT (or PRO or MIRA) should normally ensure that adding a model doesn't make BLEU go down.
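A minimal MERT invocation, for reference (a sketch with placeholder paths and file names; mert-moses.pl ships in the Moses scripts directory):

```shell
# Placeholder paths; point these at your own Moses checkout and dev set.
MOSES=/path/to/mosesdecoder
# Arguments: tuning source, tuning reference, decoder binary, moses.ini to tune.
$MOSES/scripts/training/mert-moses.pl dev.src dev.ref \
    $MOSES/bin/moses model/moses.ini \
    --mertdir "$MOSES/bin" --working-dir mert-work
```

The tuned feature weights end up in mert-work/moses.ini.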

2) I'm not sure which perplexity is for which model, but lower perplexity is better, so this makes sense.

3) a perplexity of 3 is *extremely* low. Do you have overlap between your test set and your training set? This would be an unrealistic test setting, and would explain why KenLM does so much better (because backoff n-gram models are good at memorizing things).
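A quick way to check for such overlap with standard Unix tools (a sketch; assumes one tokenized sentence per line, and the file names are placeholders):

```shell
# Count test sentences that also occur verbatim in the training data.
sort -u train.txt > train.sorted
sort -u test.txt  > test.sorted
# comm -12 prints only the lines common to both sorted files.
comm -12 train.sorted test.sorted | wc -l
```

Anything much above zero means the test set is partly memorized, which deflates perplexity, especially for backoff n-gram models.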

best wishes,
Rico


On 05.10.2015 09:27, Sanjanashree Palanivel wrote:
Dear Rico,

I tried using KenLM and NPLM for three language pairs, and I came across a series of questions, which I am listing one by one. It would be great if you could guide me.


1) I tested NPLM with different vocabulary sizes and training epochs. But the BLEU score I get from NPLM integrated with KenLM is smaller than the one I get with KenLM alone. In all three language pairs I see a consistent difference of about three points.

Eg: English to Hindi (KENLM-17.43, NPLM+KENLM-14.27)
    Tamil to Hindi (KENLM-16.66, NPLM+KENLM-13.53)
    Marathi to Hindi (KENLM-29.42, NPLM+KENLM-25.76)

The sentence count is 103502 and the unigram count is 89919. I used vocabulary sizes of 89000, 89700, and 89850 with validation sizes of 200, 200, and 100 respectively, and with different learning rates and numbers of epochs. However, the BLEU score of NPLM+KenLM is still lower than that of KenLM alone.


2) The model with a perplexity of about 385 has a higher BLEU score than the one with a perplexity of about 564. Is this the right model? I mean, the model with lower perplexity seems to give a better BLEU score. Where am I going wrong?


3) I used the query script on the KenLM model and found a perplexity of 3.4xx. The BLEU score of KenLM alone in the decoding phase is 16.66 for English-to-Hindi MT, but when combined with NPLM I get only 13.53.
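For context, this perplexity presumably comes from KenLM's query tool; it is typically run like this (a sketch; the binary lives in the KenLM/Moses bin directory, and the file names are placeholders):

```shell
# Score a tokenized test set against a binarized KenLM model.
bin/query model.bin < test.txt
# The summary at the end reports perplexity including and excluding OOVs.
```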

On Sun, Sep 20, 2015 at 8:07 PM, Sanjanashree Palanivel <sanjanash...@gmail.com> wrote:

    Dear Rico,

                Thanks a lot for your excellent guidance.

    On Sat, Sep 19, 2015 at 9:10 PM, Rico Sennrich <rico.sennr...@gmx.ch> wrote:

        Hi Sanjanasri,

        we have seen improvements in BLEU from having both KENLM and
        NPLM in our system. Things can go wrong during training though
        (e.g. a bad choice of hyperparameters (vocabulary size, number
        of training epochs)). I recommend using a development set
        during NPLM training, and comparing perplexity scores with
        those obtained from KENLM.
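A sketch of such a training run with the NPLM tools (option names as in the NPLM README; check them against your version, and treat the sizes as placeholders):

```shell
# Extract n-grams and the vocabulary, holding out a validation set.
prepareNeuralLM --train_text mono.txt --ngram_size 5 \
    --vocab_size 89000 --validation_size 500 \
    --write_words_file words.txt \
    --train_file train.ngrams --validation_file validation.ngrams

# Train; validation perplexity is reported after every epoch,
# which is what you would compare against KenLM's perplexity.
trainNeuralNetwork --train_file train.ngrams \
    --validation_file validation.ngrams \
    --num_epochs 10 --learning_rate 1 \
    --words_file words.txt --model_prefix model
```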

        maybe somebody else can help you with the phrase table
        normalization. NPLM doesn't have binarization.

        best wishes,
        Rico


        On 19/09/15 08:11, Sanjanashree Palanivel wrote:
        Dear Rico,

        I made the necessary changes and trained the language
        model successfully. The NPLM language model gives me a lower
        BLEU score than KenLM. When I use the two models together,
        accuracy is higher than with NPLM alone but lower than with
        KenLM. I am trying to tune it by changing the parameters; so
        far the accuracy keeps improving, but it is not close to
        KenLM's accuracy. Is it worth doing, given that it takes
        quite a long time to train?

         I also tried to binarize the phrase table following
        http://www.statmt.org/moses/?n=Advanced.RuleTables#ntoc3, and
        compilation with Moses finished successfully. But when I run
        processPhraseTableMin -threads 3 -in train/model/phrase-table.gz
        -nscores 4 -out binarised-model/phrase-table
        I get a segmentation fault. I don't know what is wrong. Is it
        something to do with threads? Also, how do I binarize the NPLM model?

        On Fri, Sep 18, 2015 at 11:27 AM, Sanjanashree Palanivel <sanjanash...@gmail.com> wrote:

            Dear Rico,

                         Thanks a lot. Will do the necessary changes


            On Thu, Sep 17, 2015 at 1:54 PM, Rico Sennrich <rico.sennr...@gmx.ch> wrote:

                Hi Sanjanasri,

                if you first compiled moses without the option
                '--with-nplm', and then add the option later, the
                build system isn't smart enough to know which files
                it needs to recompile. if you change one of the
                compile options, use the option '-a' to force
                recompilation from scratch.
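Concretely, with the NPLM path from the build log quoted in this thread:

```shell
# '-a' tells bjam to rebuild all targets from scratch.
./bjam --with-nplm=/home/sanjana/Documents/SMT/NPLM/nplm -a
```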

                best wishes,
                Rico




                On 16/09/15 06:30, Sanjanashree Palanivel wrote:
                Dear Rico,


                I did the following steps


                    1. Installed NPLM and trained a language model
                    2. I compiled Moses with the command
                    ./bjam --with-nplm=path/to/nplm

                              ./bjam
                    --with-nplm=/home/sanjana/Documents/SMT/NPLM/nplm
                    Tip: install tcmalloc for faster threading. See
                    BUILD-INSTRUCTIONS.txt for more information.
                    warning: No toolsets are configured.
                    warning: Configuring default toolset "gcc".
                    warning: If the default is wrong, your build may
                    not work correctly.
                    warning: Use the "toolset=xxxxx" option to
                    override our guess.
                    warning: For more configuration options, please
                    consult
                    warning: http://boost.org/boost-build2/doc/html/bbv2/advanced/configuration.html
                    NOT BUILDING MOSES SERVER!
                    Performing configuration checks

                        - Shared Boost : yes (cached)
                        - Static Boost : yes (cached)
                    ...patience...
                    ...patience...
                    ...found 4823 targets...
                    SUCCESS

                    3. I added the following lines to the
moses.ini file
                         NeuralLM factor=0 name=LM1 order=5
                        path=/path/to/nplmmodel
                        LM1= 0.5
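In a current moses.ini those lines go into the [feature] and [weight] sections respectively; a minimal sketch (same placeholder path as above):

```ini
[feature]
NeuralLM factor=0 name=LM1 order=5 path=/path/to/nplmmodel

[weight]
LM1= 0.5
```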

                Then I did testing and ended up with the error


                On Tue, Sep 15, 2015 at 8:43 PM, Rico Sennrich <rico.sennr...@gmx.ch> wrote:

                    Hi Sanjanasri,

                    this error occurs when Moses was compiled
                    without the option '--with-nplm'.

                    best wishes,
                    Rico



                    On 15.09.2015 15:08, Sanjanashree Palanivel wrote:
                    Dear Rico,

                                I updated Moses and NPLM has been
                    compiled successfully with Moses. However, when
                    I perform decoding I am getting an error.

                        Defined parameters (per moses.ini or switch):
                            config: /home/sanjana/Documents/SMT/ICON15/Health/BL/Ta_H/model/moses.ini

                        distortion-limit: 6
                            feature: UnknownWordPenalty WordPenalty
                        PhrasePenalty PhraseDictionaryMemory
                        name=TranslationModel0 num-features=4
                        path=/home/sanjana/Documents/SMT/ICON15/Health/BL/Ta_H/model/phrase-table.gz
                        input-factor=0 output-factor=0 Distortion
                        KENLM lazyken=0 name=LM0 factor=0
                        path=/home/sanjana/Documents/SMT/LM/Hindi/monolin80k.hi1.bin
                        order=3 NeuralLM factor=0 name=LM1 order=3
                        path=/home/sanjana/Documents/SMT/LM/Hindi/hin_out.txt

                        input-factors: 0
                            mapping: 0 T 0
                            weight: Distortion0= 0.136328 LM0=
                        0.135599 LM1= 0.5 WordPenalty0= -0.488892
                        PhrasePenalty0= 0.0826147
                        TranslationModel0= 0.0104273 0.0663914
                        0.0254094 0.0543384 UnknownWordPenalty0= 1
                        line=UnknownWordPenalty
                        FeatureFunction: UnknownWordPenalty0 start:
                        0 end: 0
                        line=WordPenalty
                        FeatureFunction: WordPenalty0 start: 1 end: 1
                        line=PhrasePenalty
                        FeatureFunction: PhrasePenalty0 start: 2 end: 2
                        line=PhraseDictionaryMemory
                        name=TranslationModel0 num-features=4
                        path=/home/sanjana/Documents/SMT/ICON15/Health/BL/Ta_H/model/phrase-table.gz
                        input-factor=0 output-factor=0
                        FeatureFunction: TranslationModel0 start: 3
                        end: 6
                        line=Distortion
                        FeatureFunction: Distortion0 start: 7 end: 7
                        line=KENLM lazyken=0 name=LM0 factor=0
                        path=/home/sanjana/Documents/SMT/LM/Hindi/monolin80k.hi1.bin
                        order=3
                        FeatureFunction: LM0 start: 8 end: 8
                        line=NeuralLM factor=0 name=LM1 order=3
                        path=/home/sanjana/Documents/SMT/LM/Hindi/hin_out.txt
                        Exception: moses/FF/Factory.cpp:349 in void
                        Moses::FeatureRegistry::Construct(const
                        string&, const string&) threw
                        UnknownFeatureException because `i ==
                        registry_.end()'.
                        Feature name NeuralLM is not registered.


                    I added the following two lines to my moses.ini file

                     NeuralLM factor=0 name=LM1 order=5
                    path=/path/to/nplmmodel
                    LM1= 0.5



                    On Tue, Sep 15, 2015 at 5:06 PM, Sanjanashree Palanivel <sanjanash...@gmail.com> wrote:

                        Thank you for your earnest response. I will
                        update moses and I will try

                        On Tue, Sep 15, 2015 at 4:22 PM, Rico Sennrich <rico.sennr...@gmx.ch> wrote:

                            Hello Sanjanasri,

                            this looks like a version mismatch
                            between Moses and NPLM. Specifically,
                            you're using an older Moses commit that
                            is only compatible with nplm 0.2 (more
                            precisely, Kenneth's fork at
                            https://github.com/kpu/nplm ).

                            If you use the latest Moses version
                            from
                            https://github.com/moses-smt/mosesdecoder
                            , and the latest nplm version from
                            https://github.com/moses-smt/nplm , it
                            should work.

                            best wishes,
                            Rico


                            On 15.09.2015 08:24, Sanjanashree Palanivel wrote:

                            Dear all,

                            I tried building a language model using
                            NPLM. The language model was built
                            successfully, but when I tried to compile
                            NPLM with Moses using "./bjam
                            --with-nplm=path/to/nplm" I am getting an
                            error. I am using Boost 1.55. I am
                            attaching the log file for reference. I
                            don't know where I went wrong. Any help
                            would be appreciated.


                            --
                            Thanks and regards,
                            Sanjanasri J.P


                            _______________________________________________
                            Moses-support mailing list
                            Moses-support@mit.edu
                            http://mailman.mit.edu/mailman/listinfo/moses-support




































--
Thanks and regards,

Sanjanasri J.P

