Re: [Moses-support] CreateProbingPT2 exception

2017-05-02 Thread Nikolay Bogoychev
Hey Mike,

Is it possible for you to make the phrase table
/home/mike/stelae-projects/de-en/phrasemodel/tmp.15419/pt.txt.gz
available publicly so we can try to reproduce the problem?

Cheers,

Nick

On Tue, May 2, 2017 at 6:06 AM, Mike Ladwig  wrote:
> Got an exception creating a PT2 with yesterday's master:
>
> Binarize phrase and reordering model in probing table format:
> /home/mike/stelae5/mosesdecoder/scripts/generic/binarize4moses2.perl
> --phrase-table=/home/mike/stelae-projects/de-en/phrasemodel/model/phrase-table.gz
> --lex-ro=/home/mike/stelae-projects/de-en/phrasemodel/model/reordering-table.wbe-msd-bidirectional-fe.gz
> --output-dir=/home/mike/stelae-projects/de-en/phrasemodel/PT2
> --num-lex-scores=6
> Executing: gzip -dc
> /home/mike/stelae-projects/de-en/phrasemodel/model/phrase-table.gz |
> /home/mike/stelae5/mosesdecoder/scripts/generic/../../contrib/sigtest-filter/filter-pt
> -n 0 | gzip -c >
> /home/mike/stelae-projects/de-en/phrasemodel/tmp.15419/pt.gz
> sh:
> /home/mike/stelae5/mosesdecoder/scripts/generic/../../contrib/sigtest-filter/filter-pt:
> No such file or directory
> Executing:
> /home/mike/stelae5/mosesdecoder/scripts/generic/../../bin/processLexicalTableMin
> -in
> /home/mike/stelae-projects/de-en/phrasemodel/model/reordering-table.wbe-msd-bidirectional-fe.gz
> -out /home/mike/stelae-projects/de-en/phrasemodel/tmp.15419/lex-ro -T .
> -threads all
> Used options:
> Text reordering table will be read from:
> /home/mike/stelae-projects/de-en/phrasemodel/model/reordering-table.wbe-msd-bidirectional-fe.gz
> Output reordering table will be written to:
> /home/mike/stelae-projects/de-en/phrasemodel/tmp.15419/lex-ro.minlexr
> Step size for source landmark phrases: 2^10=1024
> Phrase fingerprint size: 16 bits / P(fp)=1.52588e-05
> Single Huffman code set for score components: no
> Using score quantization: no
> Running with 24 threads
>
> Pass 1/2: Creating phrase index + Counting scores
> ..[500]
> ..[1000]
> ..[1500]
> ..[2000]
> ..[2500]
> ..[3000]
> ..[3500]
> ..[4000]
> ..[4500]
> ..[5000]
> ..[5500]
> ..[6000]
> ..[6500]
> ..[7000]
> ..[7500]
> 
>
> Intermezzo: Calculating Huffman code sets
> Creating Huffman codes for 32003 scores
> Creating Huffman codes for 16732 scores
> Creating Huffman codes for 31335 scores
> Creating Huffman codes for 32076 scores
> Creating Huffman codes for 15096 scores
> Creating Huffman codes for 31659 scores
>
> Pass 2/2: Compressing scores
> ..[500]
> ..[1000]
> ..[1500]
> ..[2000]
> ..[2500]
> ..[3000]
> ..[3500]
> ..[4000]
> ..[4500]
> ..[5000]
> ..[5500]
> ..[6000]
> ..[6500]
> ..[7000]
> ..[7500]
> 
>
> Saving to
> /home/mike/stelae-projects/de-en/phrasemodel/tmp.15419/lex-ro.minlexr
> Done
> Executing:
> /home/mike/stelae5/mosesdecoder/scripts/generic/../../bin/addLexROtoPT
> /home/mike/stelae-projects/de-en/phrasemodel/tmp.15419/pt.gz
> /home/mike/stelae-projects/de-en/phrasemodel/tmp.15419/lex-ro.minlexr  |
> gzip -c >
> /home/mike/stelae-projects/de-en/phrasemodel/tmp.15419/pt.withLexRO.gz
> Executing: ln -s pt.withLexRO.gz
> /home/mike/stelae-projects/de-en/phrasemodel/tmp.15419/pt.txt.gz
> Executing:
> /home/mike/stelae5/mosesdecoder/scripts/generic/../../bin/CreateProbingPT2
> --num-scores 4 --log-prob --input-pt
> /home/mike/stelae-projects/de-en/phrasemodel/tmp.1
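A note on the log above: the pipeline had already failed at the significance-filtering step ("filter-pt: No such file or directory"), so the pt.gz handed to the later steps was most likely empty, which alone could explain the CreateProbingPT2 exception. filter-pt is not built by default; a hedged sketch of building it, assuming a SALM installation (path illustrative):

cd mosesdecoder/contrib/sigtest-filter
make SALMDIR=/path/to/SALM   # assumption: SALM has been downloaded and compiled at this path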

Re: [Moses-support] Rebuilding moses binary only

2017-03-30 Thread Nikolay Bogoychev
I've been asking this same question since late 2013...
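For what it's worth, stock Boost.Build does accept an explicit target reference on the command line, so something along these lines may work — the target path is an assumption, and whether mosesdecoder's Jamfiles expose the binary this way is exactly the open question:

./bjam moses-cmd//moses -j8   # hypothetical target path, not confirmed for mosesdecoder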

On Thu, Mar 30, 2017 at 10:30 PM, Marcin Junczys-Dowmunt
 wrote:
> Hi list,
>
> is there a way to tell bjam to only rebuild the moses binary and not the
> 84 unrelated targets that just happen to be rebuilt out of solidarity?
>
> Thanks,
>
> Marcin


Re: [Moses-support] BilingualNPLM: A target phrase with no alignments detected!

2016-03-19 Thread Nikolay Bogoychev
Hey Jeremy,

The full error message should read: "A target phrase with no alignments
detected! " << targetPhrase << " Check if there is something wrong with your
phrase table." — that is, it should include the target phrase in question.
My guess is that you are using PhraseDictionaryCompact as your phrase table,
which in some cases is known to produce target phrases without alignments.
My first suggestion would be to check your phrase table and see whether that
target phrase has alignments. If it does, the alignment was probably lost
during phrase table binarization. I would suggest that you use a different
phrase table, or set EMS to use a single thread, which should help avoid the
problem.
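A hedged way to check (path and phrase illustrative): in a standard Moses text phrase table the word alignment is the fourth |||-separated field, so the entry can be inspected directly:

zcat model/phrase-table.gz | grep -F '||| the offending phrase |||' | head -1
# expected shape:  src words ||| tgt words ||| scores ||| 0-0 1-1 2-2 ||| counts
# an empty fourth field means the entry really carries no alignment points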

Cheers,

Nick

On Wed, Mar 16, 2016 at 2:49 PM, Jeremy Gwinnup  wrote:

> Hi,
>
> I’m attempting to use a BilingualNPLM (trained per the recipe on the moses
> website) in decoding - I get ‘A target phrase with no alignments detected!’
> error. All data used in training the model were products of a training run
> in EMS. I’m using the recommended NPLM settings with the exception of
> setting the input embedding to 750.
>
> Any ideas as to if I need to train differently?
>
> Thanks!
> -Jeremy


Re: [Moses-support] Bilingual neural lm, log-likelihood: -nan

2015-09-21 Thread Nikolay Bogoychev
Hey Jian,

I have encountered this problem with nplm myself and couldn't really find a
solution that works every time.

Basically what happens is that a token that occurs very frequently in the
same position gets huge weights, which eventually become not-a-number, and
that propagates to the rest of the data. This usually happens with the
beginning-of-sentence token, especially if your source and target context
sizes are big. One thing you could do is decrease the source and target
context sizes (doesn't always work). Another is to lower the learning rate
(always works, but you might need to set it quite low, like 0.25).
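In terms of the training command quoted below, that just means changing one flag, e.g.:

trainNeuralNetwork ... --learning_rate 0.25   # instead of --learning_rate 1; halve again if it still produces nan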

The proper solution, according to Ashish Vaswani, the creator of nplm, is to
use gradient clipping, which is commented out in his code. You should
contact him, because this is an nplm issue.

Cheers,

Nick

On Sat, Sep 19, 2015 at 8:58 PM, jian zhang  wrote:

> Hi all,
>
> I got
>
> Epoch 
> Current learning rate: 1
> Training minibatches: Validation log-likelihood: -nan
>perplexity: nan
>
> during bilingual neural lm training.
>
> I use command:
> /home/user/tools/nplm-master-rsennrich/src/trainNeuralNetwork --train_file
> work_dir/blm/train.numberized --num_epochs 30 --model_prefix
> work_dir/blm/train.10k.model.nplm --learning_rate 1 --minibatch_size 1000
> --num_noise_samples 100 --num_hidden 2 --input_embedding_dimension 512
> --output_embedding_dimension 192 --num_threads 6 --loss_function log
> --activation_function tanh --validation_file work_dir/blm/valid.numberized
> --validation_minibatch_size 10
>
> where the train.numberized and valid.numberized files are split from the
> file generated by the script
> ${moses}/scripts/training/bilingual-lm/extract_training.py.
>
> Training/Validation numbers are:
> Number of training instances: 4128195
> Number of validation instances: 217274
>
>
> Thanks,
>
> Jian
>
>
> Jian Zhang
> Centre for Next Generation Localisation (CNGL)
> 
> Dublin City University 
>


Re: [Moses-support] nplm ngram total order in ems

2015-08-01 Thread Nikolay Bogoychev
Hey John,

This is correct. Imagine order 5 and a source window of 4: the target side
contributes 5 tokens, and the aligned source word s0 contributes itself plus
4 tokens before and 4 tokens after it. That is 5 + (2*4 + 1) = 14 tokens,
which results in a 14-gram in total.

Cheers,

Nick

On Sat, Aug 1, 2015 at 4:30 PM, John Joseph Morgan <
johnjosephmor...@gmail.com> wrote:

> I’m trying to run the toy bilingualnplm example with ems.
> The ngram order gets computed in experiment.perl on line 1868.
> The formula is:
> $order + 2 * $source_window + 1
> If $order is 5 and $source_window is 4 this formula gives 14.
> Is this correct?
> It doesn't seem right.
>
> John


Re: [Moses-support] Parallelizer multi core

2015-07-31 Thread Nikolay Bogoychev
Hey,

I have opposed this change in the past for two reasons:

Using more than 4 threads doesn't help unless the user is using
PhraseDictionaryCompact; see this issue:
https://github.com/moses-smt/mosesdecoder/issues/39. In fact, on most
machines you rarely want to run moses on all available threads.

Also, "-threads all" picks up virtual (hyper-threaded) cores, which are
actually harmful to performance.

If you want to change the default, I think it would be better to pick a sane
value like 4. It would boost performance for most people, and on machines
with fewer available cores it would not be too bad.
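Concretely, that means pinning the count rather than using "all" (paths illustrative; -threads is the decoder's usual flag):

moses -f moses.ini -threads 4 < input.txt > output.txt

or, via EMS, the --decoder-flags="-threads 4" approach Vincent mentions below.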

Cheers,

Nick
On 31 Jul 2015 7:31 pm, "Hieu Hoang"  wrote:

> good suggestion. Changed:
>
> https://github.com/moses-smt/mosesdecoder/commit/f894dec0fd8d5b15eb16c35d3d2599338894ee9d
> if you have any more suggestions, it's best if you can just send me a patch
> and I'll check it in
>
> On 31/07/2015 15:59, Vincent Nguyen wrote:
>
>
> for inexperienced people like me :)
> Adding --decoder-flags="-threads 4" is key.
>
> if EMS config.basic had "-threads all" by default we would gain A LOT of
> time.
>
> cheers,
>
> Vincent
>
>
> On 29/07/2015 22:05, Vincent Nguyen wrote:
>
> Hi,
>
> I am wondering what tasks of the EMS are really parallelized.
> I activated the script line + 8 cores.
>
> Training / binarizing / tuning all make only one core actually work.
>
> Am I correct?
> --
> Hieu Hoang
> Researcher
> New York University, Abu Dhabi
> http://www.hoang.co.uk/hieu


Re: [Moses-support] nplm / Bilingual LM

2015-07-03 Thread Nikolay Bogoychev
Hey Marwa,

I can't reproduce the problem. Using the latest moses git and nplm from
https://github.com/rsennrich/nplm, it compiles just fine and I get both the
BilingualNPLM and NeuralLM feature functions.
I can suggest that you do a ./bjam clean and try recompiling again.
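Roughly (paths illustrative): note that the shell does not tilde-expand ~ after the = in --with-nplm=~/..., and the other posts in this archive pass the nplm root rather than its src directory, so an absolute root path is the safer bet:

cd mosesdecoder
./bjam clean
./bjam --with-nplm=/home/user/nplm-master -j4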

Cheers,

Nick

On Fri, Jul 3, 2015 at 12:19 PM, Marwa Refaie 
wrote:

> Hi
> For a month both the nplm & Bilingual NPLM were working great; suddenly
> they seem not to be linked into moses! I try to recompile, but it
> always adds nothing:
>
>  ./bjam --with-nplm=~/nplm-master/src
>
> Tip: install tcmalloc for faster threading.  See BUILD-INSTRUCTIONS.txt
> for more information.
> mkdir: cannot create directory ‘bin’: File exists
> warning: No toolsets are configured.
> warning: Configuring default toolset "gcc".
> warning: If the default is wrong, your build may not work correctly.
> warning: Use the "toolset=x" option to override our guess.
> warning: For more configuration options, please consult
> warning:
> http://boost.org/boost-build2/doc/html/bbv2/advanced/configuration.html
> ...patience...
> ...patience...
> ...found 4625 targets...
> SUCCESS
>
> Then when I try the feature "NeuralLM" or "BilingualNPLM":
>
> Feature name NeuralLM is not registered.
> Feature name BilingualNPLM is not registered
>
> Any suggestion please ??
>
> Marwa N. Refaie
>


Re: [Moses-support] ProbingPT creation and factor support

2015-04-22 Thread Nikolay Bogoychev
Hey,

ProbingPT supports reading from gzipped files, but I don't think it supports
factors; I didn't code for factors specifically anyway. I am not sure
whether factor support has to be built into the phrase table, or whether it
is independent of it and part of moses.

Cheers,

Nick
On 22 Apr 2015 6:03 pm, "Jeremy Gwinnup"  wrote:

> Hi,
>
> I’ve got a 2-part question:
>
> Does ProbingPT work on gzip’d phrase tables, and if so, does it support
> phrase tables with multiple factors?
>
> Thanks!


Re: [Moses-support] bilingual LM (nan nan nan)

2015-04-21 Thread Nikolay Bogoychev
Hey Marwa,
We have been having this problem with NPLM and have found no "real
solution"; there have been a couple of threads on the mailing list about it
so far. Basically, the workaround we use is to lower the learning rate (from
1 to 0.5; if 0.5 doesn't work, to 0.25, and so on) and to increase the
number of generations you produce to compensate. Alternatively, you may try
the experimental gradient clipping code that Ashish implemented. Here's a
quote from his email:
>
> You should be able to download the version of the nplm where the updates
> (gradient*learning_rate) are clipped between +5 and -5
> http://www.isi.edu/~avaswani/nplm_clipped.tar.gz
> If you want to change the magnitude of the update, please change it inside
> struct Clipper {
>   double operator()(double x) const {
>     return std::min(5., std::max(x, -5.));
>     // return x;
>   }
> };
>
> in neuralClasses.h
> Right now, the clipping has been implemented only for standard SGD
> training, and not for adagrad or adadelta.


Cheers,

Nick

On Tue, Apr 21, 2015 at 6:17 AM, Marwa Refaie  wrote:

> Hi all
>
> When I train BilingualLM with a large corpus, it gives 10 model.nplm files
> with small numbers and then a lot of lines of nan nan nan nan nan nan nan
> nan nan.
> It works perfectly with a smaller corpus. Any suggestions please?
>


Re: [Moses-support] ProbingPT tests not building

2015-04-01 Thread Nikolay Bogoychev
Done.

Thanks for spotting it.

On Wed, Apr 1, 2015 at 4:49 PM, Jeroen Vermeulen <
j...@precisiontranslationtools.com> wrote:

> On 01/04/15 22:29, Nikolay Bogoychev wrote:
>
> > Those tests are indeed obsolete; I used them to test some behavior when
> > I was building ProbingPT, but the function in question became part of the
> > HuffmanDecoder class as getTargetWordFromID.
> > You don't need to build or worry about those tests (there isn't a
> > Jamfile in that directory). I still don't have __proper__ tests for
> > ProbingPT.
>
> Thanks for the quick response!
>
> If they're not being built, maybe it's better to delete both the tests
> in ProbingPT/tests/ then, so that they don't give people a false sense of
> security?
>
> They'll still be in revision control if you want to refer to them for
> writing new tests.  Somebody else who wants to write tests won't know
> that, but in my experience, it's often better for people to start from
> scratch in that situation anyway.
>
>
> Jeroen
>


Re: [Moses-support] ProbingPT tests not building

2015-04-01 Thread Nikolay Bogoychev
Hey Jeroen,

Those tests are indeed obsolete; I used them to test some behavior when I
was building ProbingPT, but the function in question became part of the
HuffmanDecoder class as getTargetWordFromID.
You don't need to build or worry about those tests (there isn't a Jamfile
in that directory). I still don't have __proper__ tests for ProbingPT.

Cheers,

Nick

On Wed, Apr 1, 2015 at 2:39 PM, Jeroen Vermeulen <
j...@precisiontranslationtools.com> wrote:

> Here's another case where I may be breaking things because I'm building
> manually, or something might be genuinely wrong.  For me,
> moses/TranslationModel/ProbingPT/tests/vocabid_test.cpp doesn't build.
>
> The problem is that main() calls getStringFromID(), which is not defined
> anywhere in the codebase that I can see.  Obsolete test?
>
>
> Jeroen


Re: [Moses-support] Unicode Issues when Using Compact Phrase Table, Binaries vs. Own Build

2015-03-30 Thread Nikolay Bogoychev
Hey Венци,

Did you by any chance binarize your phrase tables from the raw text format
rather than from gzip (or another supported compressed text format)? I
recently ran into similar issues with my phrase table (ProbingPT) when the
input phrase table had not been compressed during binary creation. I wasn't
able to trace the issue; I just make sure I gzip any phrase table before
binarizing.
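A hedged sketch of that workaround for the compact phrase table (tool and flags as shipped with moses; paths and score count illustrative):

gzip -c model/phrase-table > model/phrase-table.gz
processPhraseTableMin -in model/phrase-table.gz -out model/phrase-table -nscores 4 -threads all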

Cheers,

Nick

On Mon, Mar 30, 2015 at 10:11 AM, Marcin Junczys-Dowmunt  wrote:

>  Forgot to add that we use the compact phrase table and Moses on older
> and newer Ubuntu version with Arabic, Chinese, Korean, Japanese, Russian in
> both directions and no problems. Those puny German umlauts should not be a
> challenge. :)
>
> On 30.03.2015 at 11:08, Marcin Junczys-Dowmunt wrote:
>
> Hi,
> the phrase-table and as far as I know Moses in general are
> unicode-agnostic, as long as you use utf-8. Input is handled as raw byte
> sequences, most of the time there are numeric identifiers only.
> Sounds more like a couple of messed up systems on your side, especially
> the part where self-compiled systems work or don't work. Cannot give you
> much more insight, unfortunately.
> Best,
> Marcin
>
> On 30.03.2015 at 10:53, "Венцислав Жечев (Ventsislav Zhechev)" wrote:
>
> Hi all,
>
>  I’m having this really weird Unicode issue when using compact phrase
> tables that could be related to endianness somehow, but I’ve no idea how.
> I compiled the training tools from v3 on my Mac and built a few models
> using compact phrase (and reordering) tables and KenLM, including (for
> simplicity) a recasing model for DE (download it from
> https://autodesk.box.com/DE-Recaser). Things become strange when I try to
> use the models, though:
> 1. All works fine when I use the decoder binary I compiled myself on the
> Mac (10.10.2, self-built Boost 1.57)
>  2. Unicode input is not recognised when I use the binary from
> http://www.statmt.org/moses/RELEASE-3.0/binaries/macosx-yosemite/ i.e.
> words like ‘für’ or ‘ausführlich’ are marked as UNK.
> 3. Unicode input is not recognised when I use a binary I compiled myself
> on Ubuntu 12.04.5 (self-built Boost 1.57)
> 4. All  works fine when I use the binary from
> http://www.statmt.org/moses/RELEASE-3.0/binaries/linux-64bit/
>
>  I tested the above with the queryPhraseTableMin tool (rather than the
> decoder) and got the same results, which is what makes me think this could
> be somehow related to binary incompatibility with the way the phrase table
> is compacted. Haven’t investigated deeper than that, though.
>
>
>  Any clues?
> One would say, just use the Linux binary then on Linux... However, I have
> a number of CentOS/RHEL 5 and 6 boxes, where the pre-compiled binary
> doesn’t work, as the system glibc is too old. So there I need to compile
> Moses myself, but then Unicode isn’t recognised...
>
>
>
>  Cheers,
>
>   Ventzi
>
> –––
> Dr. Ventsislav Zhechev
> Computational Linguist, Certified ScrumMaster®
> Platform Architecture and Technologies
> Localisation Services
>
> MAIN +41 32 723 91 22
> FAX +41 32 723 93 99
>
> http://VentsislavZhechev.eu
>
> Autodesk, Inc.
> Rue de Puits-Godet 6
> 2000 Neuchâtel, Switzerland
> www.autodesk.com
>
>
>
>
>


Re: [Moses-support] Forbidden link to binaries

2015-03-19 Thread Nikolay Bogoychev
Hey Per,

The link seems to be outdated, as it points to RELEASE-1.0. You can find
the current ones here:
http://www.statmt.org/moses/RELEASE-3.0/binaries/

Cheers,

Nick

On Thu, Mar 19, 2015 at 2:22 PM, Per Tunedal 
wrote:

> Hi,
> I just read the page http://www.statmt.org/moses/?n=Moses.Releases and
> tried the link to the binaries:
>
> All the binary executables are made available for download for users who
> do not wish to compile their own version.
>
> Clicking on download gets me to the page
> http://www.statmt.org/moses/RELEASE-1.0/binaries/
> showing the message:
>
> Forbidden
>
> You don't have permission to access /moses/RELEASE-1.0/binaries/ on this
> server.
>
> Yours,
> Per Tunedal


Re: [Moses-support] where is "premultiply" member of class "neuralLM" ?

2015-02-08 Thread Nikolay Bogoychev
Hey,
I replied to your previous query but got a mail delivery failure
notification, so I am trying again:
In order to use NPLM with moses you should use this fork of NPLM:
https://github.com/rsennrich/nplm
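Roughly, assuming Boost and Eigen paths are set up as the nplm build files expect (paths illustrative):

git clone https://github.com/rsennrich/nplm
cd nplm/src && make        # this fork's neuralLM provides the premultiply() member
cd /path/to/mosesdecoder
./bjam --with-nplm=/path/to/nplm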

Cheers,

Nick

On Sun, Feb 8, 2015 at 9:38 AM, Jianri Li  wrote:

>  Hi all,
>
> I am resending the mail in case my description was not clear.
>
> When I compile MOSES with nplm, I get an error message like the following:
>
> moses/LM/NeuralLMWrapper.cpp:37:22: error: ‘class nplm::neuralLM’ has no
> member named ‘premultiply’
>
> then I looked at the file "moses/LM/NeuralLMWrapper.cpp" and found code
> like this:
> ---
> #include "NeuralLMWrapper.h"
> #include "neuralLM.h"
>
> ... ...
>
>   m_neuralLM_shared = new nplm::neuralLM();
>   m_neuralLM_shared->read(m_filePath);
>   m_neuralLM_shared->premultiply();
>
> ... ...
>
> --
> Obviously it calls the member "premultiply" of class "neuralLM", which is
> used for pre-computation if there is only one hidden layer. However, when I
> go back to the nplm folder, I find that none of the following header files
> or cpp files contain any member named "premultiply":
>
> "neuralClasses.h",
> "neuralLM.h",
> "neuralNetwork.h",
>
> Of course, nplm and moses are both the latest version.
> I am really confused about this.
> I know moses has supported nplm for several months already, but I cannot
> find any similar problem in the moses mailing-list history or through
> Googling.
> Did I miss something, or should I write the "premultiply" member myself?
> I guess it is not a serious problem and I just didn't get it.
> Thank you for your attention.
>
> Helson
>


Re: [Moses-support] Error while compile with nplm

2015-02-07 Thread Nikolay Bogoychev
Hey,

In order to use NPLM with moses you should use this fork of NPLM:
https://github.com/rsennrich/nplm

Cheers,

Nick

On Sat, Feb 7, 2015 at 6:22 PM, Jianri Li  wrote:

>  Hi, moses users
>   I was trying to compile with nplm, but I got some errors like this:
> --
> moses/LM/NeuralLMWrapper.cpp:37:22: error: ‘class nplm::neuralLM’ has no
> member named ‘premultiply’
> ---
>   I checked the source code in nplm; indeed, there is no premultiply ...
>   I downloaded the code from http://nlg.isi.edu/software/nplm/ which is
> listed on the moses homepage (NPLM).
>  And my compile option is :  ./bjam --with-boost=/path-to/boost_1_55_0
> --with-cmph=/path-to/cmph-2.0 --with-nplm=/path-to/nplm -j24
>  If you have any idea, please help me.
>  Thank you.
>
> Helson
>
>


Re: [Moses-support] nplm Building LM

2015-01-12 Thread Nikolay Bogoychev
Hey,

Refer to the moses documentation on how to use NPLM LM during decoding:
http://www.statmt.org/moses/?n=FactoredTraining.BuildingLanguageModel#ntoc31
In particular, you need to add a line of this shape:

NeuralLM factor= order= path=filename

to your moses.ini, where filename is model.NUMBER.

The 10 files model.1, model.2, etc. are the neural network LM output after
each iteration/generation of training: model.1 is the first generation and
model.10 is the 10th.
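A filled-in example (the values are illustrative assumptions, not prescribed: factor 0 for unfactored input, order 5, and the last-generation model):

NeuralLM factor=0 order=5 path=/path/to/work_dir/model.10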

Cheers,

Nick


On Mon, Jan 12, 2015 at 12:05 AM, Marwa Refaie 
wrote:

>
>
>
>
>  Hi
>
> Please, I need a step-by-step tutorial for nplm.
> I compiled the package & ran trainNeuralNetwork; I got the
> validation.ngrams & train.ngrams files, and then 10 files: model.1, model.2
> ... model.10.
>
> I ran the ./bjam --with nplm 
>
> What do I do next now?
>
> Any help please?
>
> Marwa N. Refaie
>
>


Re: [Moses-support] how to compile with nplm library

2014-12-29 Thread Nikolay Bogoychev
Hey,

First you need to check out and compile this fork of nplm:
https://github.com/rsennrich/nplm

Then you need to compile moses with nplm switch:
./bjam --with-nplm=path/to/nplm

Then you can see how to use it here
http://www.statmt.org/moses/?n=FactoredTraining.BuildingLanguageModel#ntoc31
On 30 Dec 2014 06:28, "Xiaoqiang Feng"  wrote:

> Hi,
>
> nplm is a neural probabilistic language model toolkit. It can be used in
> Moses for a language model and for the bilingual LM (neural network joint
> model, ACL 2014). These two parts have been updated in the github
> mosesdecoder.
>
> If you want to use nplm in Moses, you have to compile Moses by linking
> libnplm.a (generated by nplm).
> Here is the problem: how do I compile Moses with libnplm.a? Do I need to
> modify the Jamroot file, and how?
>
> Thanks,
> Xiaoqiang Feng
>


Re: [Moses-support] Delvin et al 2014

2014-11-26 Thread Nikolay Bogoychev
Hey,

I can only answer 5-7
5. The alignment file is the one that's usually
called aligned.1.grow-diag-final-and and contains lines such as:

0-0 1-1 2-2 3-3
0-0 1-1 2-2 3-3 4-4

> 6. Yes. Basically, a prune-vocab value of 16000 would take the 16000 most
> common words in the corpus and discard the rest (replacing them with UNK).
7. Yes

Cheers,

Nick

On Wed, Nov 26, 2014 at 3:44 PM, Tom Hoar <
tah...@precisiontranslationtools.com> wrote:

>  Thanks again. It's very useful feedback. We're now preparing to move from
> v1.0 to 3.x. We skipped Moses 2.x. So, I'm not familiar with the new
> moses.ini syntax.
>
> Here are some more questions to help us get started playing with the
> extract_training.py options:
>
>1. I'm assuming corpus.e and corpus.f are the same prepared corpus
>files as used in train-model.perl?
>2. Is it possible for corpus.e and corpus.f to be different from the
>train-model.perl corpus, for example a smaller random sampling?
> 3. The corpus files are tokenized and lower-cased and escaped the
>same.
>4. Do the corpus files also need to enforce clean-corpus-n.perl max
>tokens (100) and ratio (9:1) for src & tgt? These address (M)GIZA++ limits
>and might not apply to BilingualLM. However, are there advantages to using
>the limits or disadvantages to overriding them? I.e. can these corpus files
>include lines that are filtered with clean-corpus-n.perl?
> 5. What is the --align value? Is it the output of train-model.perl
>step 3 or an file with word alignments for each line of the corpus.e and
>corpus.f pair?
>6. Re --prune-source-vocab & --prune-target-vocab, do these thresholds
>set the size of the vocabulary you reference in #4 below (i.e. 16K, 500K,
>etc)?
>7. Re --source-context & --target-context, are these the BilingualLM
>equivalents to a typical LM's order or ngrams for each?
>8. Re --tagged-corpus, is this for POS factored corpora?
>
> Thanks.
>
>
>
> On 11/26/2014 09:27 PM, Nikolay Bogoychev wrote:
>
> Hey, Tom
>
>  1) It's independent. You just add -with-oxlm and -with-nplm to the stack
> 2) Yes, they are both thread safe, you can run the decoder with however
> many threads you wish.
> 3) It doesn't create a separate binary. The compilation flag adds a new
> feature inside moses that is called BilingualNPLM and you have to add it to
> your moses.ini with a weight.
> 4) That depends on the vocabulary size used. With 16k source / 16k target,
> about 100 megabytes; with 500k, about 1.5 gigabytes.
>
>  Beware that the memory requirements during decoding are much larger,
> because of premultiplication. If you have memory issues supply
> "premultiply=false" to the BilingualNPLM line in moses.ini, but this is
> likely going to slow down decoding by a lot.
>
>
>  Cheers,
>
>  Nick
>
> On Wed, Nov 26, 2014 at 2:09 PM, Tom Hoar <
> tah...@precisiontranslationtools.com> wrote:
>
>>  Thanks Nikolay! This is a great start. I have a few clarification
>> questions.
>>
>> 1) does this replace or run independently of traditional language models
>> like KenLM? I.e. when compiling, we can use -with-kenlm, -with-irstlm,
>> -with-randlm and -with-srilm together. Are -with-oxlm and -with-nplm added
>> to the stack or are they exclusive?
>>
>> 2) It looks like your branch of nplm is thread-safe. Is oxlm also
>> thread-safe?
>>
>> 3) You say, "To run it in moses as a feature function..." Does that mean
>> compiling with your above option(s) creates a new runtime binary "
>> BilingualNPLM" that replaces the moses binary, much like moseschart and
>> mosesserver? Or, does BilingualNPLM run in a separate process that the
>> Moses binary accesses during runtime?
>>
>> 4) How large do these LM files become? Are they comparable to traditional
>> ARPA files, larger or smaller? Also, are they binarized with mmap reads or
>> do they have to load into RAM?
>>
>> Thanks,
>> Tom
>>
>>
>>
>>
>>
>> On 11/26/2014 08:04 PM, Nikolay Bogoychev wrote:
>>
>>  Fix formatting...
>>
>>  Hey,
>>
>>  BilingualLM is implemented and as of last week resides within moses
>> master:
>> https://github.com/moses-smt/mosesdecoder/blob/master/moses/LM/BilingualLM.cpp
>>
>>  To compile it you need a NeuralNetwork backend for it. Currently there
>> are two supported: Oxlm and Nplm. Adding a new backend is relatively easy,
>> you need to implement the interface as shown here:
>>
>> https://github.com/moses-smt/mosesdecoder/blob/master/moses/LM/bilingual-lm/

Re: [Moses-support] Delvin et al 2014

2014-11-26 Thread Nikolay Bogoychev
Hey, Tom

1) It's independent. You just add -with-oxlm and -with-nplm to the stack
2) Yes, they are both thread safe, you can run the decoder with however
many threads you wish.
3) It doesn't create a separate binary. The compilation flag adds a new
feature inside moses that is called BilingualNPLM and you have to add it to
your moses.ini with a weight.
4) That depends on the vocabulary size used. With 16k source / 16k target,
about 100 megabytes; with 500k, about 1.5 gigabytes.

Beware that the memory requirements during decoding are much larger,
because of premultiplication. If you have memory issues supply
"premultiply=false" to the BilingualNPLM line in moses.ini, but this is
likely going to slow down decoding by a lot.
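I.e., appended to a feature line such as the one shown later in this thread (paths illustrative):

BilingualNPLM filepath=/path/to/train.10k.model.nplm.10 target_ngrams=4 source_ngrams=9 source_vocab=/path/to/vocab.source target_vocab=/path/to/vocab.target premultiply=false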


Cheers,

Nick

On Wed, Nov 26, 2014 at 2:09 PM, Tom Hoar <
tah...@precisiontranslationtools.com> wrote:

>  Thanks Nikolay! This is a great start. I have a few clarification
> questions.
>
> 1) does this replace or run independently of traditional language models
> like KenLM? I.e. when compiling, we can use -with-kenlm, -with-irstlm,
> -with-randlm and -with-srilm together. Are -with-oxlm and -with-nplm added
> to the stack or are they exclusive?
>
> 2) It looks like your branch of nplm is thread-safe. Is oxlm also
> thread-safe?
>
> 3) You say, "To run it in moses as a feature function..." Does that mean
> compiling with your above option(s) creates a new runtime binary "
> BilingualNPLM" that replaces the moses binary, much like moseschart and
> mosesserver? Or, does BilingualNPLM run in a separate process that the
> Moses binary accesses during runtime?
>
> 4) How large do these LM files become? Are they comparable to traditional
> ARPA files, larger or smaller? Also, are they binarized with mmap reads or
> do they have to load into RAM?
>
> Thanks,
> Tom
>
>
>
>
>
> On 11/26/2014 08:04 PM, Nikolay Bogoychev wrote:
>
>  Fix formatting...
>
>  Hey,
>
>  BilingualLM is implemented and as of last week resides within moses
> master:
> https://github.com/moses-smt/mosesdecoder/blob/master/moses/LM/BilingualLM.cpp
>
>  To compile it you need a NeuralNetwork backend for it. Currently there
> are two supported: Oxlm and Nplm. Adding a new backend is relatively easy,
> you need to implement the interface as shown here:
>
> https://github.com/moses-smt/mosesdecoder/blob/master/moses/LM/bilingual-lm/BiLM_NPLM.h
>
>  To compile with oxlm backend you need to compile moses with the switch
> -with-oxlm=/path/to/oxlm
> To compile with nplm backend you need to compile moses with the switch
> -with-nplm=/path/to/nplm (You need this fork of nplm
> https://github.com/rsennrich/nplm
>
>  Unfortunately documentation is not yet available, so here's a short
> summary of how to train and use a model with the nplm backend:
> Use the extract training script to prepare aligned bilingual corpus:
> https://github.com/moses-smt/mosesdecoder/blob/master/scripts/training/bilingual-lm/extract_training.py
>
>  You need the following options:
>
>  "-e", "--target-language", type="string", dest="target_language")
> //Mandatory, for example es "-f", "--source-language", type="string",
> dest="source_language") //Mandatory, for example en "-c", "--corpus",
> type="string", dest="corpus_stem") // path/to/corpus In the directory you
> have specified there should be files corpus.sourcelang and
> corpus.targetlang "-t", "--tagged-corpus", type="string",
> dest="tagged_stem") //Optional for backoff to pos tag "-a", "--align",
> type="string", dest="align_file") //Mandatory alignemtn file "-w",
> "--working-dir", type="string", dest="working_dir") //Output directory of
> the model "-n", "--target-context", type="int", dest="n") / "-m",
> "--source-context", type="int", dest="m") //The actual context size is 2*m
> + 1, this is the number of words on both left and right "-s",
> "--prune-source-vocab", type="int", dest="sprune") //cutoff vocabulary
> threshold "-p", "--prune-target-vocab", type="int", dest="tprune") //cutoff
> vocabulary threshold
>
>  Then, use the training script to train the model:
> https://github.com/moses-smt/mosesdecoder/blob/master/scripts/training/bilingual-lm/train_nplm.py
>
> Example execution is:
>
>  train_nplm.py -w de-en-500250source/ -r de-en150nopos-source750 -n 16 -d
> 0 --nplm-home=/home/abmayne/code/deepathon/npl

Re: [Moses-support] Delvin et al 2014

2014-11-26 Thread Nikolay Bogoychev
Fix formatting...

Hey,

BilingualLM is implemented and as of last week resides within moses master:
https://github.com/moses-smt/mosesdecoder/blob/master/moses/LM/BilingualLM.cpp

To compile it you need a NeuralNetwork backend for it. Currently there are
two supported: Oxlm and Nplm. Adding a new backend is relatively easy, you
need to implement the interface as shown here:
https://github.com/moses-smt/mosesdecoder/blob/master/moses/LM/bilingual-lm/BiLM_NPLM.h

To compile with oxlm backend you need to compile moses with the switch
-with-oxlm=/path/to/oxlm
To compile with nplm backend you need to compile moses with the switch
-with-nplm=/path/to/nplm (you need this fork of nplm:
https://github.com/rsennrich/nplm)

Unfortunately documentation is not yet available, so here's a short summary
of how to train and use a model with the nplm backend:
Use the extract training script to prepare aligned bilingual corpus:
https://github.com/moses-smt/mosesdecoder/blob/master/scripts/training/bilingual-lm/extract_training.py

You need the following options:

"-e", "--target-language", type="string", dest="target_language")
//Mandatory, for example es "-f", "--source-language", type="string",
dest="source_language") //Mandatory, for example en "-c", "--corpus",
type="string", dest="corpus_stem") // path/to/corpus In the directory you
have specified there should be files corpus.sourcelang and
corpus.targetlang "-t", "--tagged-corpus", type="string",
dest="tagged_stem") //Optional for backoff to pos tag "-a", "--align",
type="string", dest="align_file") //Mandatory alignemtn file "-w",
"--working-dir", type="string", dest="working_dir") //Output directory of
the model "-n", "--target-context", type="int", dest="n") / "-m",
"--source-context", type="int", dest="m") //The actual context size is 2*m
+ 1, this is the number of words on both left and right "-s",
"--prune-source-vocab", type="int", dest="sprune") //cutoff vocabulary
threshold "-p", "--prune-target-vocab", type="int", dest="tprune") //cutoff
vocabulary threshold

Then, use the training script to train the model:
https://github.com/moses-smt/mosesdecoder/blob/master/scripts/training/bilingual-lm/train_nplm.py

Example execution is:

train_nplm.py -w de-en-500250source/ -r de-en150nopos-source750 -n 16 -d 0
--nplm-home=/home/abmayne/code/deepathon/nplm_one_layer/ -c corpus.1.word
-i 750 -o 750

where -i and -o are the input and output embedding dimensions,
 -n is the total ngram size,
 -d is the number of hidden layers,
 -w and -c are the same as the extract_training options, and
 -r is the output directory of the model.

Consult the python script for a more detailed description of the options.

After you have done that, you should have a trained bilingual neural network
language model in the output directory.

To run it in moses as a feature function you need the following line:

BilingualNPLM filepath=/mnt/gna0/nbogoych/new_nplm_german/de-en150nopos/train.10k.model.nplm.10 target_ngrams=4 source_ngrams=9 source_vocab=/mnt/gna0/nbogoych/new_nplm_german/de-enIWSLTnopos/vocab.source target_vocab=/mnt/gna0/nbogoych/new_nplm_german/de-enIWSLTnopos/vocab.targe

The source and target vocab files are located in the working directory used
to prepare the neural network language model.
target_ngrams doesn't include the predicted word (so target_ngrams = 4 means
1 predicted word plus 4 target context words).
The total order of the model would be target_ngrams + source_ngrams + 1
(here 4 + 9 + 1 = 14).

I will write proper documentation in the following weeks. If you have any
problems running it, please contact me.

Cheers,

Nick


On Wed, Nov 26, 2014 at 1:02 PM, Nikolay Bogoychev  wrote:

> Hey,
>
> BilingualLM is implemented and as of last week resides within moses
> master:
> https://github.com/moses-smt/mosesdecoder/blob/master/moses/LM/BilingualLM.cpp
>
> To compile it you need a NeuralNetwork backend for it. Currently there are
> two supported: Oxlm and Nplm. Adding a new backend is relatively easy, you
> need to implement the interface as shown here:
>
> https://github.com/moses-smt/mosesdecoder/blob/master/moses/LM/bilingual-lm/BiLM_NPLM.h
>
> To compile with oxlm backend you need to compile moses with the switch
> -with-oxlm=/path/to/oxlm
> To compile with nplm backend you need to compile moses with the switch
> -with-nplm=/path/to/nplm (You need this fork of nplm
> https://github.com/rsennrich/nplm
>
> Unfortunately documentation is not yet available, so here's a short summary
> of how to train and use a model with the nplm backend:
> Use the 

Re: [Moses-support] Delvin et al 2014

2014-11-26 Thread Nikolay Bogoychev
Hey,

BilingualLM is implemented and as of last week resides within moses master:
https://github.com/moses-smt/mosesdecoder/blob/master/moses/LM/BilingualLM.cpp

To compile it you need a NeuralNetwork backend for it. Currently there are
two supported: Oxlm and Nplm. Adding a new backend is relatively easy, you
need to implement the interface as shown here:
https://github.com/moses-smt/mosesdecoder/blob/master/moses/LM/bilingual-lm/BiLM_NPLM.h

To compile with oxlm backend you need to compile moses with the switch
-with-oxlm=/path/to/oxlm
To compile with nplm backend you need to compile moses with the switch
-with-nplm=/path/to/nplm (you need this fork of nplm:
https://github.com/rsennrich/nplm)

Unfortunately documentation is not yet available, so here's a short summary
of how to train and use a model with the nplm backend:
Use the extract training script to prepare aligned bilingual corpus:
https://github.com/moses-smt/mosesdecoder/blob/master/scripts/training/bilingual-lm/extract_training.py

You need the following options:

"-e", "--target-language", type="string", dest="target_language")
//Mandatory, for example es "-f", "--source-language", type="string",
dest="source_language") //Mandatory, for example en "-c", "--corpus",
type="string", dest="corpus_stem") // path/to/corpus In the directory you
have specified there should be files corpus.sourcelang and
corpus.targetlang "-t", "--tagged-corpus", type="string",
dest="tagged_stem") //Optional for backoff to pos tag "-a", "--align",
type="string", dest="align_file") //Mandatory alignemtn file "-w",
"--working-dir", type="string", dest="working_dir") //Output directory of
the model "-n", "--target-context", type="int", dest="n") / "-m",
"--source-context", type="int", dest="m") //The actual context size is 2*m
+ 1, this is the number of words on both left and right "-s",
"--prune-source-vocab", type="int", dest="sprune") //cutoff vocabulary
threshold "-p", "--prune-target-vocab", type="int", dest="tprune") //cutoff
vocabulary threshold
Then, use the training script to train the model:
https://github.com/moses-smt/mosesdecoder/blob/master/scripts/training/bilingual-lm/train_nplm.py

Example execution is:

train_nplm.py -w de-en-500250source/ -r de-en150nopos-source750 -n 16 -d 0 --nplm-home=/home/abmayne/code/deepathon/nplm_one_layer/ -c corpus.1.word -i 750 -o 750

where -i and -o are the input and output embedding dimensions,
 -n is the total ngram size,
 -d is the number of hidden layers,
 -w and -c are the same as the extract_training options, and
 -r is the output directory of the model.

Consult the python script for a more detailed description of the options.

After you have done that, you should have a trained bilingual neural network
language model in the output directory.

To run it in moses as a feature function you need the following line:

BilingualNPLM filepath=/mnt/gna0/nbogoych/new_nplm_german/de-en150nopos/train.10k.model.nplm.10 target_ngrams=4 source_ngrams=9 source_vocab=/mnt/gna0/nbogoych/new_nplm_german/de-enIWSLTnopos/vocab.source target_vocab=/mnt/gna0/nbogoych/new_nplm_german/de-enIWSLTnopos/vocab.targe

The source and target vocab files are located in the working directory used
to prepare the neural network language model.
target_ngrams doesn't include the predicted word (so target_ngrams = 4 means
1 predicted word plus 4 target context words).
The total order of the model would be target_ngrams + source_ngrams + 1
(here 4 + 9 + 1 = 14).

I will write proper documentation in the following weeks. If you have any
problems running it, please contact me.

Cheers,

Nick




On Wed, Nov 26, 2014 at 11:53 AM, Tom Hoar <
tah...@precisiontranslationtools.com> wrote:

>  Hieu,
>
> Sorry I missed you in Vancouver. I just reviewed your slide deck from the
> MosesCore TAUS Round Table in Vancouver
> (taus-moses-industry-roundtable-2014-changes-in-moses-hieu-hoang-university-of-edinburgh).
>
>
> In particular, I'm interested in the "Bilingual Language Models" that
> "replicate Delvin et al, 2014". A search on statmt.org/moses doesn't show
> any hits searching for "delvin". So, A) is the code finished? If so B) are
> there any instructions how to enable/use this feature? If not, C) what kind
> of help do you need to test the code for release?
>
> --
>
> Best regards,
> Tom Hoar
> Managing Director
> Precision Translation Tools Co., Ltd.
> Bangkok, Thailand
> Web: www.precisiontranslationtools.com
> Mobile: +66 87 345-1875
> Skype: tahoar
>


Re: [Moses-support] Moses profiling

2014-09-15 Thread Nikolay Bogoychev
If you want to use google-perftools for profiling
(http://code.google.com/p/gperftools/wiki/GooglePerformanceTools),
compile moses with:
./bjam --full-tcmalloc link=shared
On Sat, Sep 13, 2014 at 9:54 PM, Arturo Argueta 
wrote:

> Is there any way to enable profiling in moses? I've heard that a single
> modification to one bjam file can enable profiling in moses.
>
> Thanks
>