
BilingualLM is implemented and as of last week resides within moses master:

To compile it you need a neural network backend. Two are currently
supported: Oxlm and Nplm. Adding a new backend is relatively easy; you
need to implement the interface as shown here:

To compile with the oxlm backend, you need to compile moses with the switch
To compile with the nplm backend, you need to compile moses with the switch
--with-nplm=/path/to/nplm (You need this fork of nplm

Unfortunately, documentation is not yet available, so here is a short summary
of how to train a model and use it with the nplm backend.
Use the extract training script to prepare the aligned bilingual corpus:

You need the following options:

"-e", "--target-language", type="string", dest="target_language") //Mandatory, for example es
"-f", "--source-language", type="string", dest="source_language") //Mandatory, for example en
"-c", "--corpus", type="string", dest="corpus_stem") //path/to/corpus. In the directory you have specified there should be files corpus.sourcelang and corpus.targetlang
"-t", "--tagged-corpus", type="string", dest="tagged_stem") //Optional, for backoff to POS tags
"-a", "--align", type="string", dest="align_file") //Mandatory, alignment file
"-w", "--working-dir", type="string", dest="working_dir") //Output directory of the model
"-n", "--target-context", type="int", dest="n")
"-m", "--source-context", type="int", dest="m") //The actual context size is 2*m + 1; m is the number of words on both left and right
"-s", "--prune-source-vocab", type="int", dest="sprune") //Cutoff vocabulary threshold
"-p", "--prune-target-vocab", type="int", dest="tprune") //Cutoff vocabulary threshold
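
The option list above reads like optparse declarations. Below is a minimal sketch of how such a parser might be set up, assuming Python's optparse module. It is not the actual script, and the example argument values (the corpus path, alignment file name and working directory) are hypothetical placeholders.

```python
from optparse import OptionParser


def build_parser():
    # Sketch only: mirrors the options listed above, not the real script.
    parser = OptionParser()
    parser.add_option("-e", "--target-language", type="string", dest="target_language")
    parser.add_option("-f", "--source-language", type="string", dest="source_language")
    parser.add_option("-c", "--corpus", type="string", dest="corpus_stem")
    parser.add_option("-t", "--tagged-corpus", type="string", dest="tagged_stem")
    parser.add_option("-a", "--align", type="string", dest="align_file")
    parser.add_option("-w", "--working-dir", type="string", dest="working_dir")
    parser.add_option("-n", "--target-context", type="int", dest="n")
    parser.add_option("-m", "--source-context", type="int", dest="m")
    parser.add_option("-s", "--prune-source-vocab", type="int", dest="sprune")
    parser.add_option("-p", "--prune-target-vocab", type="int", dest="tprune")
    return parser


# Hypothetical example arguments (placeholders, not real paths).
example_args = [
    "-e", "es", "-f", "en",
    "-c", "path/to/corpus",
    "-a", "path/to/corpus.align",
    "-w", "working-dir",
    "-n", "4", "-m", "4",
]
options, _ = build_parser().parse_args(example_args)

# The source window is 2*m + 1 words: m on each side plus the centre word.
source_window = 2 * options.m + 1
print(options.target_language, options.source_language, source_window)
```

With m = 4 the source window comes out as 9 words. Options that are not passed (for example -t) are simply left as None.
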
Then, use the training script to train the model:

Example execution is: train_nplm.py -w de-en-500250source/ -r
de-en150nopos-source750 -n 16 -d 0
--nplm-home=/home/abmayne/code/deepathon/nplm_one_layer/ -c corpus.1.word
-i 750 -o 750

where -i and -o are the input and output embedding sizes
-n is the total ngram size
-d is the number of hidden layers
-w and -c are the same as the extract_training options
-r is the output directory of the model
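
If you want to drive the training step from another script, the command line can be assembled programmatically. This is only a sketch: the helper name build_train_command is made up for illustration, the flags are the ones from the example execution above, and /path/to/nplm is a placeholder for your own nplm checkout.

```python
import shlex


def build_train_command(working_dir, model_dir, corpus, nplm_home,
                        ngram_size, hidden_layers, input_emb, output_emb):
    # Hypothetical helper: returns the argument list, does not run anything.
    return [
        "train_nplm.py",
        "-w", working_dir,
        "-r", model_dir,
        "-n", str(ngram_size),
        "-d", str(hidden_layers),
        "--nplm-home=" + nplm_home,
        "-c", corpus,
        "-i", str(input_emb),
        "-o", str(output_emb),
    ]


cmd = build_train_command(
    working_dir="de-en-500250source/",
    model_dir="de-en150nopos-source750",
    corpus="corpus.1.word",
    nplm_home="/path/to/nplm",  # placeholder path
    ngram_size=16,
    hidden_layers=0,
    input_emb=750,
    output_emb=750,
)
print(shlex.join(cmd))
```

The resulting list can be passed to subprocess.run(cmd) once the script is on your PATH.
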

Consult the python script for a more detailed description of the options.

After you have done that, the output directory should contain a trained
bilingual neural network language model.

To run it in moses as a feature function you need the following line:

target_ngrams=4 source_ngrams=9

The source and target vocab is located in the working directory used to
prepare the neural network language model.
target_ngrams doesn't include the predicted word (so target_ngrams = 4
would mean 1 word predicted and 4 target context words).
The total order of the model would be target_ngrams + source_ngrams + 1.
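
As a quick check of the arithmetic, here is a small sketch. The function names are made up for illustration, and it assumes that source_ngrams equals the 2*m + 1 source window from the extract step, which is my reading of the option descriptions rather than something verified against the code.

```python
def source_window(m):
    # m words on each side of the centre word, plus the centre word itself.
    return 2 * m + 1


def total_order(target_ngrams, source_ngrams):
    # Target context words + source window + the one predicted word.
    return target_ngrams + source_ngrams + 1


# Values from the feature function line: target_ngrams=4 source_ngrams=9
order = total_order(4, 9)
print(order)             # 14
print(source_window(4))  # 9, i.e. m = 4 would give source_ngrams = 9
```
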

I will write proper documentation in the following weeks. If you have
any problems running it, please contact me.



On Wed, Nov 26, 2014 at 11:53 AM, Tom Hoar <
tah...@precisiontranslationtools.com> wrote:

>  Hieu,
> Sorry I missed you in Vancouver. I just reviewed your slide deck from the
> MosesCore TAUS Round Table in Vancouver
> (taus-moses-industry-roundtable-2014-changes-in-moses-hieu-hoang-university-of-edinburgh).
> In particular, I'm interested in the "Bilingual Language Models" that
> "replicate Delvin et al, 2014". A search on statmt.org/moses doesn't show
> any hits searching for "delvin". So, A) is the code finished? If so B) are
> there any instructions how to enable/use this feature? If not, C) what kind
> of help do you need to test the code for release?
> --
> Best regards,
> Tom Hoar
> Managing Director
> *Precision Translation Tools Co., Ltd.*
> Bangkok, Thailand
> Web: www.precisiontranslationtools.com
> Mobile: +66 87 345-1875
> Skype: tahoar
> _______________________________________________
> Moses-support mailing list
> Moses-support@mit.edu
> http://mailman.mit.edu/mailman/listinfo/moses-support