Re: [Moses-support] Language Model Inquiry

2021-05-01 Thread Marwa Gaser
Then which numbers do I use for IRSTLM and SRILM?

On Thu, 29 Apr 2021 at 7:10 PM Hieu Hoang  wrote:

>
> On 4/29/2021 5:27 AM, Marwa Gaser wrote:
>
> Hello,
>
> In the baseline training, what do the numbers in the below line represent?
>
>
> 3 for the 3-gram?
>
> yes
>
> How about 0 and 8?
>
> 0 means that the LM is over the surface words. If your output has other
> factors, e.g. Je|PRO suis|VB etudiant|ADJ, you can choose to have the LM on
> factor 1.
>
> 8 means it uses KenLM, as opposed to SRILM or IRSTLM.
>
>
> -lm 0:3:$HOME/lm/news-commentary-v8.fr-en.blm.en:8
>
>
> ___
> Moses-support mailing list
> Moses-support@mit.edu
> http://mailman.mit.edu/mailman/listinfo/moses-support
>
> --
> Hieu Hoang
> http://statmt.org/hieu
>
> --
Sent from Gmail Mobile
___
Moses-support mailing list
Moses-support@mit.edu
http://mailman.mit.edu/mailman/listinfo/moses-support


Re: [Moses-support] Language Model Inquiry

2021-04-29 Thread Hieu Hoang


On 4/29/2021 5:27 AM, Marwa Gaser wrote:

Hello,

In the baseline training, what do the numbers in the below line 
represent?



3 for the 3-gram?

yes

How about 0 and 8?


0 means that the LM is over the surface words. If your output has other 
factors, e.g. Je|PRO suis|VB etudiant|ADJ, you can choose to have the LM 
on factor 1.


8 means it uses KenLM, as opposed to SRILM or IRSTLM.



-lm 0:3:$HOME/lm/news-commentary-v8.fr-en.blm.en:8
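
To make the four fields of the -lm argument explicit, here is a minimal Python sketch that splits such a specification. The names parse_lm_spec, LMSpec and LM_TYPES are made up for illustration and are not part of Moses. Only the code 8 (KenLM) is confirmed in this thread; the codes 0 (SRILM) and 1 (IRSTLM) are an assumption based on my recollection of train-model.perl and should be checked against your Moses version. The sketch also assumes all four fields are present.

```python
from typing import NamedTuple

# 8 = KenLM is stated in this thread. 0 and 1 are ASSUMPTIONS (recollection of
# train-model.perl), so verify them against your own Moses checkout.
LM_TYPES = {0: "SRILM (assumed)", 1: "IRSTLM (assumed)", 8: "KenLM"}


class LMSpec(NamedTuple):
    factor: int   # which output factor the LM scores (0 = surface words)
    order: int    # n-gram order, e.g. 3 for a trigram model
    path: str     # location of the language model file
    lm_type: int  # implementation code, see LM_TYPES


def parse_lm_spec(spec: str) -> LMSpec:
    """Split 'factor:order:path:type' into its four fields."""
    factor, order, rest = spec.split(":", 2)
    # Take the type from the right so a ':' inside the path does not break parsing.
    path, lm_type = rest.rsplit(":", 1)
    return LMSpec(int(factor), int(order), path, int(lm_type))


# Hypothetical absolute path standing in for $HOME/lm/... after shell expansion.
spec = parse_lm_spec("0:3:/home/user/lm/news-commentary-v8.fr-en.blm.en:8")
print(spec.factor, spec.order, LM_TYPES.get(spec.lm_type, "unknown"))
```

Under the assumption above, switching toolkits would only change the last field, while the factor and order keep the same meaning.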


--
Hieu Hoang
http://statmt.org/hieu



Re: [Moses-support] Language model interpolation without SRILM

2016-07-01 Thread Mathias Müller
Thanks Philipp and Kenneth!

So, does this mean that finding the weights and log-linear interpolation of
LMs is actually implemented in KenLM, but there is no ready-made,
higher-level script to use this functionality, as there is for SRILM
(interpolate-lm.perl)?

@Kenneth Since KenLM is already distributed with Moses, why do you
recommend that I compile the code separately again? Does
github.com/kpu/kenlm have different code than what comes with Moses?

Thanks again,
Mathias

On Tue, Jun 28, 2016 at 6:08 PM, Kenneth Heafield 
wrote:

> Log-linear interpolation is in KenLM in the lm/interpolate directory.
> You'll want to get KenLM from github.com/kpu/kenlm and compile with Eigen.
>
> Tuning log-linear weights is super slow, but applying them is reasonably
> fast.  In total the tuning + applying weights time is comparable to SRILM.
>
> https://kheafield.com/professional/edinburgh/interpolate_paper.pdf
>
> Kenneth


Re: [Moses-support] Language model interpolation without SRILM

2016-06-30 Thread Kenneth Heafield
It's new.  There are some rough edges like memory budgeting.  Also, I'd
argue there is less need for a script since there is one integrated
program that takes models, tunes, and generates the combined model
(though you can split it into steps if you'd like).

Another thing to note: you'll need to generate the component models
using lmplz with the "intermediate" binary format.  This is done by
adding "--intermediate $file_name" to the lmplz arguments.  I'd like to
kill the intermediate format and have lmplz generate a trie, but it's a
matter of getting code bandwidth.
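
As a rough illustration of the step described above, the following Python sketch only builds (and does not run) an lmplz command line that includes the --intermediate argument. The helper name lmplz_command and the file names are hypothetical; -o (model order) and --text (input corpus) are lmplz options as far as I know, but please confirm them with "lmplz --help" for your build.

```python
import shlex


def lmplz_command(order, corpus, intermediate_prefix, lmplz="lmplz"):
    """Build an lmplz argument list that also writes the intermediate format."""
    return [
        lmplz,
        "-o", str(order),                       # n-gram order
        "--text", corpus,                       # training text (assumed option name)
        "--intermediate", intermediate_prefix,  # the argument quoted in this message
    ]


# Hypothetical file names, one command per component model.
cmd = lmplz_command(3, "corpus.en", "corpus.en.intermediate")
print(shlex.join(cmd))
```

The list can be handed to subprocess.run once the paths point at real files.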

Kenneth

On 06/30/2016 01:30 PM, Mathias Müller wrote:
> Thanks Philipp and Kenneth!
> 
> So, does this mean that finding the weights and log-linear interpolation
> of LMs is actually implemented in KenLM, but there is no ready-made,
> higher-level script to use this functionality, as there is for SRILM
> (interpolate-lm.perl)?
> 
> @Kenneth Since KenLM is already distributed with Moses, why do you
> recommend that I compile the code separately again? Does
> github.com/kpu/kenlm have different code
> than what comes with Moses?
> 
> Thanks again,
> Mathias


Re: [Moses-support] Language model interpolation without SRILM

2016-06-28 Thread Kenneth Heafield
Oh also, use a small -S argument to the interpolate program because it
doesn't quite budget memory properly yet.

On 06/28/2016 05:08 PM, Kenneth Heafield wrote:
> Log-linear interpolation is in KenLM in the lm/interpolate directory.
> You'll want to get KenLM from github.com/kpu/kenlm and compile with Eigen.
> 
> Tuning log-linear weights is super slow, but applying them is reasonably
> fast.  In total the tuning + applying weights time is comparable to SRILM.
> 
> https://kheafield.com/professional/edinburgh/interpolate_paper.pdf
> 
> Kenneth


Re: [Moses-support] Language model interpolation without SRILM

2016-06-28 Thread Kenneth Heafield
Log-linear interpolation is in KenLM in the lm/interpolate directory.
You'll want to get KenLM from github.com/kpu/kenlm and compile with Eigen.

Tuning log-linear weights is super slow, but applying them is reasonably
fast.  In total the tuning + applying weights time is comparable to SRILM.

https://kheafield.com/professional/edinburgh/interpolate_paper.pdf

Kenneth
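
For readers unfamiliar with the distinction, here is a toy Python sketch contrasting linear interpolation (a weighted sum of probabilities) with log-linear interpolation (a weighted product that has to be renormalized). It works on two made-up unigram distributions and is not KenLM's implementation; a real n-gram model would do this per context, which is part of why the log-linear case is more expensive.

```python
import math


def linear_interpolate(dists, weights):
    """p(w) = sum_i lambda_i * p_i(w); already normalized if the weights sum to 1."""
    vocab = dists[0].keys()
    return {w: sum(lam * d[w] for lam, d in zip(weights, dists)) for w in vocab}


def log_linear_interpolate(dists, weights):
    """p(w) = prod_i p_i(w)**lambda_i / Z, with Z summing over the vocabulary."""
    vocab = dists[0].keys()
    unnorm = {
        w: math.exp(sum(lam * math.log(d[w]) for lam, d in zip(weights, dists)))
        for w in vocab
    }
    z = sum(unnorm.values())  # the normalizer that linear interpolation does not need
    return {w: v / z for w, v in unnorm.items()}


# Toy distributions over a three-word vocabulary (invented numbers).
p1 = {"a": 0.7, "b": 0.2, "c": 0.1}
p2 = {"a": 0.1, "b": 0.3, "c": 0.6}
weights = [0.5, 0.5]

lin = linear_interpolate([p1, p2], weights)
loglin = log_linear_interpolate([p1, p2], weights)
print(lin)
print(loglin)
```
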

On 06/28/2016 03:27 PM, Philipp Koehn wrote:
> Hi,
> 
> unfortunately, the interpolation of language models requires two pieces
> of code that only exist in SRILM: The EM training method to find weights
> for the language models, and the linear interpolation of the language
> models.
> 
> Maybe Ken and Lane can weigh in, if/when a replacement in KENLM will be
> available.
> 
> -phi


Re: [Moses-support] Language model interpolation without SRILM

2016-06-28 Thread Philipp Koehn
Hi,

unfortunately, the interpolation of language models requires two pieces of
code that only exist in SRILM: The EM training method to find weights for
the language models, and the linear interpolation of the language models.

Maybe Ken and Lane can weigh in, if/when a replacement in KENLM will be
available.

-phi
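
The EM step mentioned above can be sketched in a few lines of Python. This is a generic textbook EM for mixture weights under the assumption that each model's probability for every held-out token has already been computed; the function name and the numbers are invented, and this is not the SRILM code.

```python
def em_mixture_weights(probs, iterations=50):
    """Estimate linear interpolation weights by EM.

    probs[t][i] is the probability that model i assigns to held-out token t.
    """
    k = len(probs[0])
    weights = [1.0 / k] * k  # start from uniform weights
    for _ in range(iterations):
        expected = [0.0] * k
        for row in probs:
            mix = sum(w * p for w, p in zip(weights, row))
            # E-step: posterior responsibility of each model for this token.
            for i, (w, p) in enumerate(zip(weights, row)):
                expected[i] += w * p / mix
        # M-step: new weights are the average responsibilities.
        weights = [e / len(probs) for e in expected]
    return weights


# Invented held-out probabilities: model 0 fits every token better than model 1.
held_out = [[0.5, 0.1], [0.4, 0.1], [0.3, 0.2]]
weights = em_mixture_weights(held_out)
print(weights)
```

Each iteration cannot decrease the held-out likelihood, so the weights drift toward the model that explains the tuning data better.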

On Tue, Jun 28, 2016 at 10:10 AM, Mathias Müller 
wrote:

> Hi all
>
> I have trained several language models and would like to combine them with
> interpolate-lm.perl:
>
>
> https://github.com/moses-smt/mosesdecoder/blob/master/scripts/ems/support/interpolate-lm.perl
>
> As the language model tool, I always use KenLM, but looking at the code of
> interpolate-lm.perl, it seems that the use of SRILM is hard-coded in the
> script. I would like to avoid SRILM because, if I understand correctly, its
> license does not permit use in commercial products.
>
> My question is:
>
> Can I simply replace the call to SRILM with KenLM in my copy of
> interpolate-lm.perl? Does KenLM have the functionality necessary for
> language model combination, e.g. a substitute for SRILM's
> "compute-best-mix"?
>
> Thanks for your help.
> Mathias
>
> —
>
> Mathias Müller
> AND-2-20
> Institute of Computational Linguistics
> University of Zurich
> +41 44 635 75 81
> mathias.muel...@uzh.ch


Re: [Moses-support] Language model question

2015-11-26 Thread Dingyuan Wang
Hi,

I tend to fix it in the tokenization script, or I would solve this in some
pre-processing scripts if there are any obvious patterns in the noise.
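
As one possible shape for such a pre-processing script, here is a minimal Python sketch that drops lines containing French elisions (j', l', c', qu', ...) before they reach the tokenizer. The regular expression and the function name are my own heuristic, not part of the Moses scripts, and it will also drop English lines that happen to contain such forms (names like "l'Oreal"), so treat it as a starting point only.

```python
import re

# Heuristic (assumption): an elided French clitic directly followed by a vowel or h.
# Only the plain ASCII apostrophe is handled here.
FRENCH_ELISION = re.compile(
    r"\b(?:[jlmtsdnc]|qu)'[aeiouyhéèêàâîôû]", re.IGNORECASE
)


def drop_french_lines(lines, max_hits=0):
    """Keep only lines with at most max_hits elision matches."""
    kept = []
    for line in lines:
        if len(FRENCH_ELISION.findall(line)) > max_hits:
            continue  # looks like French noise, skip it
        kept.append(line)
    return kept


sample = ["it isn't raining", "j'aime le chocolat", "c'est la vie"]
clean = drop_french_lines(sample)
print(clean)
```

Filtering whole lines before tokenization keeps tokens such as 'aime out of the counts altogether, so no LM pruning is needed for them.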

--
Dingyuan
On 26 Nov 2015 at 21:09, "Vincent Nguyen" wrote:

> Hi all,
>
> I have a question regarding LMs.
>
> Let's take the example of news.2014.shuffle.en
>
> When we process it through punctuation normalization for English,
> it will, for instance, put a " " before an apostrophe:
> "it isn't" => "it isn 't"
>
> BUT the corpus contains some noise. For instance, there are some French
> sentences in it, for which the apostrophe handling is not suited:
> "j'aime" => "j 'aime", which creates the token 'aime
>
> My point is the following:
>
> At the LM building stage, how can we prune to eliminate tokens like
> "'aime" so that they do not create wrong unigrams, bigrams, ...?
>
> The ngram -minprune option only takes 2 as a minimum, so wrong unigrams
> will still end up in the LM.
>
>
> Hope I'm clear enough 
>
> Vincent


Re: [Moses-support] Language model in Moses!

2015-11-02 Thread Hieu Hoang

Make a copy of
   LM/SkeletonLM.*
Look at the code and change it to do whatever you want.

On 02/11/2015 08:57, Vu Thuong Huyen wrote:


Dear Hieu,

I want to integrate a new language model (e.g. a recurrent neural network 
language model) into Moses. Could you tell me how to do that?


I'm new to language modeling and statistical machine translation.

I’m looking forward to hearing from you.

Many thanks,

Huyen.



--
Hieu Hoang
http://www.hoang.co.uk/hieu



Re: [Moses-support] Language model creation error

2015-08-04 Thread Hieu Hoang
When you compile with IRSTLM, you must get the latest version. The 
latest version is 5.80.08, from

   http://sourceforge.net/projects/irstlm/files/

On 01/08/2015 12:17, kalu mera wrote:

Dear Members,
I am trying to create a language model. I entered this command:
kalumera@kalumera-Satellite-C50-A534:~/mosesdecoder$ ./bjam 
--with-boost=~/workspace/temp/boost_1_55_0 -j4


but the build failed.

Please check the attachment for the command I entered and the error, 
and advise me on how to rectify the problem.


Christine




--
Hieu Hoang
Researcher
New York University, Abu Dhabi
http://www.hoang.co.uk/hieu



Re: [Moses-support] Language Model Training failed

2014-03-05 Thread Janez Kadivec
Hi,

I'm a beginner in Linux... so I like to see things happening in the
foreground :)
If there is no need to change the command... don't do that... ;)

Obviously I overlooked the section on copying the
giza++ utilities to the tools directory.

Thank you for your help.
Janez


On Wed, Mar 5, 2014 at 11:01 AM, Barry Haddow wrote:

> Hi Janez
>
>  In my opinion there are two things that need to be somehow described or
>> corrected in the Moses baseline:
>> 1. Notify the user about the location of the Giza++ utilities
>> (mosesdecoder/tools or mosesdecoder/giza++) and need to rename the folders
>> to the one used in command.
>>
>
> The instructions already ask you to copy the GIZA++ utilities to the tools
> directory, you must have missed out that step.
>
>  2. Remove the last "&" char in the command, listed in the baseline.
>>
>
> The "&" is correct, it runs the command in the background, which is why it
> returns instantly. This is quite a normal thing to do in a UNIX environment.
>
> cheers - Barry

Re: [Moses-support] Language Model Training failed

2014-03-05 Thread Barry Haddow
Hi Janez

> In my opinion there are two things that need to be somehow described 
> or corrected in the Moses baseline:
> 1. Notify the user about the location of the Giza++ utilities 
> (mosesdecoder/tools or mosesdecoder/giza++) and need to rename the 
> folders to the one used in command.

The instructions already ask you to copy the GIZA++ utilities to the 
tools directory, you must have missed out that step.

> 2. Remove the last "&" char in the command, listed in the baseline.

The "&" is correct, it runs the command in the background, which is why 
it returns instantly. This is quite a normal thing to do in a UNIX 
environment.
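
To illustrate why a backgrounded command "returns instantly" while still doing its work, here is a small Python sketch that mimics the shell's "&" with subprocess.Popen. The one-second sleep is just a stand-in for a long training run; nothing here is specific to Moses.

```python
import subprocess
import sys

# Popen starts the child and returns at once, much like "command &" in a shell.
child = subprocess.Popen([sys.executable, "-c", "import time; time.sleep(1)"])

# poll() returns None while the child is still running in the background.
still_running = child.poll() is None

# wait() blocks until the child exits, like the shell builtin "wait".
exit_code = child.wait()
print(still_running, exit_code)
```

With the baseline command, progress can be followed in the file the output is redirected to (training.out) while the job keeps running in the background.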

cheers - Barry



Re: [Moses-support] Language Model Training failed

2014-03-05 Thread Janez Kadivec
Hi,

thank you for your help. The added "yes" parameter resolved the situation.
We are following the Moses baseline, published in the official Moses web
site:
http://www.statmt.org/moses/?n=moses.baseline

Please correct the last command in the Language Model Training section. The
command is marked with red color.

mkdir ~/lm
 cd ~/lm
 ~/irstlm/bin/add-start-end.sh \
   < ~/corpus/news-commentary-v8.fr-en.true.en \
   > news-commentary-v8.fr-en.sb.en
 export IRSTLM=$HOME/irstlm; ~/irstlm/bin/build-lm.sh \
   -i news-commentary-v8.fr-en.sb.en  \
   -t ./tmp  -p -s improved-kneser-ney -o news-commentary-v8.fr-en.lm.en
 *~/irstlm/bin/compile-lm --text news-commentary-v8.fr-en.lm.en.gz \
   news-commentary-v8.fr-en.arpa.en*



We followed the same baseline. In the Training the Translation System
section we found the following inconsistency:
We installed Moses, and as part of it Giza++ was also installed, under
...mosesdecoder\giza++1.0.7.
We executed the following commands from the Training the Translation System
section:

mkdir ~/working
 cd ~/working
 nohup nice ~/mosesdecoder/scripts/training/train-model.perl -root-dir train \
 -corpus ~/corpus/news-commentary-v8.fr-en.clean \
 -f fr -e en -alignment grow-diag-final-and -reordering msd-bidirectional-fe \
 -lm 0:3:$HOME/lm/news-commentary-v8.fr-en.blm.en:8  \
 -external-bin-dir ~/mosesdecoder/tools >& training.out &

During the execution of the last command (marked with green color) there was
an error saying that the mkcls utility was not found.
It was not found because in the "initial" installation there is no tools
subdirectory. We renamed the Giza++107 directory to "tools".
The command then returned instantly with no results, so we removed the last
character "&" from the command. It has now been running for about half an
hour. ;)

In my opinion there are two things that need to be somehow described or
corrected in the Moses baseline:
1. Notify the user about the location of the Giza++ utilities
(mosesdecoder/tools or mosesdecoder/giza++) and need to rename the folders
to the one used in command.
2. Remove the last "&" char in the command, listed in the baseline.

Have a nice rest of the day.
Janez


> Seth suggested the right fix.
>
> I just checked the IRSTLM documentation
> http://sourceforge.net/apps/mediawiki/irstlm/index.php?title=Estimating_gigantic_models
> and the correct notation is reported there.
>
> Could you please tell me where you got the "wrong" information,
> so that I can correct it?
>
>
> Nicola
> (on behalf of IRSTLM development team)
>
>
>
> On Mar 5, 2014, at 1:36 AM, Seth Jarrett wrote:
>
> The first four commands were executed successfully. The last one failed.
> Here is the result after entering the following command line:
>
> zzz zzz-laptop:~/lm$ ~/moses/irstlm/bin/compile-lm --text
> news-commentary-v8.fr-en.lm.en.gz news-commentary-v8.fr-en.arpa.en
>
> inpfile: news-commentary-v8.fr-en.arpa.en
> loading up to the LM level 1000 (if any)
> dub: 1000
> Failed to open news-commentary-v8.fr-en.arpa.en!
> zzz zzz-laptop:~/lm$
>
> Where did we make a mistake? I see the xxx.arpa.en listed as the input
> file. Shouldn't the xxx.arpa.en file be an output file?
>
> Best regards!
>
>
> I was having the same problem when following the steps in the baseline
> instructions but I was able to get it to work by adding "yes" after --text.
>
> Try this:
>
> ~/moses/irstlm/bin/compile-lm --text yes news-commentary-v8.fr-en.lm.en.gz
> news-commentary-v8.fr-en.arpa.en
>
>


Re: [Moses-support] Language Model Training failed

2014-03-05 Thread Barry Haddow
Hi Nicola

When I tried with irstlm 5.80.03, the version mentioned on the Moses 
baseline page (http://www.statmt.org/moses/?n=Moses.Baseline), it did 
not like the "yes". Has there been a change in irstlm? I can check again.

There has been some history with this argument. You can see in the wiki 
history of the Moses baseline page that the "yes" was added (because 
some users reported problems) then removed (because other users reported 
problems). Clarification of what works with what version of irstlm would 
be very useful.

cheers - Barry

On 05/03/14 08:51, Nicola Bertoldi wrote:
> Hi  Janez,
>
> Seth suggested the right fix.
>
> I just checked the IRSTLM documentation
> http://sourceforge.net/apps/mediawiki/irstlm/index.php?title=Estimating_gigantic_models
> and the correct notation is reported there.
>
> Could you please tell me where you got the "wrong" information,
> so that I can correct it?
>
>
> Nicola
> (on behalf of IRSTLM development team)
>
>
>
> On Mar 5, 2014, at 1:36 AM, Seth Jarrett wrote:
>
> First four commands were executed successfully. The last one failed. Here
> is the result after entering the following command line:
>
> zzz zzz-laptop:~/lm$ ~/moses/irstlm/bin/compile-lm --text
> news-commentary-v8.fr-en.lm.en.gz news-commentary-v8.fr-en.arpa.en
>
> inpfile: news-commentary-v8.fr-en.arpa.en
> loading up to the LM level 1000 (if any)
> dub: 1000
> Failed to open news-commentary-v8.fr-en.arpa.en!
> zzz zzz-laptop:~/lm$
>
> Where did we make a mistake? I see the xxx.arpa.en listed as the input
> file. Shouldn't the xxx.arpa.en file be an output file? Best regards!
>
>
> I was having the same problem when following the steps in the baseline
> instructions but I was able to get it to work by adding "yes" after --text.
>
> Try this:
>
> ~/moses/irstlm/bin/compile-lm --text yes news-commentary-v8.fr-en.lm.en.gz
> news-commentary-v8.fr-en.arpa.en
>
>
>





Re: [Moses-support] Language Model Training failed

2014-03-05 Thread Seth Jarrett
> I just checked the IRSTLM documentation
> http://sourceforge.net/apps/mediawiki/irstlm/index.php?title=Estimating_gigantic_models
> and the correct notation is reported there.
>
> Could you please tell me where you got the "wrong" information,
> so that I can correct it?
> 
> Nicola
> (on behalf of IRSTLM development team)


Hi Nicola,

The problem is in the instructions on the Moses/Baseline page in the IRSTLM
section:
http://www.statmt.org/moses/?n=Moses.Baseline#irstlm

I found the correct notation using "compile-lm --help" so it looks like only
the baseline page has this problem. But it's pretty important for beginners!

Seth



Re: [Moses-support] Language Model Training failed

2014-03-05 Thread Nicola Bertoldi
Hi  Janez,

Seth suggested the right fix.

I just checked the IRSTLM documentation
http://sourceforge.net/apps/mediawiki/irstlm/index.php?title=Estimating_gigantic_models
and the correct notation is reported there.

Could you please tell me where you got the "wrong" information,
so that I can correct it?


Nicola
(on behalf of IRSTLM development team)



On Mar 5, 2014, at 1:36 AM, Seth Jarrett wrote:

First four commands were executed successfully. The last one failed. Here
is the result after entering the following command line:

zzz zzz-laptop:~/lm$ ~/moses/irstlm/bin/compile-lm --text
news-commentary-v8.fr-en.lm.en.gz news-commentary-v8.fr-en.arpa.en

inpfile: news-commentary-v8.fr-en.arpa.en
loading up to the LM level 1000 (if any)
dub: 1000
Failed to open news-commentary-v8.fr-en.arpa.en!
zzz zzz-laptop:~/lm$

Where did we make a mistake? I see the xxx.arpa.en listed as the input
file. Shouldn't the xxx.arpa.en file be an output file? Best regards!


I was having the same problem when following the steps in the baseline
instructions but I was able to get it to work by adding "yes" after --text.

Try this:

~/moses/irstlm/bin/compile-lm --text yes news-commentary-v8.fr-en.lm.en.gz
news-commentary-v8.fr-en.arpa.en




Re: [Moses-support] Language Model Training failed

2014-03-04 Thread Seth Jarrett
> First four commands were executed successfully. The last one failed. Here
> is the result after entering the following command line:
>
> zzz zzz-laptop:~/lm$ ~/moses/irstlm/bin/compile-lm --text
> news-commentary-v8.fr-en.lm.en.gz news-commentary-v8.fr-en.arpa.en
>
> inpfile: news-commentary-v8.fr-en.arpa.en
> loading up to the LM level 1000 (if any)
> dub: 1000
> Failed to open news-commentary-v8.fr-en.arpa.en!
> zzz zzz-laptop:~/lm$
>
> Where did we make a mistake? I see the xxx.arpa.en listed as the input
> file. Shouldn't the xxx.arpa.en file be an output file? Best regards!
> 

I was having the same problem when following the steps in the baseline
instructions but I was able to get it to work by adding "yes" after --text.

Try this:

~/moses/irstlm/bin/compile-lm --text yes news-commentary-v8.fr-en.lm.en.gz
news-commentary-v8.fr-en.arpa.en
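
For anyone wondering why the extra "yes" changes anything, here is a rough
sketch in Python. It is a toy parser, not IRSTLM's actual code, and it assumes
that in the affected versions --text expects a value (the parse function and
the shortened x.* file names are made up for illustration). Under that
assumption the first file name is swallowed as the value of --text, so the
.arpa.en name slides into the input slot, which matches the "inpfile:" line in
the output above.

```python
def parse(argv):
    """Toy parser (hypothetical): '--text' consumes the next token as its value."""
    options, positionals = {}, []
    i = 0
    while i < len(argv):
        token = argv[i]
        if token == "--text":
            if i + 1 >= len(argv):
                raise ValueError("--text needs a value")
            options["text"] = argv[i + 1]  # the next token becomes the value
            i += 2
        else:
            positionals.append(token)      # first positional = input file
            i += 1
    return options, positionals


# Without "yes": the .lm.en.gz name is taken as the value of --text.
print(parse(["--text", "x.lm.en.gz", "x.arpa.en"]))

# With "yes": both file names stay positional, input first and output second.
print(parse(["--text", "yes", "x.lm.en.gz", "x.arpa.en"]))
```

Whether the bare --text form is accepted seems to depend on the IRSTLM
version, so "compile-lm --help" for the installed build is the safer guide.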




Re: [Moses-support] language model training

2013-06-30 Thread Hieu Hoang
Maybe you should run the Moses wrapper script
  scripts/generic/trainlm-irst2.perl
which runs the IRSTLM scripts for you.

On 29 June 2013 14:32, Mehndi Bhargava  wrote:

> when i run the following command:
> ~/irstlm/bin/add-start-end.sh < ~/corpus/news-commentary-v7.fr-en.true.en
> > news-commentary-v7.fr-en.sb.en export IRSTLM=$HOME/irstlm;
> ~/irstlm/bin/build-lm.sh -i news-commentary-v7.fr-en.sb.en -t ./tmp -p \ -s
> improved-kneser-ney -o news-commentary-v7.fr-en.lm.en
> ~/irstlm/bin/compile-lm --text yes news-commentary-v7.fr-en.lm.en.gz
> news-commentary-v7.fr-en.arpa.en
> it displays
>
> inpfile: news-commentary-v7.fr-en.lm.en.gz
> loading up to the LM level 1000 (if any)
> dub: 1000
> Failed to open news-commentary-v7.fr-en.lm.en.gz!
>
> what do you think is the problem?
>
>
>


-- 
Hieu Hoang
Research Associate
University of Edinburgh
http://www.hoang.co.uk/hieu


Re: [Moses-support] language model training- reg

2013-03-12 Thread Kenneth Heafield
It looks like you have too little data to build a language model.  If 
you continue to have the problem after using more data, please post the 
command you ran and the output.  There are at least four different ways 
to build a language model described in 
http://www.statmt.org/moses/?n=FactoredTraining.BuildingLanguageModel so 
it's hard to debug without knowing which one you used.

Kenneth

On 03/12/13 13:35, Nikhila Achukatla wrote:
> Hi,
>    I am attaching a file. I'm trying to work with that data.
> Up to the truecasing step, everything worked correctly.
> But in the language model training step, no language model file is created.
> When I tried with the data provided on the Moses website, it
> worked correctly.
> Can you please tell me why the LM file is not created with the data I used?
>
>
>


Re: [Moses-support] Language model

2008-11-20 Thread Marcello Federico
see references in
http://www.speech.sri.com/projects/srilm/
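
To give a rough idea of questions 2 and 3 quoted below, here is a minimal
Python sketch of the usual backoff lookup in an ARPA-style model. The toy
model, its numbers and the log10_prob function are invented for illustration,
and this is the textbook scheme rather than SRILM's actual code. An n-gram
missing from the model does not get probability 0: the lookup falls back to a
shorter history and adds the (log10) backoff weight of the history it dropped.
As for question 1, the stored probabilities themselves are estimated by the
toolkit from smoothed counts, which the references cover.

```python
# Toy model: n-gram tuple -> (log10 probability, log10 backoff weight).
# All numbers are made up for illustration.
MODEL = {
    ("the",): (-1.0, -0.5),
    ("cat",): (-2.0, -0.3),
    ("dog",): (-2.5, 0.0),
    ("the", "cat"): (-0.4, 0.0),
}


def log10_prob(model, history, word):
    """Return log10 P(word | history), backing off to shorter histories."""
    ngram = history + (word,)
    if ngram in model:
        return model[ngram][0]          # the n-gram is listed: use it directly
    if not history:
        return float("-inf")            # unknown word and no <unk> entry
    # A history that is not listed has backoff weight 1, i.e. 0 in log10.
    backoff = model.get(history, (None, 0.0))[1]
    return backoff + log10_prob(model, history[1:], word)


print(log10_prob(MODEL, ("the",), "cat"))   # bigram is listed
print(log10_prob(MODEL, ("the",), "dog"))   # backoff("the") + log10 P("dog")
```

Since everything is stored as base-10 logs, adding the backoff weight here
corresponds to multiplying the lower-order probability by it.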

M.

__
From: [EMAIL PROTECTED] [EMAIL PROTECTED] On Behalf Of Michael Zuckerman [EMAIL PROTECTED]
Sent: Thursday, November 20, 2008 1:38 PM
To: John Burger; [EMAIL PROTECTED]
Cc: moses-support@mit.edu
Subject: Re: [Moses-support] Language model

Thank you very much for your answer.
With regard to this I have a few more questions:
1) How is the conditional probability of an n-gram calculated?
2) If some n-gram is not present in the language model, does it mean that its 
conditional probability is 0?
3) What are backoff weights?

Thanks,
Michael.

On Tue, Nov 18, 2008 at 6:08 PM, John Burger <[EMAIL PROTECTED]> wrote:
Michael Zuckerman wrote:

Could you please explain the format of the .lm file generated by the script 
ngram-count?

http://www.speech.sri.com/projects/srilm/manpages/ngram-format.5.html

- John D. Burger
 MITRE





Re: [Moses-support] Language model

2008-11-20 Thread Michael Zuckerman
Thank you very much for your answer.
With regard to this I have a few more questions:
1) How is the conditional probability of an n-gram calculated?
2) If some n-gram is not present in the language model, does it mean that
its conditional probability is 0?
3) What are backoff weights?

Thanks,
Michael.

On Tue, Nov 18, 2008 at 6:08 PM, John Burger <[EMAIL PROTECTED]> wrote:

> Michael Zuckerman wrote:
>
>> Could you please explain the format of the .lm file generated by the
>> script ngram-count?
>>
>
> http://www.speech.sri.com/projects/srilm/manpages/ngram-format.5.html
>
> - John D. Burger
>  MITRE
>
>


Re: [Moses-support] Language model

2008-11-18 Thread John Burger
Michael Zuckerman wrote:

> Could you please explain the format of the .lm file generated by
> the script ngram-count?

http://www.speech.sri.com/projects/srilm/manpages/ngram-format.5.html

- John D. Burger
   MITRE



Re: [Moses-support] Language model

2008-11-18 Thread Alexandre Allauzen
Hi,
Here is a description of the ARPA format used for language models:

http://www.speech.sri.com/projects/srilm/manpages/ngram-format.5.html
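
In short, as that page explains, each line in an n-gram section holds a
base-10 log probability, then the n-gram, then optionally a base-10 log
backoff weight. The weight may be left out for n-grams that are not a prefix
of any longer n-gram in the model, such as those of the highest order. A small
Python sketch that splits such a line (parse_arpa_line is a made-up helper,
not part of SRILM, and the "the cat" line is an invented example):

```python
def parse_arpa_line(line, order):
    """Split one n-gram line of an ARPA file, given the n-gram order.

    Returns (log10 probability, tuple of words, log10 backoff or None).
    """
    fields = line.split()
    logprob = float(fields[0])
    words = tuple(fields[1:1 + order])
    # Anything left after the n-gram is the optional backoff weight.
    backoff = float(fields[1 + order]) if len(fields) > 1 + order else None
    return logprob, words, backoff


# First unigram line from the sample quoted below.
logprob, words, backoff = parse_arpa_line("-2.815075 ! -1.648233", 1)
print(words, 10 ** logprob, 10 ** backoff)  # back in linear space

# An invented bigram line without a backoff weight.
print(parse_arpa_line("-0.5 the cat", 2))
```

Passing the order explicitly avoids mistaking a numeric last word for a
backoff weight.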

Michael Zuckerman wrote:
> Hi,
>
> Could you please explain the format of the .lm file generated by the 
> script ngram-count? For example, I got a .lm file that starts with:
>
> \data\
> ngram 1=76288
> ngram 2=1644644
> ngram 3=1410926
> ngram 4=1393383
> ngram 5=1071864
>
> \1-grams:
> -2.815075   !   -1.648233
> -3.10526    "   -0.4596801
> -6.09184    #   -0.1521228
> -4.628769   $   -0.2417951
> -3.474399   %   -0.7403963
> -4.398747   &   -0.7879647
> -2.462822   '   -0.6111439
>
> If I understand correctly, "ngram 1=76288" means that there are 76288 
> n-grams containing one token (word), and so on.
> But what do the negative numbers before and after the tokens mean? 
> Also, I noticed that sometimes the numbers after the tokens are 
> missing. What does that mean?
>
> Thank you very much,
>  Michael.
> 
>
>   


-- 
 Alexandre Allauzen
 Univ. Paris XI, LIMSI-CNRS
Tel : 01.69.85.80.64 (80.88)
Bur : 114 LIMSI Bat. 508
 [EMAIL PROTECTED]



Re: [Moses-support] Language Model in Moses under VS 05

2007-12-10 Thread Hieu Hoang
You are again correct: binary LMs are not supported under Windows. They are
usually handled by IRSTLM, which doesn't compile with VS05. You can extend
IRSTLM to support Windows, or write your own LM.

All parts of the decoder are cross-platform, including the binary phrase
table.

However, we don't have control over external libraries like IRSTLM &
SRILM. The training toolkit is Unix-based because it is mostly
legacy/external code.
 
Hieu Hoang
www.hoang.co.uk/hieu

-Original Message-
From: Jie Wu [mailto:[EMAIL PROTECTED] 
Sent: 10 December 2007 16:33
To: Hieu Hoang
Cc: moses-support@mit.edu
Subject: Re: [Moses-support] Language Model in Moses under VS 05


Hi, Hieu,

Thanks for the reply. But if the Windows version of Moses only supports
the internal language model, that implies that loading language models
in binary format is not supported. I am a bit confused: which toolkit
handles loading binary language models: SRILM, IRSTLM, or Moses? If I
would like a Windows version of Moses that can load binary language
models handling more than 3-grams, what would you suggest I modify?

How about phrase tables? Do they have the same problems in terms of
binary loading, Windows vs. Unix restrictions, etc.?

Thanks in advance

Jie


On Dec 6, 2007 6:19 AM, Hieu Hoang < [EMAIL PROTECTED]> wrote:


hi jie,
 
1. You're correct. The only language model available using VS05 is the
internal LM, which only supports up to 3-grams. You're welcome to change
the code to support more. IRSTLM is not available for VS05, and I'm not
happy with the Windows version of SRILM because it's difficult to
compile & doesn't support some things we need, like compressed files
etc.
 
2. the generation step is described in
 
http://www.iccs.inf.ed.ac.uk/~pkoehn/publications/emnlp2007-factored.pdf
the model can be created using the moses training scripts, described
in 
http://www.statmt.org/moses/?n=FactoredTraining.HomePage

 
 
 
Hieu Hoang
www.hoang.co.uk/hieu

-Original Message-
From: [EMAIL PROTECTED]
[mailto:[EMAIL PROTECTED] On Behalf Of Jie Wu
Sent: 05 December 2007 16:42
To: moses-support@mit.edu
Subject: [Moses-support] Language Model in Moses under VS 05


Hi, 

I have two questions:
1. I am studying Moses and found out that in VS05, Moses uses the
internal language model rather than SRILM. And it turns out that the
internal LM can only handle up to 3-grams. Does it mean in order to
process n-grams with n>3, I have to use the SRILM or IRSTLM? BTW, what
is Joint & Skip LM Model? 

2. What is a generation table? As far as I know, each hypothesis has a
probability score obtained by multiplying the translation, distortion
and language model costs, and the hypothesis with the highest score is
the best possible translation. Where does a generation table come into
play, and how do I generate one?


Thanks in advance
Jie

-- 
= 
Jie Wu 
Homepage: 
http://www.jiewu.info 




-- 
= 
Jie Wu 
Homepage: 
http://www.jiewu.info 



Re: [Moses-support] Language Model in Moses under VS 05

2007-12-10 Thread Jie Wu
Hi, Hieu,

Thanks for the reply. But if the Windows version of Moses only supports the
internal language model, that implies that loading language models in binary
format is not supported. I am a bit confused: which toolkit handles loading
binary language models: SRILM, IRSTLM, or Moses? If I would like a Windows
version of Moses that can load binary language models handling more than
3-grams, what would you suggest I modify?

How about phrase tables? Do they have the same problems in terms of binary
loading, Windows vs. Unix restrictions, etc.?

Thanks in advance

Jie

On Dec 6, 2007 6:19 AM, Hieu Hoang <[EMAIL PROTECTED]> wrote:

>  hi jie,
>
> 1. You're correct. The only language model available using VS05 is the
> internal LM, which only supports up to 3-grams. You're welcome to change the
> code to support more. IRSTLM is not available for VS05, and I'm not happy with
> the Windows version of SRILM because it's difficult to compile & doesn't
> support some things we need, like compressed files etc.
>
> 2. the generation step is described in
>
> http://www.iccs.inf.ed.ac.uk/~pkoehn/publications/emnlp2007-factored.pdf
> the model can be created using the moses training scripts, described
> in
> http://www.statmt.org/moses/?n=FactoredTraining.HomePage
>
>
>
> Hieu Hoang
> www.hoang.co.uk/hieu
>
>  -Original Message-
> *From:* [EMAIL PROTECTED] [mailto:
> [EMAIL PROTECTED] *On Behalf Of *Jie Wu
> *Sent:* 05 December 2007 16:42
> *To:* moses-support@mit.edu
> *Subject:* [Moses-support] Language Model in Moses under VS 05
>
> Hi,
>
> I have two questions:
> 1. I am studying Moses and found out that in VS05, Moses uses the internal
> language model rather than SRILM. And it turns out that the internal LM can
> only handle up to 3-grams. Does it mean in order to process n-grams with
> n>3, I have to use the SRILM or IRSTLM? BTW, what is Joint & Skip LM Model?
>
> 2. What is a generation table? As far as I know, each hypothesis has a
> probability score obtained by multiplying the translation, distortion and
> language model costs, and the hypothesis with the highest score is the best
> possible translation. Where does a generation table come into play, and how
> do I generate one?
>
> Thanks in advance
> Jie
>
> --
> =
> Jie Wu
> Homepage:
> http://www.jiewu.info
>
>


-- 
=
Jie Wu
Homepage:
http://www.jiewu.info


Re: [Moses-support] Language Model in Moses under VS 05

2007-12-06 Thread Hieu Hoang
hi jie,
 
1. You're correct. The only language model available using VS05 is the
internal LM, which only supports up to 3-grams. You're welcome to change
the code to support more. IRSTLM is not available for VS05, and I'm not
happy with the Windows version of SRILM because it's difficult to
compile & doesn't support some things we need, like compressed files
etc.
 
2. the generation step is described in
 
http://www.iccs.inf.ed.ac.uk/~pkoehn/publications/emnlp2007-factored.pdf
the model can be created using the moses training scripts, described
in 
http://www.statmt.org/moses/?n=FactoredTraining.HomePage
 
 
 
Hieu Hoang
www.hoang.co.uk/hieu

-Original Message-
From: [EMAIL PROTECTED]
[mailto:[EMAIL PROTECTED] On Behalf Of Jie Wu
Sent: 05 December 2007 16:42
To: moses-support@mit.edu
Subject: [Moses-support] Language Model in Moses under VS 05


Hi, 

I have two questions:
1. I am studying Moses and found out that in VS05, Moses uses the
internal language model rather than SRILM. And it turns out that the
internal LM can only handle up to 3-grams. Does it mean in order to
process n-grams with n>3, I have to use the SRILM or IRSTLM? BTW, what
is Joint & Skip LM Model? 

2. What is a generation table? As far as I know, each hypothesis has a
probability score obtained by multiplying the translation, distortion
and language model costs, and the hypothesis with the highest score is
the best possible translation. Where does a generation table come into
play, and how do I generate one?


Thanks in advance
Jie

-- 
= 
Jie Wu 
Homepage: 
http://www.jiewu.info 
