Re: [Moses-support] How to score sentences using SRILM

2012-09-06 Thread tharaka weheragoda
ophe.ser...@gmail.com> wrote: > Hi Tharaka, > You can compute perplexity with the "ngram" tool using the -ppl switch, and you can add the switch -debug <1,2...> to get more detail. > > Cheers, > > Christophe > > > L
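
For readers following the reply above: a typical invocation would look like `ngram -lm <model> -ppl <sentences-file> -debug 1`, where the model and file names are placeholders and `-lm` is not mentioned in the thread. The sketch below is not part of the original messages; it only illustrates how the perplexity figures relate to the log10 probability and the word, sentence, OOV and zero-probability counts, assuming the formula given in the SRILM documentation. The function name and arguments are hypothetical.

```python
def srilm_perplexity(logprob, words, sentences, oovs=0, zeroprobs=0):
    """Return (ppl, ppl1) from a base-10 log probability and token counts.

    Assumes the SRILM convention: ppl also counts one end-of-sentence
    token per sentence, ppl1 does not.
    """
    # Words that actually received a probability from the model.
    scored = words - oovs - zeroprobs
    if scored <= 0:
        raise ValueError("no scored words; perplexity is undefined")
    ppl = 10 ** (-logprob / (scored + sentences))
    ppl1 = 10 ** (-logprob / scored)
    return ppl, ppl1


if __name__ == "__main__":
    # Made-up numbers for illustration only, not taken from any real run.
    print(srilm_perplexity(logprob=-6.0, words=2, sentences=1))
```

A lower perplexity means the language model finds the sentence more likely, so sentences can be ranked by this value.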

[Moses-support] How to score sentences using SRILM

2012-09-06 Thread tharaka weheragoda
Hi All, I need to use SRILM to score different sentences. Are there any scripts available for this, or do I need to write my own script? Can anybody guide me on how to do this? Thanks in advance, Tharaka ___ Moses-support mailing list Moses-support@mit.edu h

Re: [Moses-support] Moses-support Digest, Vol 69, Issue 66

2012-07-26 Thread tharaka weheragoda
Hi everybody, I'm new to Moses and I'm working on English-Sinhala (the language used in Sri Lanka) translation. When I used the provided tokenizer to tokenize my Sinhala data file, I found that it tokenizes incorrectly, so I created a nonbreaking_prefix file for Sinhala. But it's not working. I

Re: [Moses-support] tokenizer problem

2012-05-30 Thread tharaka weheragoda
les in this folder all have the same name with a 2-letter language code extension. These files have language-specific rules for how the tokenizer & detokenizer work. > > Anyone, is there a better resource than reading the existing files to learn

Re: [Moses-support] tokenizer problem

2012-05-30 Thread tharaka weheragoda
r & detokenizer work. > > Anyone, is there a better resource than reading the existing files to learn how the files work? > > Tom > > On Wed, 30 May 2012 18:22:52 +0530, tharaka weheragoda <tharakadum...@gmail.com> wrote: > > Thank you very

Re: [Moses-support] tokenizer problem

2012-05-30 Thread tharaka weheragoda
> > On Wed, 30 May 2012 17:37:19 +0530, tharaka weheragoda <tharakadum...@gmail.com> wrote: > > Hi everybody, > > When I'm trying to tokenize my Sinhala dataset it gives me a warning message like this > "WARNING: No known abbreviations for language '

[Moses-support] tokenizer problem

2012-05-30 Thread tharaka weheragoda
Hi everybody, When I try to tokenize my Sinhala dataset, it gives me a warning message like this: "WARNING: No known abbreviations for language 'si', attempting fall-back to English version..." And my letters have changed a bit. Is there any way to tokenize Sinhala data with this tokenizer.p

[Moses-support] tuning set

2012-05-14 Thread tharaka weheragoda
Hi, I'm new to this field and I'm confused about the use of a tuning set. What is the purpose of using a tuning set here? Thanks in advance ___ Moses-support mailing list Moses-support@mit.edu http://mailman.mit.edu/mailman/listinfo/moses-support

[Moses-support] minimum amount of parallel data required for SMT to perform well

2012-05-10 Thread tharaka weheragoda
Hi All, If anybody knows about the minimum amount of parallel data required for SMT to perform well please let me know. Thanks in advance! Tharaka

[Moses-support] giza ++ error

2012-04-21 Thread tharaka weheragoda
Hi, When I train the phrase table I get these errors and execution terminates. nohup: ignoring input Using SCRIPTS_ROOTDIR: /home/tharaka/Desktop/project/tools/moses/scripts Using single-thread GIZA (1) preparing corpus @ Sun Apr 22 00:41:50 IST 2012 Executing: mkdir -p work/corpus (1.0) selecting

[Moses-support] giza ++ error

2012-04-21 Thread tharaka weheragoda
Hi, I tried to train the phrase table with GIZA++. I got these errors in the training.out file. nohup: ignoring input Using SCRIPTS_ROOTDIR: /home/tharaka/Desktop/project/tools/moses/scripts Using single-thread GIZA (1) preparing corpus @ Sat Apr 21 20:34:43 IST 2012 Executing: mkdir -p work/corpus (