If you move the count field to the beginning of each line, you can use the
-text-has-weights switch of ngram-count:

> -text-has-weights
>     Treat the first field in each text input line as a weight factor by which 
> the N-gram counts for that line are to be multiplied.

More here:

  http://www.speech.sri.com/projects/srilm/manpages/ngram-count.1.html
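
Moving the count to the front is a small text transformation. Here is a
minimal Python sketch, assuming each line is whitespace-separated tokens
followed by one integer count, as in the sample quoted below. The function
name move_count_first is made up for illustration; it is not part of SRILM.

```python
import sys


def move_count_first(line):
    """Turn 't h e     19401194714' into '19401194714 t h e'.

    Assumes the last whitespace-separated field is an integer count and
    everything before it is the token sequence.
    """
    fields = line.split()
    if len(fields) < 2:
        raise ValueError("expected tokens followed by a count: %r" % line)
    *tokens, count = fields
    int(count)  # raises ValueError if the last field is not an integer
    return " ".join([count] + tokens)


if __name__ == "__main__":
    # Read the spaced-out 1-gram file on stdin, write weighted text to stdout.
    for raw in sys.stdin:
        if raw.strip():
            print(move_count_first(raw))
```

The output could then be passed to ngram-count with -text and
-text-has-weights, plus whatever -order and -lm settings you need. Whether
the sentence-marker and <UNK> lines should stay in the input is something
you would have to decide for your own data.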

There's also a SRILM mailing list if you need more help:

  http://www.speech.sri.com/mailman/listinfo/srilm-user/

- John Burger
  MITRE

On Jan 23, 2013, at 17:14 , Peled Guy wrote:

> Hi,
> 
> I'm working on a Transliteration project.
> The input is a word in one language and the output is the same word in 
> English (not translated).
> My language model will be created from the Google 1-gram file, with each
> letter of a word treated as a word.
> This is the original file:
> 
> </S>    95119665584
> <S>     95119665584
> ,       30578667846
> .       22077031422
> <UNK>   21594821357
> the     19401194714
> -       16337125274
> of      12765289150
> and     12522922536
> 
> This is the file after inserting spaces between the letters of each word:
> 
> t h e     19401194714
> -       16337125274
> o f      12765289150
> a n d     12522922536
> 
> Now I have a "1gram" file that contains not just 1-grams (one word per
> line), but also 2-grams, 3-grams, etc.
> How can I run the SRILM "ngram-count" script to create a language model?
> When I run the script normally, the integers are counted as words too,
> rather than being treated as probabilities or numbers of occurrences.
> 
> Can anyone help me please?
> 
> Thank you,
> Guy.
> _______________________________________________
> Moses-support mailing list
> Moses-support@mit.edu
> http://mailman.mit.edu/mailman/listinfo/moses-support


