Re: [Moses-support] train-truecaser.perl proposed tweak

2010-10-25 Thread Philipp Koehn
Hi, Sounds reasonable to me, but it would be good to have this as an option, as Miles suggested. -phi On 25 Oct 2010 17:40, "Ben Gottesman" wrote: > Hi, > > Are truecase models still widely in use? > > I have a proposal for a tweak to the train-truecaser.perl script. > > Currently, we don't tak

Re: [Moses-support] train-truecaser.perl proposed tweak

2010-10-25 Thread Miles Osborne
this sounds risky to me. it would be better to allow the user to specify the behaviour; for your suggestions, you would add an extra flag which would enable this. the default would be for truecasing to operate as it used to. Miles On 25 October 2010 17:37, Ben Gottesman wrote: > Hi, > > Are t

[Moses-support] train-truecaser.perl proposed tweak

2010-10-25 Thread Ben Gottesman
Hi, Are truecase models still widely in use? I have a proposal for a tweak to the train-truecaser.perl script. Currently, we don't take the first token of a sentence as evidence for the true casing of that type, on the basis that the first word of a sentence is always capitalized. The first tok
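The behaviour described here (not counting the sentence-initial token as casing evidence) can be illustrated with a small sketch. This is not the code of train-truecaser.perl; the function names `count_casings` and `best_casing`, the `skip_first` switch, and the toy corpus are all assumptions made for illustration, in the spirit of the optional flag suggested in the replies.

```python
from collections import Counter, defaultdict


def count_casings(sentences, skip_first=True):
    """Count surface forms per lowercased type.

    skip_first is a hypothetical switch: when True, the first token of each
    sentence is ignored, since it is capitalized regardless of its true case.
    """
    counts = defaultdict(Counter)
    for sentence in sentences:
        tokens = sentence.split()
        start = 1 if skip_first else 0
        for token in tokens[start:]:
            counts[token.lower()][token] += 1
    return counts


def best_casing(counts):
    """Pick the most frequent surface form for each lowercased type."""
    return {word: forms.most_common(1)[0][0] for word, forms in counts.items()}


# Toy corpus (invented for this sketch).
corpus = [
    "The cat saw the dog .",
    "Paris is in France .",
    "The dog saw Paris .",
]
model = best_casing(count_casings(corpus))
print(model["the"], model["paris"])
```

With `skip_first=True`, the two sentence-initial occurrences of "The" do not count towards the casing of "the", and "Paris" is learned only from its non-initial occurrence.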

Re: [Moses-support] bag of words language model

2010-10-25 Thread Philipp Koehn
Hi, I added the training script and some documentation: http://www.statmt.org/mosesdev/?n=Moses.AdvancedFeatures#ntoc25 Let me know, if this actually works. -phi On Mon, Oct 25, 2010 at 1:15 PM, Ondrej Bojar wrote: > Hi, Philipp, > > I was wondering what that secret model was... Is there any b

[Moses-support] Moses use by translation industry

2010-10-25 Thread Philipp Koehn
Hi, not a bug, but a feature: TDA Members doing business with Moses The translation industry is steadily appropriating the Moses translation engine, an open source system available as a kit on the web. At the TAUS User Conference 2010 in Portland (Oregon) TDA members from major corporations and

Re: [Moses-support] bag of words language model

2010-10-25 Thread Ondrej Bojar
Hi, Philipp, I was wondering what that secret model was... Is there any brief documentation of what the Moses code expects to load for this model? The training of this discriminative word lexicon can be heavily parallelized. Is there any such implementation available, despite not being efficient

Re: [Moses-support] bag of words language model

2010-10-25 Thread Miles Osborne
i implemented this years ago (the idea then was to see if for free-word-order languages, phrases could be generalised). at the time it didn't seem that there was a more efficient way to do it than just generate permutations and score them. and if you think about it, this is essentially the reorde
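The "generate permutations and score them" approach mentioned above can be sketched as follows. This is only a toy illustration, not the implementation referred to in the message: the bigram scores, the default penalty, and the names `score` and `best_order` are assumptions. The search enumerates all n! orderings, which is why it is only practical for very short bags of words.

```python
from itertools import permutations


def score(order, bigram_scores, default=-5.0):
    """Sum toy log-scores of adjacent word pairs; unseen pairs get a penalty."""
    return sum(bigram_scores.get(pair, default) for pair in zip(order, order[1:]))


def best_order(bag, bigram_scores):
    """Brute-force search: score every permutation of the bag, keep the best."""
    return max(permutations(bag), key=lambda order: score(order, bigram_scores))


# Toy bigram log-scores (invented for this sketch).
toy_scores = {("the", "cat"): -0.5, ("cat", "sleeps"): -0.7}
best = best_order(["sleeps", "the", "cat"], toy_scores)
print(" ".join(best))
```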

Re: [Moses-support] bag of words language model

2010-10-25 Thread Philipp Koehn
Hi, I am not familiar with that, but somewhat related is Arne Mauser's global lexical model, which also exists as a secret feature in Moses (secret because no efficient training exists): Citation: A. Mauser, S. Hasan, and H. Ney. Extending Statistical Machine Translation with Discriminative and

Re: [Moses-support] NULL token

2010-10-25 Thread Philipp Koehn
Hi, the NULL token is an implicit concept of word alignment (and it is not placed at any specific position). You can see it popping up in the lexical translation tables, but otherwise it is invisible. -phi On Mon, Oct 25, 2010 at 11:45 AM, Somayeh Bakhshaei wrote: > Hello, > > In the theory we

[Moses-support] NULL token

2010-10-25 Thread Somayeh Bakhshaei
Hello, In the theory we learn that the null token is place in the beginning of each sentence. But in the output file of a real system it is seem there is not such a token implicitly. -- Best Regards, S.Bakhshaei ___ Moses-sup