[Moses-support] Negative weights for translation model and reordering models

2015-01-29 Thread HOANG Cong Duy Vu
Hi, I trained a conventional baseline system with a translation model, a lexical reordering model (wbe-msd-bidirectional-fe-allff), and a language model. I've encountered the following problems: 1) When I *add* another hierarchical reordering model (hier-mslr-bidirectional-fe-allff, with 8 dense feature

Re: [Moses-support] Sparse features and overfitting

2015-01-15 Thread HOANG Cong Duy Vu
have to remove overlapping before moving on with other kinds of features. -- Cheers, Vu On Fri, Jan 16, 2015 at 6:31 AM, Matthias Huck wrote: > On Thu, 2015-01-15 at 13:54 +0800, HOANG Cong Duy Vu wrote: > > > > - tune & test > > (based on source) > > size of overl
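The overlap removal mentioned above could look something like the sketch below. It is only an illustration under assumptions not stated in the thread: "overlap" is taken to mean test sentences whose source side also appears in the tuning set, compared after whitespace normalization, and the names normalize and remove_overlap are hypothetical helpers, not part of Moses.

```python
def normalize(line):
    """Collapse runs of whitespace so trivially different copies compare equal."""
    return " ".join(line.split())


def remove_overlap(tune_src, test_src, test_tgt):
    """Drop test pairs whose source sentence also occurs in the tuning set.

    Assumption: overlap is an exact match on the normalized source side.
    Returns the kept (source, target) pairs and the number of pairs removed.
    """
    seen = {normalize(s) for s in tune_src}
    kept = [(s, t) for s, t in zip(test_src, test_tgt)
            if normalize(s) not in seen]
    removed = len(test_src) - len(kept)
    return kept, removed
```

Reporting the number of removed pairs alongside the filtered set makes it easy to state the size of the overlap before and after filtering.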

[Moses-support] Sparse features and overfitting

2015-01-14 Thread HOANG Cong Duy Vu
Hi, I am working on applying sparse features to a *phrase-based* system in the *conversational* domain (e.g. SMS, Chat). I used sparse features such as: TargetWordInsertionFeature, SourceWordDeletionFeature, WordTranslationFeature, PhraseLengthFeature. Sparse features are used only for top source and

Re: [Moses-support] Using evaluation metrics other than BLEU in tuning

2015-01-05 Thread HOANG Cong Duy Vu
Hi, You can use other metrics by using ZMERT together with Moses: *ZMERT toolkit*: http://cs.jhu.edu/~ozaidan/zmert/ -- Cheers, Vu On Tue, Jan 6, 2015 at 12:35 PM, Rajnath Patel wrote: > Hi All, > As we know, MOSES uses BLEU for evaluation in the tuning process. We want to > use evaluation metri

Re: [Moses-support] string of Words + states in feature functions

2014-12-10 Thread HOANG Cong Duy Vu
More: word_str = source_sent.GetWord(pos).GetString(m_factorType) -- Cheers, Vu On Wed, Dec 10, 2014 at 5:26 PM, HOANG Cong Duy Vu wrote: > Hi Amir, > > I'm implementing a feature function in moses-chart. I need the source >> words string and also their indexes in the s

Re: [Moses-support] string of Words + states in feature functions

2014-12-10 Thread HOANG Cong Duy Vu
Hi Amir, I'm implementing a feature function in moses-chart. I need the source words > string and also their indexes in the source sentence. I've written a > function that gets the source words but I don't know how to extract the word > string from a word. > Could anyone guide me on how to do that? as I kno

Re: [Moses-support] Add a new LM feature in Moses

2014-08-14 Thread HOANG Cong Duy Vu
; > > Regards, > > Christian Hadiwinoto > > > > *From:* moses-support-boun...@mit.edu [mailto: > moses-support-boun...@mit.edu] *On Behalf Of *HOANG Cong Duy Vu > *Sent:* Friday, August 15, 2014 10:44 AM > *To:* moses-support@mit.edu > *Subject:* [Moses-support] Add a n

[Moses-support] Add a new LM feature in Moses

2014-08-14 Thread HOANG Cong Duy Vu
Hi, I would like to add a new simple LM named HybLanguageModelKen (HybKen.h and HybKen.cpp) which will inherit from LanguageModelKen. In Factory.cpp, I added as follows:
...
//#include "moses/LM/Ken.h"
#include "moses/LM/HybKen.h"
...
class KenFactory : public FeatureFactory {
public:
  void Cr

Re: [Moses-support] Creating Language Model from google 1gram file

2013-01-24 Thread HOANG Cong Duy Vu
Hi, I guess you can run as follows:
build-sublm.pl --size --ngrams --sublm [--prune-singletons] [--kneser-ney|--witten-bell]
merge-sublm.pl --size --sublm -lm iARPA_LM.gz
(then with ARPA files you can use KenLM to build binary LM files) -- Cheers, Vu On Thu, Jan 24, 2013 at 6:14 AM, Pele

[Moses-support] Google Web1T 5-gram

2012-12-05 Thread HOANG Cong Duy Vu
Hi everyone, I would like to build large LMs from the Google Web1T 5-gram. I tried to use the goograms2ngrams.pl script from the IRSTLM toolkit to extract raw n-gram counts but don't know how to build LMs (e.g. ARPA file) from th

Re: [Moses-support] recaser error

2012-09-19 Thread HOANG Cong Duy Vu
Hi, You need to train the recaser first, then recase, something like this:
#train recaser
~/smt_tools/moses/scripts/recaser/train-recaser.perl -train-script ~/smt_tools/moses/scripts/training/train-model.perl -ngram-count ~/smt_tools/srilm/bin/i686-m64/ngram-count -corpus corpus/viva-phase1-final

Re: [Moses-support] minimum amount of parallel data required for SMT to perform well

2012-05-10 Thread HOANG Cong Duy Vu
Hi, You may consider reading this paper (http://aclweb.org/anthology-new/E/E12/E12-1016.pdf) to figure out the answer to your question. -- Cheers, Vu On Fri, May 11, 2012 at 9:00 AM, Wang Pidong wrote: > In my opinion, that depends on the differences between the source language > and the ta

[Moses-support] Incremental training for SMT

2011-10-05 Thread HOANG Cong Duy Vu
Hi all, I am working on developing an SMT system that can learn incrementally. The scenario is as follows: - A state-of-the-art SMT system translates a source language sentence from users. - Users identify some translation errors in the translated sentence and then give