Re: [Moses-support] About Bilingual LM in Moses

2019-04-15 Thread Rico Sennrich
Hello Ergun, we've had the 'nan' issue reported before ( see https://moses-support.mit.narkive.com/hs8LwsnT/blingual-neural-lm-log-likelihood-nan https://moses-support.mit.narkive.com/fklzlBiW/bilingual-lm-nan-nan-nan ). You can follow Nick's recommendation of lowering the learning rate, or try

[Moses-support] Post-doctoral Researcher Position at the University of Zurich

2019-03-08 Thread Rico Sennrich
arch statement (up to two pages) * a CV, including a list of publications * two references (name, position and e-mail address) The application deadline is *31 March 2019*. For informal inquiries, please also contact sennrich AT cl.uzh.ch. -- Rico Sennri

[Moses-support] Fully Funded Four-Year PhD Studentships at University of Edinburgh

2019-03-05 Thread Rico Sennrich
(though we may be able to consider late applications). Please direct inquiries to the PhD admissions team at cdt-nlp-admissi...@inf.ed.ac.uk. -- Rico Sennrich School of Informatics University of Edinburgh 10 Crichton Street Edinburgh, EH8 9AB, United Kingdom http://homepages.inf.ed.ac.uk/rsennric

[Moses-support] Funded PhD position at University of Edinburgh on spoken language translation (UK/EU)

2018-11-16 Thread Rico Sennrich
ls We will start selecting candidates in December 2018, so we recommend applying by 30/11/2018. Please contact Barry Haddow (bhaddow at inf.ed.ac.uk) or Rico Sennrich (rico.sennrich at inf.ed.ac.uk) for more information. The University of Edinburgh is a charitable body, registered in Sc

Re: [Moses-support] NPLM and GPU

2018-07-26 Thread Rico Sennrich
Hello Claudia, we got this to work by compiling NPLM against eigen-magma, which has GPU support: https://github.com/bravegag/eigen-magma I'm afraid eigen-magma isn't being maintained though, so I don't know if this still works. best wishes, Rico On 26/07/18 18:59, Claudia Matos Veliz

Re: [Moses-support] Web Translation Not working

2018-03-13 Thread Rico Sennrich
Hello Adeeb, http://demo.statmt.org is now based on neural machine translation technology, and does not yet support translating a webpage. The moses demo, which still has support for webpage translation, is currently still available at http://demo2.statmt.org/ However, this demo may be

[Moses-support] PhD studentships at the University of Edinburgh

2018-02-06 Thread Rico Sennrich
PhD Studentships in Computational Linguistics, Speech Technology and Cognitive Science The Institute for Language, Cognition and Computation (ILCC) at the University of Edinburgh invites applications for three-year PhD studentships starting in September 2018. ILCC is dedicated to the pursuit of

[Moses-support] PhD studentships at the University of Edinburgh

2017-11-09 Thread Rico Sennrich
PHD STUDENTSHIPS IN COMPUTATIONAL LINGUISTICS, SPEECH TECHNOLOGY AND COGNITIVE SCIENCE Institute for Language, Cognition and Computation School of Informatics University of Edinburgh The Institute for Language, Cognition and Computation (ILCC) at the University of Edinburgh invites

Re: [Moses-support] MWE feature function in Moses

2017-11-01 Thread Rico Sennrich
Hello Mukund, the phrase table supports an arbitrary number of features - if you have a way to count the number of MWEs in the source phrase (this is language-specific and not part of Moses), you can just modify the phrase table to add this count as an additional feature on each line. You can
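
The suggestion above (adding an MWE count as an extra score on each phrase-table line) could be sketched roughly as follows. This is a minimal illustration, not part of Moses: `add_mwe_feature` and `toy_count_mwes` are hypothetical names, the MWE list is made up, and the sketch assumes the usual ` ||| `-separated text format with the scores in the third field. It also assumes that phrase-table scores are log-transformed when loaded, which is why it stores exp(count) rather than the raw count; verify this against your Moses version.

```python
import math


def toy_count_mwes(source_phrase, mwes=("kick the bucket", "by and large")):
    # Hypothetical, language-specific counter: counts how many of the
    # listed multi-word expressions occur in the source phrase.
    padded = f" {source_phrase.strip()} "
    return sum(padded.count(f" {mwe} ") for mwe in mwes)


def add_mwe_feature(line, count_mwes=toy_count_mwes):
    """Append one extra score to a phrase-table line.

    Assumed layout: source ||| target ||| scores ||| ... (other fields kept as-is).
    """
    fields = line.rstrip("\n").split(" ||| ")
    source, scores = fields[0], fields[2].strip()
    # Assumption: scores are log-transformed on loading, so exp(count)
    # makes the feature value equal to the count itself.
    extra = format(math.exp(count_mwes(source)), ".6f")
    fields[2] = f"{scores} {extra}"
    return " ||| ".join(fields)


line = "kick the bucket ||| sterben ||| 0.5 0.4 0.3 0.2 ||| 0-0 1-0 2-0 ||| 2 2 2"
print(add_mwe_feature(line))
```

If you add a score this way, the number of features declared for the phrase table in moses.ini presumably has to be increased to match.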

Re: [Moses-support] Convert parallel text files to sgm format

2017-05-18 Thread Rico Sennrich
Hi Roee, I know people have written scripts like this in the past, but I don't know of any that is public. I also have a variant of multi-bleu.perl that takes in plain text, but re-uses the internal tokenization of mteval-v13a.pl. As a consequence, you can pass it untokenized references and

Re: [Moses-support] How to compile RDLM

2017-04-25 Thread Rico Sennrich
in On Tue, Apr 25, 2017 at 2:25 PM, Rico Sennrich <rico.sennr...@gmx.ch> wrote: Hello Xin, you should compile moses with the option "--with-nplm=". best wishes, Rico On 25/04/17 13:14, dai xin wrote: Hi all, I did

Re: [Moses-support] How to compile RDLM

2017-04-25 Thread Rico Sennrich
Hello Xin, you should compile moses with the option "--with-nplm=NPLM toolkit>". best wishes, Rico On 25/04/17 13:14, dai xin wrote: Hi all, I did a training with RD and NP language model. When I tried to decode with ~/mosesdecoder/bin/moses -f /path/to/moses.ini the error message came

Re: [Moses-support] Did anyone tried Edinburgh English-German syntax system for WMT 2015?

2017-04-24 Thread Rico Sennrich
parser. But I have no idea why there is no file in this directory. RDLM:train-custom-syntax crashed. Is the problem could be lack of any libraries, or the version of Moses? Thanks in advance and hoping for reply. Best regards, Xin On Mon, Apr 24, 2017 at 9:53 AM, Rico Sennrich <rico.se

Re: [Moses-support] Did anyone tried Edinburgh English-German syntax system for WMT 2015?

2017-04-24 Thread Rico Sennrich
Hello Xin, what is the error message for the filtering step? Look at the STDERR and STDOUT files that are being produced in the step to see if there's an error message in there. best wishes, Rico On 20/04/17 12:41, dai xin wrote: Hi, Did anyone have experience of Edinburgh English-German

Re: [Moses-support] German compound splitter

2017-02-01 Thread Rico Sennrich
r To: "moses-support@mit.edu" <moses-support@mit.edu> Thank you, Rico! Looks promising. I found this one on Python's PyPI repository: https://pypi.python.org/pypi/SoMaJo/1.1.2 Does anyone have any experience with it? Tom On 8/25/2016 11:01 PM, moses-support-requ...@mit.

Re: [Moses-support] RNN-based features in Moses

2016-12-08 Thread Rico Sennrich
Hello Hakimeh, the branch https://github.com/moses-smt/mosesdecoder/tree/nmt-hybrid supports NMT (which is basically an RNN conditioned on the source text) as a feature function in Moses. It is described in this paper: @inproceedings{junczysdowmunt-dwojak-sennrich:2016:WMT, address =

Re: [Moses-support] Ensemble of Neural Machine Translation systems

2016-11-09 Thread Rico Sennrich
Hi Nat, The reason for averaging at every time step (rather than doing k-best list reranking on the sentence level) is the same reason why we integrate new feature functions in Moses instead of just reranking the k-best output: you make more search errors if you do your initial search with a

Re: [Moses-support] Ensemble of Neural Machine Translation systems

2016-11-03 Thread Rico Sennrich
Hello Nat, for NMT ensembles, you just average the probability distribution of different models at each time step before selecting the next hypothesis (or hypotheses in beam search). If you're familiar with Moses, this is similar to what happens when we combine different feature functions in
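
The per-time-step averaging described in this reply can be sketched as below. This is only an illustration under simplifying assumptions: each model's next-token distribution is given as a plain dict from token to probability, the two toy distributions are invented, and `ensemble_step` and `greedy_next_token` are hypothetical names rather than functions of any NMT toolkit.

```python
def ensemble_step(distributions):
    """Average the next-token distributions of several models.

    distributions: list of dicts mapping token -> probability, one per model.
    """
    vocab = set().union(*distributions)
    n = len(distributions)
    return {tok: sum(d.get(tok, 0.0) for d in distributions) / n for tok in vocab}


def greedy_next_token(distributions):
    # With beam search one would keep the k best tokens per hypothesis
    # instead of only the single best one.
    averaged = ensemble_step(distributions)
    return max(averaged, key=averaged.get)


# Toy distributions for one time step (made-up numbers).
model_a = {"Haus": 0.6, "Heim": 0.3, "Gebäude": 0.1}
model_b = {"Haus": 0.2, "Heim": 0.7, "Gebäude": 0.1}
print(greedy_next_token([model_a, model_b]))
```

The averaged distribution is then used to extend the hypothesis, and the same step is repeated for the next position with all models conditioned on the same prefix.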

Re: [Moses-support] German compound splitter

2016-08-24 Thread Rico Sennrich
Hi Tom, I've been using this one for the Edinburgh WMT submission (EN-DE syntax-based) in the last 3 years: https://github.com/rsennrich/wmt2014-scripts/blob/master/hybrid_compound_splitter.py It implements the hybrid (frequency-based and FST-based) algorithm by Fritzinger & Fraser 2010:

Re: [Moses-support] TreeTagger and format with pipes for Factored Model in moses

2016-07-18 Thread Rico Sennrich
Hi Floran, if you have one file with words, and one file with POS, you can combine the two with the combine_factors.pl script in mosesdecoder/scripts/training. best wishes, Rico On 18.07.2016 10:44, Gmehlin Floran wrote: Hi, I would like to try a Factored Training on my corpus. I see that
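
To illustrate the pipe-separated factored format that results from combining a word file with a POS file, here is a small sketch. It is not a reimplementation of combine_factors.pl; the function name and the example sentence are assumptions made for illustration, and it expects both inputs to be tokenized identically.

```python
def combine_factors(words_line, pos_line):
    """Join parallel word and POS sequences into word|POS tokens."""
    words, tags = words_line.split(), pos_line.split()
    if len(words) != len(tags):
        raise ValueError("word and POS lines must have the same number of tokens")
    return " ".join(f"{word}|{tag}" for word, tag in zip(words, tags))


print(combine_factors("the house is small", "DT NN VBZ JJ"))
```

Applied line by line to the two files, this yields one factored sentence per line.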

Re: [Moses-support] Usage of query command with RDLM

2016-07-09 Thread Rico Sennrich
04:42, Madori Ikeda wrote: Rico Sennrich <rico.sennrich@...> writes: Hello Madori, the query command is specific to n-gram LMs in the ARPA format (or a compiled format of KenLM). Here is how you can me

Re: [Moses-support] Usage of query command with RDLM

2016-07-07 Thread Rico Sennrich
Hello Madori, the query command is specific to n-gram LMs in the ARPA format (or a compiled format of KenLM). Here is how you can measure log probabilities with RDLM (or NPLM in general): 1. extract the n-grams (for NPLM) or syntactic n-grams (for RDLM) from the test set, with the same

Re: [Moses-support] Mosesserver terminates with "girerr::error"

2016-04-01 Thread Rico Sennrich
Hi Mathias, you're passing a boolean as the value of 'word-align', but apparently, the current version of moses server requires the value to be a string. I don't know why this was changed... best wishes, Rico On 22.03.2016 09:07, Mathias Müller wrote: Dear list Since I got recent

Re: [Moses-support] Polysynthetic languages?

2016-02-01 Thread Rico Sennrich
Hi Mike, here's a link to the tool Marcin mentioned: https://github.com/rsennrich/subword-nmt I haven't tried it on phrase-based MT myself, but feel free to give it a try. You could also try other unsupervised morpheme segmenters like morfessor: https://github.com/aalto-speech/morfessor

Re: [Moses-support] different versions of moses yielding different translations

2015-11-26 Thread Rico Sennrich
Hi list, Marcin, I've added a regtest that covers multimodel with compact phrase tables (phrase.multimodel-compactptable), and I've identified the offending commit with git bisect to be commit a804894378b2695bde78bdbff10e9d0f0afb7cc7. @Marcin: do you have an idea what could have caused the

Re: [Moses-support] baseline-system has very low BLEU-Score

2015-11-18 Thread Rico Sennrich
ore of this test set and the translation of the test set with multi-bleu.perl I get the poor result of 3.76. Do you have any tips on how to find my mistake? Thanks a lot, Raphael On 18.11.2015 at 14:56, Rico Sennrich wrote: Hell

Re: [Moses-support] baseline-system has very low BLEU-Score

2015-11-18 Thread Rico Sennrich
Hello Raphael, I suggest that you check if you mixed up the languages somewhere, and check if your translation output is actually English. 3.76 BLEU is possible to achieve without translation (because names and some function words are the same between English and German), and it's possible

Re: [Moses-support] NPLM with Europarl ?

2015-10-22 Thread Rico Sennrich
Hi Vincent, here's some results on WMT data (not just Europarl): +0.6 BLEU for English->German (https://tacl2013.cs.columbia.edu/ojs/index.php/tacl/article/download/510/120) +/-0 for French->English, +0.7 for English->French, +0.4 for Finnish->English, +0.4 for English->Finnish

Re: [Moses-support] word aligner with model dump

2015-10-11 Thread Rico Sennrich
Hello Loic, mgiza (https://github.com/moses-smt/mgiza) supports model dumps, and you can have a look at the script 'force-align-moses.sh' to see how to apply an existing model to new text. best wishes, Rico On 07/10/15 16:38, DUGAST, LOIC wrote: Hello Would any of you know which of the

Re: [Moses-support] (no subject)

2015-10-05 Thread Rico Sennrich
wish to know, why score is higher for dummy 5-KENLM than 3-KENLM. On Mon, Oct 5, 2015 at 7:07 PM, Rico Sennrich <rico.sennr...@gmx.ch> wrote: Hi Sanjanasri, 1) your corpus is very small, and you may have to use more iterations of N

Re: [Moses-support] (no subject)

2015-10-05 Thread Rico Sennrich
, Sanjanashree Palanivel <sanjanash...@gmail.com> wrote: Dear Rico, Thanks a lot for your excellent guidance. On Sat, Sep 19, 2015 at 9:10 PM, Rico Sennrich <rico.sennr...@gmx.ch>

Re: [Moses-support] Blingual neural lm, log-likelihood: -nan

2015-09-21 Thread Rico Sennrich
Hi all, Small correction: --dropout isn't on Github (yet). I never got gains from it, and thus didn't commit. I'll have to double-check my implementation. --input_dropout also didn't give me any gains, but could make training more stable (helping against nan), and is helpful if you want to get

Re: [Moses-support] (no subject)

2015-09-19 Thread Rico Sennrich
<sanjanash...@gmail.com> wrote: Dear Rico, Thanks a lot. Will do the necessary changes On Thu, Sep 17, 2015 at 1:54 PM, Rico Sennrich <rico.sennr...@gmx.ch> wrote: Hi San

Re: [Moses-support] (no subject)

2015-09-17 Thread Rico Sennrich
del LM1= 0.5 Then i did testing. and end up with the error On Tue, Sep 15, 2015 at 8:43 PM, Rico Sennrich <rico.sennr...@gmx.ch> wrote: Hi Sanjanasri, this error occurs when Moses was compiled without the option '--with-n

Re: [Moses-support] (no subject)

2015-09-15 Thread Rico Sennrich
alanivel <sanjanash...@gmail.com> wrote: Thank you for your earnest response. I will update moses and I will try On Tue, Sep 15, 2015 at 4:22 PM, Rico Sennrich <rico.sennr...@gmx.ch>

Re: [Moses-support] (no subject)

2015-09-15 Thread Rico Sennrich
Hello Sanjanasri, this looks like a version mismatch between Moses and NPLM. Specifically, you're using an older Moses commit that is only compatible with nplm 0.2 (or specifically, Kenneth's fork at https://github.com/kpu/nplm ). If you use the latest Moses version from

Re: [Moses-support] Performance issue with Neural LM for English-Hindi SMT

2015-09-13 Thread Rico Sennrich
Hello Raj, Usually, nplm is used in addition to a back-off LM for best results. That being said, your results indicate that nplm is performing poorly. If you have little training data, a smaller vocabulary size and more training epochs may be appropriate. I would advise to provide a

Re: [Moses-support] Failure to Open Output when using Chart Decoder

2015-09-01 Thread Rico Sennrich
Hello Shyam, this is probably not a bug in the code (this is a check in std::ostream), but a problem with the location you're trying to write to. Can you double-check if your path to the n-best-list is correct, and that you can write to it? best wishes, Rico On 01.09.2015 00:36, Shyam

Re: [Moses-support] Domain adaptation

2015-08-14 Thread Rico Sennrich
Hi Vincent, this section describes some domain adaptation methods that are implemented in Moses: http://www.statmt.org/moses/?n=Advanced.Domain It is incomplete (focusing on parallel data and the translation model), and does not recommend best practices. In general, my recommendation is to

Re: [Moses-support] Normalization of string-to-tree rules

2015-08-13 Thread Rico Sennrich
Hi Fabienne, there are three different implementations for GHKM extraction p(LHS,RHS_t|RHS_s,target_nonterminals) (default) p(RHS_t|RHS_s,LHS) (-alt-direct-rule-score-1) p(LHS,RHS_t|RHS_s) (-alt-direct-rule-score-2) by default, your rules 1 and 2 are not competing, because normalization is

Re: [Moses-support] EMS results - makes sense ?

2015-08-10 Thread Rico Sennrich
Hi Vincent, the KIT paper reports scores on newstest2010 (and newstest2009) in their system description paper, while the matrix shows scores on newstest2011. The UEDIN WMT14 paper reports scores on newstest2012, newstest2013, and newstest2014 (it may admittedly be hard to see which is which:

Re: [Moses-support] Character alignment

2015-07-30 Thread Rico Sennrich
Hello Fatma, Moses has been used for character-level translation, but the alignment is not done by moses itself, but by external tools like (M)GIZA++ or fast-align. Jörg Tiedemann has some suggestions on how to represent your text to get good character alignments with GIZA++:

Re: [Moses-support] Low NCE log-likelihood on Bilingual Neural LM

2015-07-21 Thread Rico Sennrich
Hello Jian, NPLM reports the log-likelihood of the whole training set, and the number is plausible. assuming you have a minibatch size of 1000, your training set perplexity is exp(1.38122e+08/52853/1000)=13.64 you probably want to measure perplexity on a held-out development set though,
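
The arithmetic in this reply can be reproduced with a few lines of Python. The sketch assumes, as the reply does, a minibatch size of 1000, and it further assumes that 52853 is the number of minibatches, so that their product is the number of training examples; the function name is hypothetical.

```python
import math


def training_perplexity(abs_log_likelihood, num_minibatches, minibatch_size):
    """Perplexity = exp(|total log-likelihood| / number of training examples)."""
    num_examples = num_minibatches * minibatch_size
    return math.exp(abs_log_likelihood / num_examples)


# Values taken from the message; the minibatch size is the reply's assumption.
ppl = training_perplexity(1.38122e08, 52853, 1000)
print(round(ppl, 2))
```

The same function applies to a held-out development set once its log-likelihood and number of examples are known.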

Re: [Moses-support] Moses build failed

2015-07-15 Thread Rico Sennrich
Hello Manuela, have you installed all the dependencies listed here for Cygwin installation? http://www.statmt.org/moses/?n=Development.GetStarted it seems like you're missing a system-wide installation of boost. mosesdecoder/bjam currently isn't set-up to use the boost-jam of a local boost

Re: [Moses-support] tuning with BLEU-1

2015-07-08 Thread Rico Sennrich
Researcher New York University, Abu Dhabi http://www.hoang.co.uk/hieu On 8 July 2015 at 13:49, Rico Sennrich <rico.sennr...@gmx.ch> wrote: Hi Hieu, kBleuNgramOrder=4 is hard-coded in mosesdecoder/mert/BleuScorer.h . Dunno if you wanna expose the option

Re: [Moses-support] tuning with BLEU-1

2015-07-08 Thread Rico Sennrich
Hi Hieu, kBleuNgramOrder=4 is hard-coded in mosesdecoder/mert/BleuScorer.h . Dunno if you wanna expose the option to the command line, or just hack it locally. best wishes, Rico On 08.07.2015 10:19, Hieu Hoang wrote: does anyone know how I can tune with BLEU-1 or 2 instead of the default

Re: [Moses-support] NPLM and BilingualNPLM not working as expected in Moses

2015-07-06 Thread Rico Sennrich
Hello Raj, can you please clarify if you tried to train a monolingual LM (NeuralLM), a bilingual LM (BilingualNPLM), or both? Our previous experiences with BilingualNPLM are mixed, and we observed improvements for some tasks and language pairs, but not for others. See for instance:

Re: [Moses-support] NPLM and BilingualNPLM not working as expected in Moses

2015-07-06 Thread Rico Sennrich
. What could have gone wrong during the training? Regards. On Mon, Jul 6, 2015 at 10:53 PM, Rico Sennrich <rico.sennr...@gmx.ch> wrote: Hello Raj, can you please clarify if you tried to train a monolingual LM (NeuralLM), a bilingual LM (BilingualNPLM

Re: [Moses-support] Fwd: A small typo in Moses manual

2015-06-25 Thread Rico Sennrich
Hi Guchun. thanks - fixed. best wishes, Rico On 25.06.2015 16:05, Guchun Zhang wrote: Hi there, In Section 5.13.7 NPLM on Page 267, the option --words_file passed to prepareNeuralLM expects an existing file containing the words to be added in the vocabulary. Considering the line right

Re: [Moses-support] Major bug found in Moses

2015-06-20 Thread Rico Sennrich
On 19/06/15 19:21, Marcin Junczys-Dowmunt wrote: So, if anything, Moses is just a very flexible text-rewriting tool. Tuning (and data) turns into a translator, GEC tool, POS-tagger, Chunker, Semantic Tagger etc. that's a good point, and the basis of some criticism that can be levelled at the

Re: [Moses-support] please help me with the code - getting word index

2015-06-20 Thread Rico Sennrich
Hi Amir, There is currently no method that returns this, but BilingualLM (moses/LM/BilingualLM) calculates and uses the absolute source position of each terminal - search for absolute_source_position. best wishes, Rico On 20/06/15 14:35, amir haghighi wrote: Thanks Matthias

Re: [Moses-support] problem in translation

2015-06-19 Thread Rico Sennrich
you, Fatma El-Zahraa El-Taher Teaching Assistant at Computer System department Faculty of Engineering, Azhar University Email: fatmaelta...@gmail.com mobile: +201141600434 On Fri, Jun 19, 2015 at 3:33 PM, Rico Sennrich rico.sennr...@gmx.ch

Re: [Moses-support] problem in translation

2015-06-19 Thread Rico Sennrich
Hi Fatma, 2800 words is a very small dataset - typical systems are trained on millions of sentence pairs. It's possible that your word is in the training data, but not in the phrase table because it wasn't correctly aligned. Gregor's suggestion is also worth investigating. You should

Re: [Moses-support] problem in translation

2015-06-19 Thread Rico Sennrich
fatma elzahraa Eltaher fatmaeltaher@... writes: Dears, I have a problem in translation. After building Moses model , I try to test it by a  word but the output was the same word. I did not know where is the problem? could you help me? kindly find attached pic. thank you, hello

Re: [Moses-support] Major bug found in Moses

2015-06-19 Thread Rico Sennrich
Marcin Junczys-Dowmunt junczys@... writes: Hi Rico, since you are at it, some pointers to the more advanced pruning techniques that do perform better, please :) On 19.06.2015 19:25, Rico Sennrich wrote: [sorry for the garbled message before] you are right. The idea is pretty

Re: [Moses-support] Major bug found in Moses

2015-06-19 Thread Rico Sennrich
Read, James C jcread@... writes: So, all I did was filter out the less likely phrase pairs and the BLEU score shot up. Was that such a stroke of genius? Was that not blindingly obvious?  you are right. The idea is pretty obvious. It roughly corresponds to 'Histogram pruning' in this paper:

Re: [Moses-support] Major bug found in Moses

2015-06-19 Thread Rico Sennrich
[sorry for the garbled message before] you are right. The idea is pretty obvious. It roughly corresponds to 'Histogram pruning' in this paper: Zens, R., Stanton, D., Xu, P. (2012). A Systematic Comparison of Phrase Table Pruning Techniques. In Proceedings of the 2012 Joint Conference on
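
As a rough illustration of the kind of filtering discussed in this thread, the sketch below keeps only the highest-scoring translations for each source phrase. It is a simplification under stated assumptions: entries are (source, target, score) tuples, the score stands in for something like p(e|f), and `histogram_prune` is a hypothetical name. The definition of histogram pruning in the cited paper may differ in its details.

```python
from collections import defaultdict


def histogram_prune(entries, limit):
    """Keep at most `limit` best-scoring targets per source phrase.

    entries: iterable of (source, target, score) tuples.
    """
    by_source = defaultdict(list)
    for source, target, score in entries:
        by_source[source].append((score, target))

    pruned = []
    for source, candidates in by_source.items():
        # Sort by score, best first, and cut off after `limit` candidates.
        for score, target in sorted(candidates, reverse=True)[:limit]:
            pruned.append((source, target, score))
    return pruned


table = [
    ("haus", "house", 0.7),
    ("haus", "home", 0.2),
    ("haus", "building", 0.1),
    ("hund", "dog", 0.9),
]
for entry in histogram_prune(table, limit=2):
    print(entry)
```

A fixed limit per source phrase is the simplest choice; threshold-based or significance-based criteria would need a different selection rule.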

Re: [Moses-support] How to tell EMS to use an existing binary LM

2015-06-19 Thread Rico Sennrich
-model.perl line 479 On Fri, Jun 19, 2015 at 2:17 PM, Rico Sennrich <rico.sennr...@gmx.ch> wrote: Lane Schwartz dowobeha@... writes: Hi, I've looked through experiment.meta and the samples, and I haven't been able to figure this out. I

Re: [Moses-support] How to tell EMS to use an existing binary LM

2015-06-19 Thread Rico Sennrich
Lane Schwartz dowobeha@... writes: Hi, I've looked through experiment.meta and the samples, and I haven't been able to figure this out. I know this is simple, but I'm missing the syntax. How can I tell EMS to use an existing binarized LM that was trained previously? Thanks, Lane

Re: [Moses-support] Major bug found in Moses

2015-06-17 Thread Rico Sennrich
Read, James C jcread@... writes: I have been unable to find a logical explanation for this behaviour other than to conclude that there must be some kind of bug in Moses which causes a TM only run of Moses to perform poorly in finding the most likely translations according to the TM when there

Re: [Moses-support] Major bug found in Moses

2015-06-17 Thread Rico Sennrich
Read, James C jcread@... writes: Actually the approximation I expect to be: p(e|f)=p(f|e) Why would you expect this to give poor results if the TM is well trained? Surely the results of my filtering experiments prove otherwise. James I recommend you read the following:

[Moses-support] c++11 support

2015-06-16 Thread Rico Sennrich
Hi list, some code in mosesdecoder (oxlm, c++tokenizer) already requires c++11. To let people benefit from the usability and functionality improvements of c++11, it would be beneficial to allow the use of c++11 features in all of the code. before people start making big changes to the codebase,

Re: [Moses-support] kbmira error

2015-06-08 Thread Rico Sennrich
Hieu Hoang hieuhoang@... writes: Hi All Does anyone know why I get this error?    # $MOSES_DIR/bin/kbmira  --dense-init run3.dense --sparse-init run3.sparse-weights  --ffile run1.features.dat --ffile run2.features.dat --ffile run3.features.dat --scfile run1.scores.dat --scfile

Re: [Moses-support] keep some features fixed when tuning

2015-05-22 Thread Rico Sennrich
constituted by 574 segments). Vito M. 2015-05-20 14:38 GMT+02:00 Rico Sennrich <rico.sennr...@gmx.ch>: Matthias Huck mhuck@... writes: Hi Vito, tuneable=false should work. Just my usual caveat: if you use 'tuneable=false', the feature

Re: [Moses-support] keep some features fixed when tuning

2015-05-20 Thread Rico Sennrich
Matthias Huck mhuck@... writes: Hi Vito, tuneable=false should work. Just my usual caveat: if you use 'tuneable=false', the feature score(s) won't be reported to the n-best list, and MERT/MIRA/PRO won't even know that the feature exists. This is appropriate in some cases (keeping a

Re: [Moses-support] How to tell EMS to concatenate training corpora

2015-05-20 Thread Rico Sennrich
Lane Schwartz dowobeha@... writes: I have a number of distinct monolingual corpora. I've been training them as separate LMs. I now want to run a variant where they are all concatenated together, and then trained as a single LM. The EMS walkthrough says this should be possible

Re: [Moses-support] How to tell EMS to concatenate training corpora

2015-05-18 Thread Rico Sennrich
Lane Schwartz dowobeha@... writes: I have a number of distinct monolingual corpora. I've been training them as separate LMs. I now want to run a variant where they are all concatenated together, and then trained as a single LM. The EMS walkthrough says this should be possible

Re: [Moses-support] How to tell EMS to concatenate training corpora

2015-05-18 Thread Rico Sennrich
Lane Schwartz dowobeha@... writes: I have a number of distinct monolingual corpora. I've been training them as separate LMs. I now want to run a variant where they are all concatenated together, and then trained as a single LM. The EMS walkthrough says this should be possible

Re: [Moses-support] Problem with corpus preparation

2015-03-28 Thread Rico Sennrich
On 28/03/15 13:26, Abdelfetah Boumerdas wrote: Hi Rico, Thank you so much for your help, the deescape-special-chars.perl code did the job perfectly and removed all the special xml chars. Now i have another question, i followed the moses manual and trained moses on the news commentary corpus

Re: [Moses-support] Problem with corpus preparation

2015-03-26 Thread Rico Sennrich
Abdelfetah Boumerdas aa_boumerdas@... writes: Hi All, i'm trying to build a translation model using moses, and to do that i'm using 2 corpora (europarl and the news commentary corpus provided in the manual) but when i reached the corpus preparation step i noticed the following problem:

Re: [Moses-support] mert-moses.pl

2015-03-07 Thread Rico Sennrich
if it fit or not? thanks On Friday, March 6, 2015 1:12 PM, Rico Sennrich rico.sennr...@gmx.ch wrote: mohamed hasanien mhmd_hasnen@... writes: HI all, Line 4739: Collecting options took 0.562 seconds at moses/Manager.cpp:117 sh: line 1: 13550 Killed /mhmd

Re: [Moses-support] mert-moses.pl

2015-03-06 Thread Rico Sennrich
mohamed hasanien mhmd_hasnen@... writes: HI all, Line 4739: Collecting options took 0.562 seconds at moses/Manager.cpp:117 sh: line 1: 13550 Killed                  /mhmd/mosesdecoder/bin/moses -config filtered/moses.ini -weight-overwrite 'PhrasePenalty0= 0.043478 WordPenalty0= -0.217391

Re: [Moses-support] Target-syntax

2015-02-27 Thread Rico Sennrich
Massinissa Ahmim massinissa.ahmim@... writes: Dear all, I'm trying to train a syntactic model english to german. I did the annotation on the target part using bitpar and ran: nohup /mosesdecoder/scripts/training/train-model.perl --glue-grammar --max-phrase-length 10

Re: [Moses-support] Untuneable feature score components?

2015-01-23 Thread Rico Sennrich
Matthias Huck mhuck@... writes: Hi, Is there any existing functionality to set only specific score components of a feature function as untuneable? Hi Matthias, There is the option --activate-features in mert-moses.pl, but I believe it only works for MERT (if at all), and it also had some

Re: [Moses-support] Moses-support post from hassaan84s@... requires approval

2015-01-08 Thread Rico Sennrich
Hieu Hoang hieuhoang@... writes: you mean the github version of mgiza?   https://github.com/moses-smt/mgiza Compared to what old version from where? I need to know what u ran and the corpus you ran. I'm not an expert on mgiza so my debugging is as good as yours in

Re: [Moses-support] how to compile with nplm library

2014-12-30 Thread Rico Sennrich
Xiaoqiang Feng feng.x.q.2006@... writes: Hi, nplm is one toolkit of neural probabilistic language model. This toolkit can be used in Moses for language model and bilingual LM(neural network joint model, ACL 2014). These two parts have been updated in github mosesdecoder. Hi, basic usage

Re: [Moses-support] Mgiza - lock contention?

2014-12-22 Thread Rico Sennrich
Marcin Junczys-Dowmunt junczys@... writes: will try to have a look what is going on there (I dread the code), but if anyone has some ideas, the same experience or wants to help that would be most welcome. It seems Model 1 is fine, and speed improves with a greater number of threads, but the

Re: [Moses-support] Train tree-to-tree model fail to generate right rules

2014-12-19 Thread Rico Sennrich
Steven Huang d98922047@... writes: Hi, I am trying to train an English to Chinese tree-to-tree model with manually generated corpus. The translation is unacceptable. It seems that the model doesn't know reordering at all. So I look into the rule-table, there is no useful rule in it (see the

Re: [Moses-support] Getting OOV

2014-12-18 Thread Rico Sennrich
Fatemeh Eskandari fatemeh.eskandari.69@... writes: I need to have list of oov words in text file, is there any way to get them in moses? Hi Fatemeh, the moses option '-output-unknowns filename' might be what you want. It's also possible to get the OOV words without translating the text;

Re: [Moses-support] How to train a tree-to-tree model?

2014-12-04 Thread Rico Sennrich
Steven Huang d98922047@... writes: It seems that the XML is not correctly parsed and is taken as plain text. Is there anything wrong with my training configuration or training corpus? Thanks a lot. Hi Steven, The Moses XML format isn't pure and still cares about white space. Each sentence

Re: [Moses-support] how to test whether tcmalloc is used?

2014-11-26 Thread Rico Sennrich
Li Xiang lixiang.ict@... writes: I compile Moses with tcmalloc. How can I test whether tcmalloc is used and evaluate the performance ? there's probably many ways, but here's three: at compile time, you will see the following message if tcmalloc is not enabled: Tip: install tcmalloc for

Re: [Moses-support] How to train a tree-based model?

2014-11-25 Thread Rico Sennrich
Steven Huang d98922047@... writes: The question is: 1. Can I use all the 3 factors when training tree-based model? If yes, how the parallel corpus should be like? The XML format shown in the MOSES tutorial seems not able to accept factors except surface.  I've successfully tested a toy

Re: [Moses-support] How to implement String to Tree SMT

2014-10-13 Thread Rico Sennrich
Asad A.Malik asad_12204@... writes: I currently want to develop String to Tree SMT. I've successfully developed Phrase Based SMT using MOSES for Urdu (source) and English (target). Now I want to develop String to Tree SMT for these two languages. I am still confused about how I will

Re: [Moses-support] perplexity scores

2014-10-08 Thread Rico Sennrich
koormoosh koormoosh@... writes: and then I query via: ./ngram -lm text.arpa -ppl query.txt Hi Kormoosh, not sure if that's the only problem, but ngram does not automatically use the order of the ARPA file, but defaults to 3. ./ngram -order 5 -lm text.arpa -ppl query.txt should get you closer

Re: [Moses-support] perplexity scores

2014-10-08 Thread Rico Sennrich
oh, you're also using different smoothing, and possibly different handling of unknown words. lmplz defaults to SRILM's '-interpolate -kndiscount -unk -gt3min 1 -gt4min 1 -gt5min 1' On 08/10/14 10:05, koormoosh wrote: Thanks. Now it's 15 score closer to the KenLM, but still the difference

[Moses-support] format change in phrase table halves (because of bug with hiero systems)

2014-03-02 Thread Rico Sennrich
Hi list, I swapped the score and alignment column of the phrase table halves in commit 01bc3c1. The format of the final phrase table is not affected. If you know of any side effects (such as downstream software that relies on the column order in the phrase table halves), please complain here.

Re: [Moses-support] --activate-features in mert-moses.perl not working?

2014-02-26 Thread Rico Sennrich
On 26.02.2014 07:24, moses-support-requ...@mit.edu wrote: Hi Hieu, Rico, this does not seem to be an issue with the ini-file. It actually works as well with stand-alone moses. The issue seems to be the mert-moses.pl script which switches off features that are not returned by the decoder

Re: [Moses-support] --activate-features in mert-moses.perl not working?

2014-02-24 Thread Rico Sennrich
Marcin Junczys-Dowmunt junczys@... writes: And with tuneable=false it seems the features are being ignored during decoding, I understand this should not be happening. I get much worse translation results with an ini-file that has tuneable=false for all features than with the same ini

Re: [Moses-support] --activate-features in mert-moses.perl not working?

2014-02-10 Thread Rico Sennrich
Marcin Junczys-Dowmunt junczys@... writes: Hi, it seems --activate-features=STRING is not working in mert-moses.perl. The script prints a message that the ignored features are not being used, but then optimizes them anyway. I can see that the enabled information in the feature data

Re: [Moses-support] help

2014-01-31 Thread Rico Sennrich
Leila Tavakoli leila.tavakoliii@... writes: Hi, I need to tune my English-Persian development set, and i use this command:  root at ubuntu:/home/msho/Desktop/Shafagh/moses-default/scripts/training# ./mert-moses.pl /mnt/hgfs/H/dev/en/en.txt /mnt/hgfs/H/dev/ref/

Re: [Moses-support] translation probabilities in a string-to-tree system

2014-01-27 Thread Rico Sennrich
Marion Weller wellermn@... writes: To my understanding, the target-side non-terminals are copied to the source-side string for technical reasons only in a string-to-tree system: shouldn't then source-side strings as in the example above be counted as one string (according to your [X]) instead

Re: [Moses-support] C++11

2014-01-16 Thread Rico Sennrich
On 15.01.2014 23:23, Jie Jiang wrote: Sorry Rico, my bad, I didn't turn on that switch. Now it compiled. However, c++11 was not specified in Jamroot, how did your compiler have that flag on? I added the switch -std=c++0x to Jamroot, which is synonymous with -std=c++11 for gcc (see commit

Re: [Moses-support] C++11

2014-01-15 Thread Rico Sennrich
Marcin Junczys-Dowmunt junczys@... writes: Revision d2d508184e35909aa5da901b81bb70f10f7794c7 breaks my compact reordering model, but at runtime and only if you do a clean build without any build artifacts from earlier compilations. It segfaults during loading in a weird low-level place.

Re: [Moses-support] C++11

2014-01-15 Thread Rico Sennrich
Rico Sennrich rico.sennrich@... writes: I just pushed a commit that uses a C++11 feature (initializer list). It should work with compilers that are no older than 5 years or so (gcc >= 4.4). I reverted the commits again; there were issues with gcc 4.6, and apparently some different problems

Re: [Moses-support] C++11

2014-01-15 Thread Rico Sennrich
On 15.01.2014 16:58, Jie Jiang wrote: Hi Rico: I think in the future it would be better to wrap up the c11 code with macros like: #if __cplusplus >= 201103L //c11 code here #else //old code here #endif or else it will fail others who are not using a compiler with c11
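The guard suggested in this message could look roughly like the sketch below. It is only an illustration: DefaultFeatureNames and the feature names are hypothetical and not taken from the Moses code base. The C++11 branch uses an initializer list, the fallback fills the vector by hand.

```cpp
#include <string>
#include <vector>

// Hypothetical helper (not part of Moses): returns a fixed list of names.
std::vector<std::string> DefaultFeatureNames() {
#if __cplusplus >= 201103L
  // C++11 or later: construct the vector from a brace-enclosed initializer list.
  return {"Distortion", "WordPenalty", "LM0"};
#else
  // Older standard: build the same vector element by element.
  std::vector<std::string> names;
  names.push_back("Distortion");
  names.push_back("WordPenalty");
  names.push_back("LM0");
  return names;
#endif
}
```

Both branches return the same three names, so callers do not need to know which one was compiled. One caveat, as far as I know: gcc releases before 4.7 do not set __cplusplus to 201103L even with -std=c++0x, so such compilers would silently take the fallback branch.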

[Moses-support] C++11

2014-01-14 Thread Rico Sennrich
Hi list, I just pushed a commit that uses a C++11 feature (initializer list). It should work with compilers that are no older than 5 years or so (gcc >= 4.4). If you have trouble compiling it (because you're using an older gcc version or another compiler), please speak up. This is basically a test

Re: [Moses-support] problem with weighting translation models using client_multimodel.py

2014-01-02 Thread Rico Sennrich
is Ubuntu 12.04.2 LTS Best regards! 25.12.2013, 19:07, Rico Sennrich rico.sennr...@gmx.ch: I can't immediately see anything wrong with your config. Can you tell me which version (git commit) of Moses you're using, and if there is an error message on the side of the moses server? It's

Re: [Moses-support] problem with weighting translation models using client_multimodel.py

2013-12-23 Thread Rico Sennrich
Калинин Александр verbalab@... writes: Hi, everyone! I have a problem with weighting two translation models using client_multimodel.py. When I call the function like this (with no weights): translate(['i have a dream'],server) it's OK: Moses responds with a translation via xmlrpc. But

Re: [Moses-support] merging two translation models

2013-12-05 Thread Rico Sennrich
Adding/removing models during decoding is not currently supported. The code for loading feature functions (including translation models) has recently been refactored, and I don't know how easy it would be to add such functionality in this new framework. On 04.12.2013 18:00, Калинин Александр

Re: [Moses-support] RELEASE 2.0

2013-11-18 Thread Rico Sennrich
Tom Hoar tahoar@... writes: Hi Hieu, A while back, I contributed a Python example client for mosesserver but I don't see it in the repository. I attached it again here for inclusion with the original Java example client. Tom Hi Tom, there's an example client in
