Re: [Moses-support] tuning takes tooooo long:(
Hello Amir,

I think your tuning process will go faster if you use multi-threaded MERT:

/home/mert-moses.pl --threads 4

Of course, you should specify 8 instead of 4 if your laptop has eight cores.

Best regards

On Friday, 14 February 2014 at 8:27, amir haghighi amir.haghighi...@gmail.com wrote:

> Hello,
>
> I have a corpus with 400,000 sentences for training, 1,000 sentences for tuning and 100,000 sentences for testing. With my old laptop, EMS had still not finished on this corpus after 3 days. I have bought a new laptop (Core i7, 2.40 GHz CPU, 8 GB RAM), but I still can't get EMS to finish: it has been in the tuning step for 3 days and is not done yet.
>
> Is it possible that it is stuck in an endless loop? How can I check its progress?
>
> Regards,
> Amir

___
Moses-support mailing list
Moses-support@mit.edu
http://mailman.mit.edu/mailman/listinfo/moses-support
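On the question of how to check progress, here is a minimal sketch. It assumes the tuning working directory follows the usual mert-moses.pl layout, in which each completed iteration leaves behind a file named runN.moses.ini. The helper name count_mert_runs and the WORKDIR path are hypothetical; point WORKDIR at your own tuning directory.

```shell
#!/bin/sh
# Count completed MERT iterations in a tuning working directory.
# Assumption: each finished iteration writes a file named runN.moses.ini
# there; adjust the pattern if your Moses version names them differently.
count_mert_runs() {
    dir=$1
    n=0
    for f in "$dir"/run*.moses.ini; do
        # The glob stays literal when nothing matches, so test for existence.
        if [ -e "$f" ]; then
            n=$((n + 1))
        fi
    done
    echo "$n"
}

# Example call; WORKDIR is a placeholder for your own tuning directory.
WORKDIR=${WORKDIR:-./tuning/tmp.1}
echo "completed MERT iterations: $(count_mert_runs "$WORKDIR")"
```

If the count rises between two calls made some time apart, tuning is progressing rather than looping.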
Re: [Moses-support] tuning takes tooooo long:(
Thank you Arezki and Yohit,

I don't know how to change the multi-threading setting in the EMS config file.

On Fri, Feb 14, 2014 at 12:36 AM, Arezki Sadoune arezkisado...@yahoo.fr wrote:

> Hello Amir,
>
> I think your tuning process will go faster if you use multi-threaded MERT:
>
> /home/mert-moses.pl --threads 4
>
> Of course, you should specify 8 instead of 4 if your laptop has eight cores.
>
> Best regards
Re: [Moses-support] tuning takes tooooo long:(
Hi Amir,

You can add

decoder-settings = -threads 4

to your TUNING stanza. Also try

filter-settings = -MinScore 2:0.0001

for more aggressive filtering.

Running tuning on a laptop, though, is always going to be slow.

cheers - Barry

On 14/02/14 09:26, amir haghighi wrote:

> Thank you Arezki and Yohit,
>
> I don't know how to change the multi-threading setting in the EMS config file.
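A small sketch of how the two settings above might be generated for a given machine. It assumes that EMS config values are written in double quotes, as in the example configs shipped with Moses, and that the thread count should match the number of cores. The helper name tuning_settings and the THREADS variable are hypothetical.

```shell
#!/bin/sh
# Print suggested lines for the [TUNING] stanza of an EMS config file.
# Assumption: values are written in double quotes, as in the example
# configs shipped with Moses; THREADS defaults to the detected core count.
tuning_settings() {
    threads=$1
    printf 'decoder-settings = "-threads %s"\n' "$threads"
    printf 'filter-settings = "-MinScore 2:0.0001"\n'
}

# nproc is part of GNU coreutils; fall back to 4 where it is unavailable.
THREADS=${THREADS:-$(nproc 2>/dev/null || echo 4)}
tuning_settings "$THREADS"
```

The printed lines are meant to be pasted under the [TUNING] heading of the config file by hand; the script does not edit the config.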
[Moses-support] Fwd: moses decoder build failure
Hi all,

I can't compile Moses with bjam as I always get a failed warning, and there is no moses executable in the resulting bin folder (the decoder itself). I can't tell if any other file is missing from bin; it seems to be OK to me.

command:

sudo ./bjam -j8

error msg:

...failed gcc.link moses-cmd/bin/gcc-4.8/release/debug-symbols-on/link-static/threading-multi/moses...
gcc.compile.c++ mert/bin/gcc-4.8/release/debug-symbols-on/link-static/threading-multi/TER/tercalc.o
mert/TER/tercalc.cpp: In member function 'TERCpp::terAlignment TERCpp::terCalc::MinEditDist(std::vector<std::basic_string<char> >, std::vector<std::basic_string<char> >, std::vector<std::vector<int> >)':
mert/TER/tercalc.cpp:451:7: warning: variable 'last_peak' set but not used [-Wunused-but-set-variable]
int last_peak = 0;
^

(Please ignore encoding errors)

I use libboost 1.49.

Thank you in advance,
Viktor
[Moses-support] Scoring of human-translated sentences for Computer Aided Proofing
Hi folks,

I am interested in using an existing translation model/language model to score (human) translated text on a sentence-by-sentence basis. Is it possible to do this with Moses?

As an example, under normal use Moses might output the following for "das ist ein kleines haus":

** BEST TRANSLATION: this is a small house [1] [total=-28.923] **

What I would like to do is provide the already translated pair (e.g. das ist ein kleines haus / this is a small house) and see what the log probability of that translation is, using the usual scoring probabilities (phrase translation / language model / distortion model / word penalty).

Thanks in advance for your thoughts.

Julian

---
Julian Myerscough
Quality Assurance Manager - Languages for Business Ltd
Languages for Business Ltd
PO Box 5194, Cardiff CF5 9DZ UK
Tel: +44 (0)29 2044 4400
Fax: +44 (0)29 2044 4401
jul...@lfbtranslations.co.uk
www.LfBtranslations.co.uk
Office hours: 9:00 - 17:00 UTC/GMT, 4:00 - 12:00 EST
[Moses-support] Get plain text from the output of a translation
Hi,

Following the baseline instructions, I've tokenized and recased the text before training, and consequently I get similar output when translating.

Are there any scripts available to get back normal text from the output? Especially the HTML encoding of some characters, e.g. the French é, è and ê, makes reading uncomfortable.

A production system would have to produce readable output anyway. What's the standard workflow?

Yours,
Per Tunedal
Re: [Moses-support] tuning takes tooooo long:(
Thank you Barry,

I use IRSTLM to build the language model. Can I use multi-threading in decoder-settings? I get an "IRST LM is not threadsafe" error. I want to use IRSTLM; is there any other way to speed up the tuning step?

Regards

On Fri, Feb 14, 2014 at 1:53 AM, Barry Haddow bhad...@staffmail.ed.ac.uk wrote:

> Hi Amir,
>
> You can add
>
> decoder-settings = -threads 4
>
> to your TUNING stanza. Also try
>
> filter-settings = -MinScore 2:0.0001
>
> for more aggressive filtering.
>
> Running tuning on a laptop, though, is always going to be slow.
>
> cheers - Barry
Re: [Moses-support] Hierarchical training
Does your input file contain anything?

On 13 February 2014 15:18, Jean D'Ennris jean.derr...@gmail.com wrote:

> Dear all,
>
> I'm currently experimenting with the syntactic model. I've successfully trained a small Moses model using the command below:
>
> nohup /home/mosesdecoder/scripts/training/train-model.perl --hierarchical --extract-options=--MaxSpan 15 --score-options=--GoodTuring -root-dir /home/Massi/oldmoses -corpus /home/Massi/proj-syndicate.1000.0-0 -f de -e en -lm 0:3:/home/Massi/LM.sur.en.blm -external-bin-dir /root/external-bin-dir/ -mgiza -mgiza-cpus 24 training.out
>
> The rule table has been generated, but when I run:
>
> mosesdecoder/bin/moses_chart -f moses.ini < in > out.stt
>
> I get the messages below:
>
> loadtxt_ram()
> 8-grams: reading 0 entries
> done level 8
> 2-grams: reading 0 entries
> done level 2
> 1-grams: reading 0 entries
> done level 1
> done
> starting to use OOV words [unk]
> OOV code is 0
> OOV code is 0
> IRST: m_unknownId=0
> ScoreProducer: LM start: 2 end: 3
> Finished loading LanguageModels : [0.051] seconds
> Start loading PhraseTable /home/Massi/oldmoses/model/rule-table.gz : [0.051] seconds
> filePath: /home/Massi/oldmoses/model/rule-table.gz
> ScoreProducer: PhraseModel start: 3 end: 8
> Finished loading phrase tables : [0.051] seconds
> max-chart-span: 20
> Start loading phrase table from /home/Massi/oldmoses/model/rule-table.gz : [0.051] seconds
> Start loading text SCFG phrase table. Moses format : [0.052] seconds
> Reading /home/Massi/oldmoses/model/rule-table.gz
> 5---10---15---20---25---30---35---40---45---50---55---60---65---70---75---80---85---90---95--100
> Finished loading phrase tables : [1.489] seconds
> IO from STDOUT/STDIN
> Created input-output object : [1.489] seconds
> End. : [1.489] seconds
> user 1.416 sys 0.072 VmPeak: 324376 kB VmRSS: 167760 kB
> reset mmap
>
> and the output file is empty.
>
> Many thanks,
> Jean E.

--
Hieu Hoang
Research Associate
University of Edinburgh
http://www.hoang.co.uk/hieu
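The check suggested in the reply can be sketched as follows. This is a minimal example, assuming the decoder reads its input from a file named in, as in the quoted command. check_input is a hypothetical helper and not part of Moses.

```shell
#!/bin/sh
# Report whether a decoder input file exists and contains at least one line.
# check_input is a hypothetical helper, not part of Moses.
check_input() {
    file=$1
    if [ ! -s "$file" ]; then
        echo "empty or missing: $file"
        return 1
    fi
    # wc -l counts newline characters, i.e. complete input sentences.
    lines=$(wc -l < "$file" | tr -d ' ')
    echo "$file: $lines line(s)"
}

# Placeholder file name taken from the command in the message above.
INPUT=${INPUT:-in}
check_input "$INPUT" || echo "fix the input before running moses_chart"
```

An empty input file would explain a decoder run that loads its models, ends almost immediately and writes nothing.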
Re: [Moses-support] Moses model trained on irstlm Ubuntu not running on Fedora 8/14
It sounds like the 2 different Moses binaries aren't linked to the same version of IRSTLM. I recommend using IRSTLM v. 5.80.03. The IRSTLM version in the SourceForge repository has some bugs.

Also, when you recompile Moses with the new IRSTLM, add -a to the bjam command so that it compiles everything from the beginning.

On 12 February 2014 20:30, Rishabh Srivastava ris@gmail.com wrote:

> Hi,
>
> I built a translation model using Moses on Ubuntu which runs perfectly on other Ubuntu systems (with Moses), but when I tried to run the same model on Fedora 8/14, I got this error:
>
> Binary file has version 5 but this implementation expects version 1 so you'll have to rebuild your binary LM from the ARPA.
>
> I tried to rebuild the model on my Ubuntu system with KenLM, but it again gives an error on tuning with MERT. Please help me out.
>
> PS. I have mosesdecoder 2.1 on both Ubuntu and Fedora 14.
>
> Thanks.
> Rishabh Srivastava

--
Hieu Hoang
Research Associate
University of Edinburgh
http://www.hoang.co.uk/hieu
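A sketch of the full rebuild described above, assuming Moses is pointed at IRSTLM through bjam's --with-irstlm option. The IRSTLM_DIR path and the helper name rebuild_command are placeholders. The command is printed rather than run so that it can be reviewed first.

```shell
#!/bin/sh
# Assemble a full-rebuild bjam command against a given IRSTLM install.
# Assumptions: IRSTLM_DIR is a placeholder path, and the build uses
# bjam's --with-irstlm option; -a forces everything to be rebuilt.
rebuild_command() {
    irstlm_dir=$1
    jobs=$2
    echo "./bjam --with-irstlm=$irstlm_dir -a -j$jobs"
}

IRSTLM_DIR=${IRSTLM_DIR:-/path/to/irstlm-5.80.03}
# Printed rather than executed, so it can be reviewed first;
# run it from the mosesdecoder source directory.
rebuild_command "$IRSTLM_DIR" 8
```

Running the same command on both the Ubuntu and the Fedora machine, against the same IRSTLM version, is the point of the advice.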
Re: [Moses-support] tuning takes tooooo long:(
Hi Amir,

Even if you use IRSTLM to build the language model, you can still use KenLM for decoding. Make sure you create an ARPA file with IRSTLM, then use build_binary to binarise it so that it loads quickly with KenLM. Then you can use multi-threaded decoding.

cheers - Barry

On 14/02/14 13:01, amir haghighi wrote:

> Thank you Barry,
>
> I use IRSTLM to build the language model. Can I use multi-threading in decoder-settings? I get an "IRST LM is not threadsafe" error. I want to use IRSTLM; is there any other way to speed up the tuning step?
>
> Regards
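The two steps described above can be sketched like this. It assumes IRSTLM's compile-lm accepts --text=yes to write a plain ARPA file, as in the Moses baseline instructions, and that build_binary sits in the Moses bin directory. All paths and file names are placeholders, and the helper name lm_conversion_steps is hypothetical.

```shell
#!/bin/sh
# Print the two conversion steps: IRSTLM model -> ARPA text -> KenLM binary.
# Assumptions: compile-lm comes from IRSTLM, build_binary from the Moses
# bin directory; all paths and file names below are placeholders.
lm_conversion_steps() {
    irstlm_bin=$1
    moses_bin=$2
    lm=$3
    echo "$irstlm_bin/compile-lm --text=yes $lm.lm.gz $lm.arpa"
    echo "$moses_bin/build_binary $lm.arpa $lm.blm"
}

lm_conversion_steps "$HOME/irstlm/bin" "$HOME/mosesdecoder/bin" corpus.en
```

After conversion, the language model entry in the configuration has to point at the binarised file and select KenLM instead of IRSTLM before the -threads setting will work.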
Re: [Moses-support] Get plain text from the output of a translation
Hi Per,

The standard workflow is to run a postprocessing step on the output, e.g. with scripts/tokenizer/detokenizer.perl in Moses.

Usage ./detokenizer.perl (-l [en|fr|it|cs|...]) < tokenizedfile > detokenizedfile
Options:
  -u ... uppercase the first char in the final sentence.
  -q ... don't report detokenizer revision.
  -b ... disable Perl buffering.
  -penn ... assume input is tokenized as per tokenizer.perl's -penn option.

If you are using EMS, you might want to integrate this into your pipeline in the following way:

[EVALUATION]
detokenizer = $moses-script-dir/tokenizer/detokenizer.perl -l $output-extension

Cheers,
Matthias

On Fri, 2014-02-14 at 13:14 +0100, Per Tunedal wrote:

> Hi,
>
> Following the baseline instructions, I've tokenized and recased the text before training, and consequently I get similar output when translating.
>
> Are there any scripts available to get back normal text from the output? Especially the HTML encoding of some characters, e.g. the French é, è and ê, makes reading uncomfortable.
>
> A production system would have to produce readable output anyway. What's the standard workflow?
>
> Yours,
> Per Tunedal

--
The University of Edinburgh is a charitable body, registered in Scotland, with registration number SC005336.
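A minimal wrapper around the detokenizer call described above, for use outside EMS. It assumes MOSES_SCRIPTS points at the scripts directory of a Moses checkout, which is a placeholder path. The function name detokenize is hypothetical. The -q and -l options are the ones listed in the usage text.

```shell
#!/bin/sh
# Detokenize decoder output if the Moses script is available.
# Assumption: MOSES_SCRIPTS points at mosesdecoder/scripts (placeholder).
MOSES_SCRIPTS=${MOSES_SCRIPTS:-$HOME/mosesdecoder/scripts}
DETOK="$MOSES_SCRIPTS/tokenizer/detokenizer.perl"

# Reads tokenized text on stdin and writes detokenized text to stdout.
detokenize() {
    lang=$1
    if [ -f "$DETOK" ]; then
        perl "$DETOK" -q -l "$lang"
    else
        echo "detokenizer not found at $DETOK" >&2
        return 1
    fi
}

# Example: echo "c' est une petite maison ." | detokenize fr
```

Wrapping the call in a function keeps the language code as the only argument, so the same helper can be reused for each output language.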