Re: [Moses-support] tuning takes tooooo long:(

2014-02-14 Thread Arezki Sadoune
Hello Amir
I think your tuning will go faster if you run MERT multi-threaded:
/home/mert-moses.pl --threads 4
Of course, specify 8 instead of 4 if your laptop has eight cores.
Best regards



On Friday, 14 February 2014 at 8:27, amir haghighi amir.haghighi...@gmail.com wrote:

Hello

I have a corpus with 400,000 sentences for training, 1,000 sentences for tuning,
and 100,000 sentences for testing. On my old laptop, EMS had still not finished
after 3 days.
I bought a new laptop (Core i7, 2.40 GHz CPU, 8 GB RAM), but I still can't get
EMS to finish! It has been in the tuning step for 3 days and is not done yet.
Could it be stuck in an endless loop?

How can I check its progress?

regards
Amir
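
One way to check whether tuning is still making progress, rather than looping, is to count the MERT iterations completed so far. The sketch below assumes that mert-moses.pl writes one runN.moses.ini file per finished iteration into its working directory, and that EMS keeps that directory under tuning/tmp.N. Both the path and the helper name count_mert_iterations are assumptions to adapt to your setup.

```shell
# Hypothetical location of the MERT working directory; adjust to your experiment.
TUNING_DIR="${TUNING_DIR:-tuning/tmp.1}"

# Count files named runN.moses.ini directly inside the given directory.
count_mert_iterations() {
  find "$1" -maxdepth 1 -name 'run*.moses.ini' 2>/dev/null | wc -l | tr -d ' '
}

echo "MERT iterations finished so far: $(count_mert_iterations "$TUNING_DIR")"
```

If the count is higher when you run this again later, tuning is advancing; mert-moses.pl should also stop on its own once it reaches its maximum number of iterations.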

___
Moses-support mailing list
Moses-support@mit.edu
http://mailman.mit.edu/mailman/listinfo/moses-support


Re: [Moses-support] tuning takes tooooo long:(

2014-02-14 Thread amir haghighi
Thank you, Arezki and Yohit.

I don't know how to change the multi-threading setting in the EMS config file.





Re: [Moses-support] tuning takes tooooo long:(

2014-02-14 Thread Barry Haddow

Hi Amir

You can add

decoder-settings = -threads 4

to your TUNING stanza.

Also try

filter-settings = -MinScore 2:0.0001

for more aggressive filtering.

Running tuning on a laptop though is always going to be slow,

cheers - Barry
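
To pick the thread count for the setting above, you can ask the system how many cores are online. This is a small sketch: getconf _NPROCESSORS_ONLN is assumed to be available (it is on typical Linux and macOS systems), and using every core is only a starting point, not a tuned value.

```shell
# Number of online CPU cores; fall back to 1 if it cannot be determined.
cores=$(getconf _NPROCESSORS_ONLN 2>/dev/null || echo 1)

# Print the two lines to paste into the TUNING stanza of the EMS config.
printf 'decoder-settings = -threads %s\n' "$cores"
printf 'filter-settings = -MinScore 2:0.0001\n'
```

On a machine with little RAM it may be worth starting with fewer threads than cores and watching memory use.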



[Moses-support] Fwd: moses decoder build failure

2014-02-14 Thread Viktor Pless
Hi all,

I can't compile Moses with bjam: the build always ends with a failure
message, and the moses executable (the decoder itself) is missing from the
resulting bin folder. I can't tell whether any other file is missing from
bin; the rest looks OK to me.

command: sudo ./bjam -j8

error msg:

...failed gcc.link
moses-cmd/bin/gcc-4.8/release/debug-symbols-on/link-static/threading-multi/moses...
gcc.compile.c++
mert/bin/gcc-4.8/release/debug-symbols-on/link-static/threading-multi/TER/tercalc.o
mert/TER/tercalc.cpp: In member function 'TERCpp::terAlignment
TERCpp::terCalc::MinEditDist(std::vector<std::basic_string<char> >,
std::vector<std::basic_string<char> >, std::vector<std::vector<int> >)':
mert/TER/tercalc.cpp:451:7: warning: variable 'last_peak' set but not used
[-Wunused-but-set-variable]
   int last_peak = 0;
   ^

I use libboost 1.49.

thank you in advance,
Viktor
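
The compiler lines quoted above are only a warning from a different target; the message explaining why gcc.link failed normally appears earlier in the build output. A minimal sketch for finding it, assuming the output was saved to a file (build.log is a hypothetical name, and the patterns are just common GCC and linker phrasings, not an exhaustive list):

```shell
# Print up to five lines of a build log that look like real errors,
# together with their line numbers in the log.
first_errors() {
  grep -n -E 'error:|undefined reference|cannot find -l|ld returned' "$1" | head -n 5
}

# Typical use (commented out because it needs a Moses checkout):
#   ./bjam -j8 2>&1 | tee build.log
#   first_errors build.log
```

Posting the lines this prints, rather than the last lines of the build, usually makes the failure much easier to diagnose.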


[Moses-support] Scoring of human-translated sentences for Computer Aided Proofing

2014-02-14 Thread Julian Myerscough
Hi folks,

I am interested in using an existing translation model/language model to 
score (human) translated text on a sentence by sentence basis.

Is it possible to do this with moses?

As an example, under normal use moses might output the following for 
das ist ein kleines haus
**
BEST TRANSLATION: this is a small house [1] [total=-28.923]
**

What I would like to do is provide the already translated pair (e.g. das 
ist ein kleines haus / this is a small house) and see the log 
probability of that translation, using the usual scoring 
components (phrase translation, language model, distortion model, word 
penalty).

Thanks in advance for your thoughts.

Julian


---

Julian Myerscough
Quality Assurance Manager - Languages for Business Ltd

Languages for Business Ltd
PO Box 5194, Cardiff CF5 9DZ UK
Tel: +44 (0)29 2044 4400  Fax: +44 (0)29 2044 4401
jul...@lfbtranslations.co.uk www.LfBtranslations.co.uk

Office hours:
9:00 - 17:00 UTC/GMT   4:00 - 12:00 EST


[Moses-support] Get plain text from the output of a translation

2014-02-14 Thread Per Tunedal

Hi,
Following the baseline instructions, I tokenized and recased the text
before training, and consequently I get similarly processed output when
translating.

Are there any scripts available to turn the output back into normal text?
In particular, the HTML encoding of some characters, e.g. the French
é, è and ê, makes reading uncomfortable. A production system would have
to produce readable output anyway.

What's the standard workflow?

Yours,
Per Tunedal



Re: [Moses-support] tuning takes tooooo long:(

2014-02-14 Thread amir haghighi
Thank you Barry,

I use IRSTLM to build the language model. Can I use multi-threading in
decoder-settings?
I get an "IRST LM is not threadsafe" error.

I want to keep using IRSTLM. Is there any other way to speed up the tuning step?

Regards




Re: [Moses-support] Hierarchical training

2014-02-14 Thread Hieu Hoang
Does your input file contain anything?
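
A quick way to answer this is to test the file that is redirected into the decoder before starting it. A minimal sketch (check_input is a hypothetical helper; the file name in is taken from the command quoted below):

```shell
# Report whether a decoder input file exists and is non-empty.
check_input() {
  if [ -s "$1" ]; then
    echo "$1: $(wc -l < "$1" | tr -d ' ') line(s)"
  else
    echo "$1: missing or empty"
  fi
}

check_input in
```

If the file is reported as missing or empty, the decoder has nothing to translate and will exit right after loading its models.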


On 13 February 2014 15:18, Jean D'Ennris jean.derr...@gmail.com wrote:

 Dear all,

 I'm currently experimenting with the syntactic model. I've successfully trained
 a small Moses model using the command below:

 nohup /home/mosesdecoder/scripts/training/train-model.perl --hierarchical
 --extract-options=--MaxSpan 15 --score-options=--GoodTuring -root-dir
 /home/Massi/oldmoses -corpus  /home/Massi/proj-syndicate.1000.0-0 -f de -e
 en  -lm 0:3:/home/Massi/LM.sur.en.blm -external-bin-dir
 /root/external-bin-dir/  -mgiza -mgiza-cpus 24  training.out 

 The rule table was generated, but when I run:

 mosesdecoder/bin/moses_chart -f moses.ini < in > out.stt

 I get the messages below,

 loadtxt_ram()
 8-grams: reading 0 entries
 done level 8
 2-grams: reading 0 entries
 done level 2
 1-grams: reading 0 entries
 done level 1
 done
 starting to use OOV words [unk]
 OOV code is 0
 OOV code is 0
 IRST: m_unknownId=0
 ScoreProducer: LM start: 2 end: 3
 Finished loading LanguageModels : [0.051] seconds
 Start loading PhraseTable /home/Massi/oldmoses/model/rule-table.gz :
 [0.051] seconds
 filePath: /home/Massi/oldmoses/model/rule-table.gz
 ScoreProducer: PhraseModel start: 3 end: 8
 Finished loading phrase tables : [0.051] seconds
 max-chart-span: 20
 Start loading phrase table from /home/Massi/oldmoses/model/rule-table.gz :
 [0.051] seconds
 Start loading text SCFG phrase table. Moses  format : [0.052] seconds
 Reading /home/Massi/oldmoses/model/rule-table.gz

 5---10---15---20---25---30---35---40---45---50---55---60---65---70---75---80---85---90---95--100

 
 Finished loading phrase tables : [1.489] seconds
 IO from STDOUT/STDIN
 Created input-output object : [1.489] seconds
 End. : [1.489] seconds
 user 1.416
 sys 0.072
 VmPeak:   324376 kB
 VmRSS:    167760 kB
 reset mmap

 and the output file is empty

 Many thanks

 Jean E.





-- 
Hieu Hoang
Research Associate
University of Edinburgh
http://www.hoang.co.uk/hieu


Re: [Moses-support] Moses model trained on irstlm Ubuntu not running on Fedora 8/14

2014-02-14 Thread Hieu Hoang
It sounds like the two Moses binaries aren't linked against the same
version of IRSTLM.

I recommend using IRSTLM v. 5.80.03. The IRSTLM version in the SourceForge
repository has some bugs.

Also, when you recompile Moses with the new IRSTLM, add
   -a
to the bjam command so that it compiles everything from scratch.



On 12 February 2014 20:30, Rishabh Srivastava ris@gmail.com wrote:

 Hi,

 I built a translation model using Moses on Ubuntu which runs perfectly on
 other Ubuntu systems (with Moses), but when I tried to run the same model
 on Fedora 8/14, I got this error:
 Binary file has version 5 but this implementation expects version 1 so
 you'll have to rebuild your binary LM from the ARPA.

 I tried to rebuild the model on my Ubuntu system with kenlm but it again
 gives an error on tuning with mert.

 Please help me out.

 PS. I have mosesdecoder 2.1 on both Ubuntu and Fedora 14.

 Thanks.
 Rishabh Srivastava








Re: [Moses-support] tuning takes tooooo long:(

2014-02-14 Thread Barry Haddow

Hi Amir

Even if you use IRSTLM to build the language model, you can still use 
KenLM for decoding. Make sure you create an arpa file with IRSTLM, then 
use build_binary to binarise it so that it loads quickly with KenLM. 
Then you can use multi-threaded decoding,


cheers - Barry
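
The two steps described above might look like the sketch below. It is modelled on the commands in the Moses baseline instructions: compile-lm --text=yes to write an ARPA file, then build_binary to binarise it. The install locations and the file names lm.en.gz, lm.arpa.en and lm.blm.en are placeholders, and the exact compile-lm option syntax may differ between IRSTLM versions, so check its help output first. With DRY_RUN=1 the commands are only printed.

```shell
# Assumed install locations; change these to match your machine.
IRSTLM_BIN="${IRSTLM_BIN:-$HOME/irstlm/bin}"
MOSES_BIN="${MOSES_BIN:-$HOME/mosesdecoder/bin}"
DRY_RUN="${DRY_RUN:-1}"   # set DRY_RUN=0 to really execute the commands

# Either print a command (dry run) or execute it.
run() {
  if [ "$DRY_RUN" = "1" ]; then echo "$*"; else "$@"; fi
}

# $1 = IRSTLM model, $2 = ARPA file to create, $3 = KenLM binary to create
irstlm_to_kenlm() {
  run "$IRSTLM_BIN/compile-lm" --text=yes "$1" "$2" &&
  run "$MOSES_BIN/build_binary" "$2" "$3"
}

irstlm_to_kenlm lm.en.gz lm.arpa.en lm.blm.en
```

After that, the language model entry in your configuration has to point at the binarised file and select KenLM rather than IRSTLM.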



Re: [Moses-support] Get plain text from the output of a translation

2014-02-14 Thread Matthias Huck
Hi Per,

The standard workflow is to run a postprocessing step on the output,
e.g. with scripts/tokenizer/detokenizer.perl in Moses.

Usage: ./detokenizer.perl (-l [en|fr|it|cs|...]) < tokenizedfile > detokenizedfile
Options:
  -u ... uppercase the first char in the final sentence.
  -q ... don't report detokenizer revision.
  -b ... disable Perl buffering.
  -penn  ... assume input is tokenized as per tokenizer.perl's -penn option.


If you are using EMS, you might want to integrate this into your
pipeline in the following way:

[EVALUATION]
detokenizer = $moses-script-dir/tokenizer/detokenizer.perl -l $output-extension

Cheers,
Matthias

