Re: [Moses-support] [Moses-developers] Generation models with Mmsapt

2015-09-01 Thread Hieu Hoang

It should work. The function
  EvaluateInIsolation()
in the LM is there for optimisation reasons. E.g., if the target phrase is
'a b c d' and the LM is a trigram model, the trigrams 'a b c' and 'b c d' can
be precalculated in EvaluateInIsolation().
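
A minimal Python sketch of the idea, not the Moses implementation: it only lists the full-order n-grams that lie entirely inside a target phrase and could therefore be scored without knowing the surrounding hypothesis. The function name is made up for illustration, and the shorter n-grams at the start of the phrase, which need the preceding context, are deliberately left out.

```python
def ngrams_scorable_in_isolation(phrase, order):
    """Return the full-order n-grams that lie entirely inside the phrase.

    Hypothetical helper for illustration only; it does not call any LM.
    """
    words = phrase.split()
    # An n-gram starting at position i is complete if i + order <= len(words).
    return [tuple(words[i:i + order]) for i in range(len(words) - order + 1)]


# The example from the message: a trigram LM and the phrase 'a b c d'.
print(ngrams_scorable_in_isolation("a b c d", 3))
```

For a phrase shorter than the LM order the list is empty, i.e. nothing can be finalised before the phrase is placed in context.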


Implementing a pt for factors requires setting up some variables, which 
may not have happened yet in Mmsapt. If you can send me a small example 
model, I'll see what I can do.


On 01/09/2015 02:11, Ulrich Germann wrote:

Hi Michael,

I have no experience with factored models, so I'm speculating here to 
some degree. The reason the phrase table calls EvaluateInIsolation is 
because all "isolated" phrase scores are considered when pruning. In 
my opinion pruning should not happen within the phrase tables (for 
exactly the reason that it does not allow feature functions to be 
agnostic about other feature functions) but by whatever object calls 
all the phrase tables and does the generation. However, for software 
legacy reasons, that's the way it is right now, and I'm not likely to 
address this issue any time soon myself. The most reasonable fix for 
this, in my opinion, is to move pruning to where it belongs: after all 
the factor generation stuff.


Hieu is probably still the person with the best understanding of how 
factored phrase table entry generation works, so maybe he can chime in 
on this ...


Cheers - Uli


On Mon, Aug 31, 2015 at 11:29 PM, Michael Denkowski 
<michael.j.denkow...@gmail.com> wrote:


Hi Ulrich,

I was looking into using a class-based LM with your dynamic phrase
table via generation models.  I translate factor 0 to 0 with the
Mmsapt, then generate target factor 1 (word class) with a GM.  The
class-based LM operates on factor 1.

I'm hitting a segfault on what appears to be an
order-of-operations issue with the PT and LM.  In mmsapt.cpp:578,
Mmsapt::mkTPhrase makes a call to tp->EvaluateInIsolation.  This
calls all of the models, including the LMs.  The class LM tries to
score factor 1, which doesn't exist yet (since generation happens
after translation), and it dies.  By nature, other phrase tables
don't have this issue since they can just pull up pre-computed scores.
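
The order-of-operations problem can be pictured with a toy model in Python. This is only a sketch under stated assumptions: words are plain dictionaries mapping factor index to string, the function names and word classes are invented, and a missing factor shows up as a KeyError here rather than as the segfault seen in the decoder.

```python
def translate(surface_words):
    """Toy translation step: each target word carries only factor 0."""
    return [{0: w} for w in surface_words]


def generate(phrase, word_classes, factor=1):
    """Toy generation step: add the word-class factor to every word."""
    return [{**word, factor: word_classes[word[0]]} for word in phrase]


def has_factor(phrase, factor):
    """True if every word in the phrase already has the given factor."""
    return all(factor in word for word in phrase)


def lm_tokens(phrase, factor):
    """Tokens an LM on `factor` would read; raises KeyError if absent."""
    return [word[factor] for word in phrase]


classes = {"the": "DET", "house": "NOUN"}  # made-up word classes
phrase = translate(["the", "house"])
print(has_factor(phrase, 1))   # factor 1 has not been generated yet
phrase = generate(phrase, classes)
print(lm_tokens(phrase, 1))    # safe only after the generation step
```

A guard like has_factor() before scoring is one conceivable workaround in such a toy setting; whether that is the right fix in the decoder is exactly the question raised here.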

Is scoring with all of the models here a strategic choice to get
better performance or would it be sufficient to just score with
the PT features?  Thanks!

--Michael




--
Ulrich Germann
Senior Researcher
School of Informatics
University of Edinburgh


___
Moses-developers mailing list
moses-develop...@mit.edu
http://mailman.mit.edu/mailman/listinfo/moses-developers


--
Hieu Hoang
Researcher
New York University, Abu Dhabi
http://www.hoang.co.uk/hieu

___
Moses-support mailing list
Moses-support@mit.edu
http://mailman.mit.edu/mailman/listinfo/moses-support


Re: [Moses-support] g++: error: unrecognized command line option '-no-cpp-precomp'

2015-09-01 Thread Jorg Tiedemann

This is kind of frustrating … so, the recommended way is to use Apple's clang 
and to build boost from source, is that correct?
I thought I could pull gcc and boost out of MacPorts (as I used to do) and they 
would understand each other, but this does not seem to work. Why not?

Well, thanks anyway. I will try with a fresh boost build ...
Jörg




> On 01 Sep 2015, at 15:21, Hieu Hoang wrote:
> 
> My advice on OSX is: don't install GCC. Clang is the ordained compiler now; 
> with GCC you'll be fighting Apple every step of the way. Don't think different!
> 
> Hieu Hoang
> Sent while bumping into things
> 
> On 31 Aug 2015 5:14 pm, "Jorg Tiedemann" wrote:
> 
> Well, I have /opt/local/ search paths in various environment variables to get 
> MacPorts to work.
> I deleted all these paths and tried again but I still get the same problem.
> 
> I am confused. And why is gcc not working anymore when installed via 
> MacPorts? I also installed boost with MacPorts. Is that a problem as well?
> 
> I also have some problems with kenlm, but part of it compiles and links fine. 
> build_binary and query seem to compile fine but lmplz does not link because 
> of some undefined symbols:
> Undefined symbols for architecture x86_64:
>   "boost::program_options::value_semantic_codecvt_helper<char>::parse(boost::any&,
>   std::vector<std::basic_string<char, std::char_traits<char>, std::allocator<char> >,
>   std::allocator<std::basic_string<char, std::char_traits<char>, std::allocator<char> > > > const&, bool) const",
> referenced from:
> ….
> 
> I also had to link /opt/local/lib to /opt/local/lib64 (which didn’t exist in 
> my setup).
> I am afraid that I started to make quite a mess on my system but what did I 
> do wrong?
> 
> Is MacPorts not working properly anymore?
> As I said, I have gcc 5.2.0 and boost 1.59.0 via MacPorts on my system. Is 
> that bad?
> 
> Thanks for helping!
> Jörg
> 
> 
> 
> 
>> On 31 Aug 2015, at 16:19, Hieu Hoang wrote:
>> 
>> The errors for clang look like they're coming from the STL library. Have you 
>> fiddled with the PATH variable or otherwise tried to make gcc on OSX work? 
>> You shouldn't do that; it will just mess up the compilation environment on 
>> your machine.
>> 
>> On 31/08/2015 10:28, Jorg Tiedemann wrote:
>>> 
>>> Unfortunately, this didn’t work for me either. I attach both log files - one 
>>> for clang and one for gcc (which I installed via MacPorts).
>>> What can I do? Thanks!
>>> 
>>> Jörg
>>> 
>>>> On 30 Aug 2015, at 11:33, Hieu Hoang <hieuho...@gmail.com> wrote:
>>>> 
>>>> Add
>>>>    toolset=clang
>>>> to the bjam compile command. OSX no longer has gcc.
>>>> 
>>>> Hieu Hoang
>>>> Sent while bumping into things
>>>> 
>>>> On 29 Aug 2015 11:56 pm, "Jorg Tiedemann" wrote:
>>>> Hi,
>>>> 
>>>> I tried to make a fresh install of Moses on my new Mac and I get the 
>>>> following error:
>>>> g++: error: unrecognized command line option '-no-cpp-precomp'
>>>> 
>>>> What’s wrong? I have gcc5 and boost 1.59 on my machine via MacPorts ...
>>>> 
>>>> Thanks for your help!
>>>> Jörg
>>> 
>> 
>> -- 
>> Hieu Hoang
>> Researcher
>> New York University, Abu Dhabi
>> http://www.hoang.co.uk/hieu 



Re: [Moses-support] g++: error: unrecognized command line option '-no-cpp-precomp'

2015-09-01 Thread Matt Post
You do not need gcc; Apple's stock compiler (installed via Xcode) is fine. If 
you've installed it, I'd recommend uninstalling it, and if you can't, make sure 
that /opt/local/bin is last in your path, so that /usr/bin/gcc is found first.
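
The PATH advice can be sketched in Python as follows. This is a minimal illustration, not part of Moses or MacPorts: the helper name move_to_end is made up, and it only computes a reordered PATH string rather than touching any shell configuration.

```python
def move_to_end(path_value, directory, sep=":"):
    """Return path_value with `directory` moved to the last position.

    Hypothetical helper: if the directory is not on the path, the path is
    returned unchanged (apart from dropping empty entries).
    """
    entries = [e for e in path_value.split(sep) if e]
    target = directory.rstrip("/")
    kept = [e for e in entries if e.rstrip("/") != target]
    if len(kept) == len(entries):
        return sep.join(entries)  # directory was not on the path
    return sep.join(kept + [directory])


print(move_to_end("/opt/local/bin:/usr/bin:/bin", "/opt/local/bin"))
```

With /opt/local/bin last, a lookup for gcc finds /usr/bin/gcc before any MacPorts copy.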

I've also had a lot of trouble with the MacPorts boost installation, which uses 
the "--layout=tagged" argument to the boost installer instead of the default 
"--layout=system". The difference is that the tagged layout adds compile options 
to the library name (e.g., "-mt"). However, I think the Moses compilation tool 
figures this out.
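
To see which layout an installation uses, the library file names can be inspected. The Python sketch below is a rough heuristic under stated assumptions: the function name is invented, and it looks only for the "-mt" tag mentioned above, although a tagged build can carry other tags as well.

```python
def boost_layout(filename):
    """Guess the Boost naming layout from one library file name.

    Returns "tagged", "system", or None for non-Boost files.
    Heuristic for illustration: only the "-mt" tag is recognised.
    """
    stem = filename.split(".", 1)[0]  # drop extensions such as ".dylib"
    if not stem.startswith("libboost_"):
        return None
    if stem.endswith("-mt") or "-mt-" in stem:
        return "tagged"
    return "system"


for name in ("libboost_program_options-mt.dylib",
             "libboost_program_options.dylib"):
    print(name, boost_layout(name))
```

Running it over the contents of the Boost library directory (for MacPorts, presumably /opt/local/lib) shows at a glance whether the names carry the tag.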

matt




[Moses-support] Several Issues with Baseline and EMS

2015-09-01 Thread Anita Pal
Hey!

I'm really, really new to Linux and have no idea how to set/change the
config file when it comes to running the experiments as described here
(http://www.statmt.org/moses/?n=moses.baseline).

home-dir = /home/liam/

working-dir =/home/liam/working/experiments
moses-src-dir = /home/liam/mosesdecoder
moses-script-dir = home/liam/working/experiments (this is where config is
located?)
moses-bin-dir = /home/liam/mosesdecoder/bin
external-bin-dir =/home/liam/mosesdecoder/tools
data-dir =/home/liam/corpus
train-dir =/home/liam/corpus/training
dev-dir = /home/liam/corpus/dev
irstlm-dir =/home/liam/irstlm/bin

Is this correct? I really have no idea ):

Because I just keep getting errors no matter what I do. I have the same
problem when it comes to training the language model via IRSTLM. For
example:

export IRSTLM=$HOME/irstlm; ~/irstlm/bin/build-lm.sh \
   -i news-commentary-v8.fr-en.sb.en  \
   -t ./tmp  -p -s improved-kneser-ney -o news-commentary-v8.fr-en.lm.en
 ~/irstlm/bin/compile-lm  \
   --text=yes \
   news-commentary-v8.fr-en.lm.en.gz \
   news-commentary-v8.fr-en.arpa.en

I assume this is a separate command. I keep getting errors even though I do
set HOME=/home/liam/irstlm.

What am I doing wrong?
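
One thing that can be checked mechanically is what $HOME/irstlm expands to under the two settings of HOME in play here. The Python sketch below only mimics shell variable expansion with string.Template; the helper name expand is made up, and it assumes HOME was overridden exactly as described.

```python
from string import Template


def expand(value, env):
    """Expand $NAME references in `value` using the mapping `env`."""
    return Template(value).substitute(env)


# HOME left at the user's home directory:
print(expand("$HOME/irstlm", {"HOME": "/home/liam"}))
# HOME overridden to point at the IRSTLM directory itself:
print(expand("$HOME/irstlm", {"HOME": "/home/liam/irstlm"}))
```

In the second case the result is /home/liam/irstlm/irstlm, so with HOME overridden that way IRSTLM would point one directory level too deep.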


Re: [Moses-support] clarification CBPT vs MMSAPT

2015-09-01 Thread Vincent Nguyen

I didn't make myself clear.
I don't want to add material "dynamically"; I want to add a new 
corpus to an existing one.


What I don't know is how to align the new corpus incrementally to an 
existing one that has been aligned with fastalign.


Clearer?

By the way, in what you mention below, is the alignment file for the new 
material generated from the new material only, or does it include the 
existing alignment?



On 01/09/2015 15:32, Ulrich Germann wrote:

Hi Vincent,

 1. To seed the foreground corpus at start-up, you need to provide
three files (I use ${L1} and ${L2} to indicate language tags;
${L1} is the source language, ${L2} the target language). These
tags must match those given in the L1 and L2 parameters of the
Mmsapt line in moses.ini.

/some/path/[basename.]${L1}.txt.gz
/some/path/[basename.]${L2}.txt.gz
/some/path/[basename.]${L1}-${L2}.symal.gz

Then, in the Mmsapt line in moses.ini, add the parameter
extra=/some/path/[basename.]

Note that the extra specification (like the path parameter) must
end either in '.' (if the files have a prefix) or '/' (if they
don't). Files must be gzipped and end in .txt.gz or .symal.gz,
respectively.

 2. To add material dynamically:

  o with the moses server, use the update interface of the xmlrpc
server; see scripts/contrib/sim-pe.py for an example.
  o to simulate post-editing with moses in batch mode, specify
--spe-src /path/to/source --spe-trg /path/to/target --spe-aln
/path/to/word-alignment-file.
E.g.

moses -f moses.ini --spe-src input.en --spe-trg reference.de
 --spe-aln en-de.symal

This will translate one sentence, then add input sentence,
reference (as read from file), and pre-computed word alignment
to the parallel data.
In this case (in contrast to the parameter 'extra ' in the
Mmsapt line, which mandates that the text files are gzipped),
the files should be plain, uncompressed text files.

- Uli

On Tue, Sep 1, 2015 at 1:11 PM, Vincent Nguyen wrote:


Hi Uli,

For your point 3, here is what I would like to do / understand:

I have an LM and a TM built with EMS, but with the alignment done by
FastAlign, so there are no vcb files for the baseline.

In this context I don't see whether I can integrate a new
incremental corpus into the previous baseline corpus.

Hope this is clearer.

Vincent



On 23/08/2015 00:36, Ulrich Germann wrote:

Hi Vincent,

1. I don't use EMS, so I'm the wrong person to ask.
2. Please always post questions to the moses-support mailing
list, so that others can benefit from questions and answers as well.
3. Can you briefly explain what you are trying to accomplish? I
don't think I understand what you are actually trying to do.

Best regards - Uli

On Sat, Aug 22, 2015 at 10:45 PM, Vincent Nguyen <vngu...@neuf.fr> wrote:


I kept reading this again and again:
http://www.statmt.org/moses/?n=Advanced.Incremental
but it is not clear enough for a newbie like me for use
with EMS.
I also see a section in the EMS config file:
use of baseline alignment model (incremental training)
and I don't really see how it fits with the rest of the parameters.



On 22/08/2015 16:31, vngu...@neuf.fr wrote:

Oops.
Using EMS I built the phrase table with the mmsapt=
option and it went through,
but I had not added the training-options
-final-alignment-model hmm

Do I need to start again?

The thing is, I use Dyer's aligner because of the Giga corpus,
and I am not sure that training option is compatible, since
the tutorial mentions a modified giza++ ...





From: "Ulrich Germann"
Date: 21 August 2015 15:54:08
To: Vincent Nguyen
Cc: prash...@fbk.eu, moses-support@mit.edu
Subject: Re: [Moses-support] clarification CBPT vs MMSAPT



On Thu, Aug 20, 2015 at 5:40 PM, Vincent Nguyen <vngu...@neuf.fr> wrote:

Thanks to both of you. I will give both solutions a try.

For MMSAPT:
Will I be able to make it work with the Giga corpus
fr-en? If everything is loaded in memory I may run short
of RAM rather quickly.


For the WMT-15 fr-en data, mmsapt's files are about 20GB in
total, but not all of it will normally be kept in memory.
Mmsapt degrades gracefully, it just gets slow if the VM
manager has to drop memory pages and re-load them. The LM is
about 40GB, so for optimal performance you should calculate
60+GB of RAM. Provided you have enough RAM, cat all model
files to /dev/

Re: [Moses-support] really weird phrase table crash .....

2015-09-01 Thread Vincent Nguyen

yes plenty.

On 01/09/2015 17:41, Christophe Servan wrote:
> Hello Vincent,
> Did you check whether you have enough disk space?
>
> Best,
>
> Christophe

Re: [Moses-support] really weird phrase table crash .....

2015-09-01 Thread Christophe Servan
Hello Vincent,
Did you check whether you have enough disk space?

Best,

Christophe



[Moses-support] really weird phrase table crash .....

2015-09-01 Thread Vincent Nguyen
Hi,

I don't know what is happening, but during the phrase table building 
(inverse part), the ../model/tmp.23625 directory contains plenty of files, 
but 4 .coc files are missing (numbers 14, 15, 23 and 24; I don't know why). 
Then, when everything is put together, it crashes because it can't find these 4:

phrase-table.half.014.gz.coc (and the same for 15, 23, 24)

(out of 37 parts)

Any clue?
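
A quick way to list which parts lack their .coc file is sketched below in Python. It is not part of the Moses scripts: the function name is made up, and it assumes the parts are numbered 000 to 036 with the phrase-table.half.NNN.gz.coc naming shown above.

```python
def missing_coc_parts(filenames, n_parts, prefix="phrase-table.half."):
    """Return the part numbers whose '<prefix>NNN.gz.coc' file is absent.

    `filenames` is a plain list of names, e.g. from os.listdir().
    Assumes zero-based, three-digit part numbers (an assumption taken
    from the file names quoted in this message).
    """
    present = set(filenames)
    return [i for i in range(n_parts)
            if f"{prefix}{i:03d}.gz.coc" not in present]


# Simulated directory listing with the four files from this report missing.
listing = [f"phrase-table.half.{i:03d}.gz.coc"
           for i in range(37) if i not in (14, 15, 23, 24)]
print(missing_coc_parts(listing, 37))
```

On the real machine the listing would come from something like os.listdir() on the tmp directory; comparing the missing numbers with the corresponding score commands in the log may show whether those jobs failed or never ran.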


Using SCRIPTS_ROOTDIR: /home/moses/mosesdecoder/scripts
using gzip
(6) score phrases @ Tue Sep  1 07:17:00 CEST 2015
(6.1)  creating table half 
/home/moses/working/model/phrase-table.2.half.f2e @ Tue Sep  1 07:17:00 
CEST 2015
/home/moses/mosesdecoder/scripts/generic/score-parallel.perl 8 "sort   
--compress-program gzip --parallel 8" 
/home/moses/mosesdecoder/scripts/../bin/score 
/home/moses/working/model/extract.sorted.gz 
/home/moses/working/model/lex.2.f2e 
/home/moses/working/model/phrase-table.2.half.f2e.gz  --GoodTuring 0
Executing: /home/moses/mosesdecoder/scripts/generic/score-parallel.perl 
8 "sort   --compress-program gzip --parallel 8" 
/home/moses/mosesdecoder/scripts/../bin/score 
/home/moses/working/model/extract.sorted.gz 
/home/moses/working/model/lex.2.f2e 
/home/moses/working/model/phrase-table.2.half.f2e.gz  --GoodTuring 0
(6.1)  creating table half 
/home/moses/working/model/phrase-table.2.half.e2f @ Tue Sep  1 07:17:00 
CEST 2015
/home/moses/mosesdecoder/scripts/generic/score-parallel.perl 8 "sort   
--compress-program gzip --parallel 8" 
/home/moses/mosesdecoder/scripts/../bin/score 
/home/moses/working/model/extract.inv.sorted.gz 
/home/moses/working/model/lex.2.e2f 
/home/moses/working/model/phrase-table.2.half.e2f.gz --Inverse  1
Executing: /home/moses/mosesdecoder/scripts/generic/score-parallel.perl 
8 "sort   --compress-program gzip --parallel 8" 
/home/moses/mosesdecoder/scripts/../bin/score 
/home/moses/working/model/extract.inv.sorted.gz 
/home/moses/working/model/lex.2.e2f 
/home/moses/working/model/phrase-table.2.half.e2f.gz --Inverse  1
using gzip
Started Tue Sep  1 07:17:00 2015
Started Tue Sep  1 07:17:00 2015
/home/moses/mosesdecoder/scripts/../bin/score 
/home/moses/working/model/tmp.23626/extract.0.gz 
/home/moses/working/model/lex.2.e2f 
/home/moses/working/model/tmp.23626/phrase-table.half.000.gz 
--Inverse  2>> /dev/stderr
/home/moses/mosesdecoder/scripts/../bin/score 
/home/moses/working/model/tmp.23626/extract.1.gz 
/home/moses/working/model/lex.2.e2f 
/home/moses/working/model/tmp.23626/phrase-table.half.001.gz 
--Inverse  2>> /dev/stderr
/home/moses/mosesdecoder/scripts/../bin/score 
/home/moses/working/model/tmp.23626/extract.2.gz 
/home/moses/working/model/lex.2.e2f 
/home/moses/working/model/tmp.23626/phrase-table.half.002.gz 
--Inverse  2>> /dev/stderr
/home/moses/mosesdecoder/scripts/../bin/score 
/home/moses/working/model/tmp.23626/extract.3.gz 
/home/moses/working/model/lex.2.e2f 
/home/moses/working/model/tmp.23626/phrase-table.half.003.gz 
--Inverse  2>> /dev/stderr
/home/moses/mosesdecoder/scripts/../bin/score 
/home/moses/working/model/tmp.23626/extract.4.gz 
/home/moses/working/model/lex.2.e2f 
/home/moses/working/model/tmp.23626/phrase-table.half.004.gz 
--Inverse  2>> /dev/stderr
/home/moses/mosesdecoder/scripts/../bin/score 
/home/moses/working/model/tmp.23626/extract.5.gz 
/home/moses/working/model/lex.2.e2f 
/home/moses/working/model/tmp.23626/phrase-table.half.005.gz 
--Inverse  2>> /dev/stderr
/home/moses/mosesdecoder/scripts/../bin/score 
/home/moses/working/model/tmp.23626/extract.6.gz 
/home/moses/working/model/lex.2.e2f 
/home/moses/working/model/tmp.23626/phrase-table.half.006.gz 
--Inverse  2>> /dev/stderr
/home/moses/mosesdecoder/scripts/../bin/score 
/home/moses/working/model/tmp.23626/extract.7.gz 
/home/moses/working/model/lex.2.e2f 
/home/moses/working/model/tmp.23626/phrase-table.half.007.gz 
--Inverse  2>> /dev/stderr
/home/moses/mosesdecoder/scripts/../bin/score 
/home/moses/working/model/tmp.23626/extract.8.gz 
/home/moses/working/model/lex.2.e2f 
/home/moses/working/model/tmp.23626/phrase-table.half.008.gz 
--Inverse  2>> /dev/stderr
/home/moses/mosesdecoder/scripts/../bin/score 
/home/moses/working/model/tmp.23626/extract.9.gz 
/home/moses/working/model/lex.2.e2f 
/home/moses/working/model/tmp.23626/phrase-table.half.009.gz 
--Inverse  2>> /dev/stderr
/home/moses/mosesdecoder/scripts/../bin/score 
/home/moses/working/model/tmp.23626/extract.10.gz 
/home/moses/working/model/lex.2.e2f 
/home/moses/working/model/tmp.23626/phrase-table.half.010.gz 
--Inverse  2>> /dev/stderr
/home/moses/mosesdecoder/scripts/../bin/score 
/home/moses/working/model/tmp.23626/extract.11.gz 
/home/moses/working/model/lex.2.e2f 
/home/moses/working/model/tmp.23626/phrase-table.half.011.gz 
--Inverse  2>> /dev/stderr
/home/moses/mosesdecoder/scripts/../bin/score 
/home/moses/working/model/tmp.23626/extract.12.gz 
/home/moses/working/model/lex.2.e2f 
/home/moses/working/model/tmp.23626/phrase-table.half.012.gz 
--Inverse  

Re: [Moses-support] clarification CBPT vs MMSAPT

2015-09-01 Thread Ulrich Germann
Hi Vincent,


   1. To seed the foreground corpus at start-up, you need to provide three
   files (I use ${L1} and ${L2} to indicate language tags; ${L1} is the source
   language, ${L2} the target language). These tags must match those given in
   the L1 and L2 parameters of the Mmsapt line in moses.ini.

   /some/path/[basename.]${L1}.txt.gz
   /some/path/[basename.]${L2}.txt.gz
   /some/path/[basename.]${L1}-${L2}.symal.gz

   Then, in the Mmsapt line in moses.ini, add the parameter
   extra=/some/path/[basename.]

   Note that the extra specification (like the path parameter) must end
   either in '.' (if the files have a prefix) or '/' (if they don't). Files
   must be gzipped and end in .txt.gz or .symal.gz, respectively.

   2. To add material dynamically:


   - With the moses server, use the update interface of the xmlrpc server;
     see scripts/contrib/sim-pe.py for an example.
   - To simulate post-editing with moses in batch mode, specify
     --spe-src /path/to/source --spe-trg /path/to/target
     --spe-aln /path/to/word-alignment-file.
     E.g.

     moses -f moses.ini --spe-src input.en --spe-trg reference.de --spe-aln en-de.symal

     This will translate one sentence, then add the input sentence, the
     reference (as read from file), and the pre-computed word alignment to
     the parallel data. In this case (in contrast to the parameter 'extra'
     in the Mmsapt line, which mandates that the text files are gzipped),
     the files should be plain, uncompressed text files.
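
As a rough illustration of the naming convention for the seed files in point 1, here is a minimal Python sketch. It is not part of Moses; the helper name seed_files is hypothetical, and it only builds the three expected file names from the value of extra and the two language tags.

```python
def seed_files(extra: str, l1: str, l2: str) -> list[str]:
    """Return the three file names expected for a given 'extra' prefix.

    Hypothetical helper for illustration only; it mirrors the naming rule
    described above and does not touch the file system.
    """
    # The prefix must end in '.' (files have a basename) or '/' (they don't).
    if not extra.endswith((".", "/")):
        raise ValueError("extra must end in '.' or '/'")
    return [
        f"{extra}{l1}.txt.gz",          # source side, gzipped
        f"{extra}{l2}.txt.gz",          # target side, gzipped
        f"{extra}{l1}-{l2}.symal.gz",   # word alignment, gzipped
    ]


if __name__ == "__main__":
    for name in seed_files("/some/path/basename.", "en", "de"):
        print(name)
```

Checking the names this way before starting the decoder can catch a missing trailing '.' or '/' early.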

- Uli

On Tue, Sep 1, 2015 at 1:11 PM, Vincent Nguyen  wrote:

> Hi Uli,
>
> For your point 3, here is what I would like to do / understand:
>
> I have an LM and a TM built with EMS, but the alignment was done with
> FastAlign, so there are no vcb files for the baseline.
>
> In this context I don't see whether I can integrate a new incremental corpus
> into the previous baseline corpus.
>
> hope this is clearer.
>
> Vincent
>
>
>
> On 23/08/2015 00:36, Ulrich Germann wrote:
>
> Hi Vincent,
>
> 1. I don't use EMS, so I'm the wrong person to ask.
> 2. Please always post questions to the moses-support mailing list, so that
> others can benefit from questions and answers as well.
> 3. Can you briefly explain what you are trying to accomplish? I don't
> think I understand what you are actually trying to do.
>
> Best regards - Uli
>
> On Sat, Aug 22, 2015 at 10:45 PM, Vincent Nguyen  wrote:
>
>>
>> I kept reading again and again this
>> 
>> http://www.statmt.org/moses/?n=Advanced.Incremental
>> but this is not clear enough for a newbie like me for use with EMS.
>> I also see a section in the EMS config file :
>> use of baseline aligment model (incremental training)
>> and I don't really see how it comes with the rest of parameters.
>>
>>
>>
>> On 22/08/2015 16:31, vngu...@neuf.fr wrote:
>>
>> Oops
>> Using EMS i built the phrase table with the mmsapt=
>> Option and it went through
>> But i had not added the training-options
>> -final-alignment-model hmm
>>
>> Do i need to start again?
>>
>> The thing is i use dyers aligner because of the giga corpus and i am not
>> sure that training option is compatible since the tuto mentions giza++
>> modified...
>>
>>
>>
>> 
>>
>> From: "Ulrich Germann"
>> Date: 21 August 2015 15:54:08
>> To: Vincent Nguyen
>> Cc: prash...@fbk.eu, moses-support@mit.edu
>> Subject: Re: [Moses-support] clarification CBPT vs MMSAPT
>>
>>
>>
>> On Thu, Aug 20, 2015 at 5:40 PM, Vincent Nguyen  wrote:
>>
>>> Thanks to both of you. I will give both solutions a try.
>>>
>>> For MMSAPT :
>>> Will I be able to make it work with the Giga corpus fr-en ? If
>>> everything is loaded in memory I may be short of ram rather quickly.
>>>
>>
>> For the WMT-15 fr-en data, mmsapt's files are about 20GB in total, but
>> not all of it will normally be kept in memory. Mmsapt degrades gracefully,
>> it just gets slow if the VM manager has to drop memory pages and re-load
>> them. The LM is about 40GB, so for optimal performance you should calculate
>> 60+GB of RAM. Provided you have enough RAM, cat all model files to
>> /dev/null prior to starting moses. Sequential disk access is much faster
>> than random disk access, and the cat to /dev/null will push them into the
>> OS's file cache.
>>
>>
>>
>>> Plus I was using dyers fast align ... so do I need to realign the whole
>>> corpus with the modified version of giza++ ?
>>>
>> You need word alignments in the output format produced by symal (i.e.
>> row-column pairs 1-1 2-2 3-4 etc.). How these alignments are produced
>> doesn't matter for Mmsapt's ability to handle them. It may, of course,
>> affect the alignment quality, but that's independent of which phrase table
>> implementation you use.
>>
>> - Uli
>>
>>
>>
>>> For CBPT :
>>> I would like to give the MT adaptive server a try but I don't really
>>> understand how to adapt the given "adaptive model" and "updater model"
>>> in a context where my la

Re: [Moses-support] Failure to Open Output when using Chart Decoder

2015-09-01 Thread Rico Sennrich

Hello Shyam,

this is probably not a bug in the code (this is a check in 
std::ostream), but a problem with the location you're trying to write 
to. Can you double-check if your path to the n-best-list is correct, and 
that you can write to it?
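
A quick way to run that check outside of Moses is sketched below in Python. It is only an illustration; the helper name can_write is made up here, and it tests whether the target file (or, if the file does not exist yet, its parent directory) is writable by the current user.

```python
import os


def can_write(path: str) -> bool:
    """Return True if 'path' could be opened for writing by the current user.

    Hypothetical helper for illustration; it does not create any file.
    """
    path = os.path.abspath(path)
    if os.path.exists(path):
        # An existing target must be a regular file we are allowed to write.
        return os.path.isfile(path) and os.access(path, os.W_OK)
    # Otherwise the parent directory must exist and be writable.
    parent = os.path.dirname(path)
    return os.path.isdir(parent) and os.access(parent, os.W_OK)


if __name__ == "__main__":
    print(can_write("myout/hyp.mrl.nbest"))
```

If this prints False for the n-best path, the usual causes are a missing output directory or missing write permission on it.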


best wishes,
Rico


On 01.09.2015 00:36, Shyam Upadhyay wrote:
I am new to using moses and I am trying to use the chart decoder to 
obtain 100 best decodings as follows,


moses/bin/moses_chart -f mymodel/moses.ini --drop-unknown --n-best-list myout/hyp.mrl.nbest 100


I encounter the following error,

Start loading text phrase table. Moses format : [0.009] seconds
Reading 
/home/upadhya3/smt-semparse-fresh/work/2015-08-30T21.06.43/model/glue-grammar

5---10---15---20---25---30---35---40---45---50---55---60---65---70---75---80---85---90---95--100
max-chart-span: 20
max-chart-span: 1000
Created input-output object : [0.009] seconds
Exception: ./moses/OutputCollector.h:64 in 
Moses::OutputCollector::OutputCollector(std::string, std::string) 
threw util::Exception because `!m_outStream->good()'.
Failed to open output 
file/home/upadhya3/smt-semparse-fresh/work/2015-08-30T19.44.07/hyp.mrl.nbest


My moses.ini file is, (this was generated automatically by previous steps)

#
### MOSES CONFIG FILE ###
#

# input factors
[input-factors]
0

# mapping steps
[mapping]
0 T 0
1 T 1

[cube-pruning-pop-limit]
1000

[non-terminals]
X

[search-algorithm]
3

[inputtype]
3

[max-chart-span]
20
1000

# feature functions
[feature]
UnknownWordPenalty
WordPenalty
PhrasePenalty
PhraseDictionaryMemory name=TranslationModel0 num-features=4 path=/home/upadhya3/smt-semparse-fresh/work/2015-08-30T21.06.43/model/rule-table.gz input-factor=0 output-factor=0
PhraseDictionaryMemory name=TranslationModel1 num-features=1 path=/home/upadhya3/smt-semparse-fresh/work/2015-08-30T21.06.43/model/glue-grammar input-factor=0 output-factor=0 tuneable=true


KENLM name=LM0 factor=0 path=/home/upadhya3/smt-semparse-fresh/work/2015-08-30T21.06.43/mrl.arpa order=3


# dense weights for feature functions
[weight]
# The default weights are NOT optimized for translation quality. You MUST tune the weights.
# Documentation for tuning is here: http://www.statmt.org/moses/?n=FactoredTraining.Tuning

UnknownWordPenalty0= 1
WordPenalty0= -1
PhrasePenalty0= 0.2
TranslationModel0= 0.2 0.2 0.2 0.2
TranslationModel1= 1
LM0= 0.5


___
Moses-support mailing list
Moses-support@mit.edu
http://mailman.mit.edu/mailman/listinfo/moses-support




Re: [Moses-support] g++: error: unrecognized command line option '-no-cpp-precomp'

2015-09-01 Thread Hieu Hoang
My advice on osx is don't install GCC. Clang is the ordained compiler now,
you'll be fighting apple every step of the way. Don't think different!

Hieu Hoang
Sent while bumping into things
On 31 Aug 2015 5:14 pm, "Jorg Tiedemann"  wrote:

>
> Well, I have /opt/local/ search paths in various environment variables to
> get macports to work.
> I deleted all these paths and tried again but I still get the same problem.
>
> I am confused. And why is gcc not working anymore when installed via
> macports? I also installed boost with macports. Is that a problem as well?
>
> I also have some problems with kenlm, but part of it compiles and links
> fine. build_binary and query seem to compile fine, but lmplz does not link
> because of some undefined symbols:
> Undefined symbols for architecture x86_64:
>
> "boost::program_options::value_semantic_codecvt_helper::parse(boost::any&,
> std::vector,
> std::allocator >, std::allocator std::char_traits, std::allocator > > > const&, bool) const",
> referenced from:
> ….
>
> I also had to link /opt/local/lib to /opt/local/lib64 (which didn’t exist
> in my setup).
> I am afraid that I started to make quite a mess on my system but what did
> I do wrong?
>
> Is macports not working properly anymore?
> As I said, I have gcc 5.2.0 and boost 1.59.0 via macports on my system. Is
> that bad?
>
> Thanks for helping!
> Jörg
>
>
>
>
> On 31 Aug 2015, at 16:19, Hieu Hoang  wrote:
>
> The errors for clang look like they are coming from the STL library. Have you
> fiddled with the PATH variable or otherwise tried to make gcc on OSX work?
> You shouldn't do that, it will just mess up the compilation environment on
> your machine
>
> On 31/08/2015 10:28, Jorg Tiedemann wrote:
>
>
> Unfortunately, this didn’t work for me either. I attach both logfiles - one
> for clang and one for gcc (which I installed via macports)
> What can I do? Thanks!
>
> Jörg
>
>
>
>
>
>
>
>
> On 30 Aug 2015, at 11:33, Hieu Hoang <hieuho...@gmail.com> wrote:
>
> Add
>toolset=clang
> to the bjam compile command. Osx no longer has gcc
>
> Hieu Hoang
> Sent while bumping into things
> On 29 Aug 2015 11:56 pm, "Jorg Tiedemann"  wrote:
>
>> Hi,
>>
>> I tried to make a fresh install of Moses on my new Mac and I get the
>> following error
>> g++: error: unrecognized command line option '-no-cpp-precomp'
>>
>> What’s wrong? I have gcc5 and boost 1.59 on my machine via macports ...
>>
>> Thanks for your help!
>> Jörg
>>
>>
>>
>>
>
>
>
> --
> Hieu Hoang
> Researcher
> New York University, Abu Dhabi
> http://www.hoang.co.uk/hieu
>
>
>


Re: [Moses-support] clarification CBPT vs MMSAPT

2015-09-01 Thread Vincent Nguyen

Hi Uli,

For your point 3, here is what I would like to do / understand:

I have an LM and a TM built with EMS, but the alignment was done with
FastAlign, so there are no vcb files for the baseline.


In this context I don't see whether I can integrate a new incremental
corpus into the previous baseline corpus.


hope this is clearer.

Vincent


On 23/08/2015 00:36, Ulrich Germann wrote:

Hi Vincent,

1. I don't use EMS, so I'm the wrong person to ask.
2. Please always post questions to the moses-support mailing list, so 
that others can benefit from questions and answers as well.
3. Can you briefly explain what you are trying to accomplish? I don't 
think I understand what you are actually trying to do.


Best regards - Uli

On Sat, Aug 22, 2015 at 10:45 PM, Vincent Nguyen wrote:



I kept reading again and again this
http://www.statmt.org/moses/?n=Advanced.Incremental
but this is not clear enough for a newbie like me for use with EMS.
I also see a section in the EMS config file :
use of baseline aligment model (incremental training)
and I don't really see how it comes with the rest of parameters.



On 22/08/2015 16:31, vngu...@neuf.fr wrote:

Oops
Using EMS I built the phrase table with the mmsapt= option and it went
through, but I had not added the training-options
-final-alignment-model hmm

Do I need to start again?

The thing is, I use Dyer's aligner because of the giga corpus, and I
am not sure that training option is compatible, since the tutorial
mentions a modified giza++...





From: "Ulrich Germann"
Date: 21 August 2015 15:54:08
To: Vincent Nguyen
Cc: prash...@fbk.eu, moses-support@mit.edu
Subject: Re: [Moses-support] clarification CBPT vs MMSAPT



On Thu, Aug 20, 2015 at 5:40 PM, Vincent Nguyen <vngu...@neuf.fr> wrote:

Thanks to both of you. I will give both solutions a try.

For MMSAPT :
Will I be able to make it work with the Giga corpus fr-en ?
If everything is loaded in memory I may be short of ram
rather quickly.


For the WMT-15 fr-en data, mmsapt's files are about 20GB in
total, but not all of it will normally be kept in memory. Mmsapt
degrades gracefully, it just gets slow if the VM manager has to
drop memory pages and re-load them. The LM is about 40GB, so for
optimal performance you should calculate 60+GB of RAM. Provided
you have enough RAM, cat all model files to /dev/null prior to
starting moses. Sequential disk access is much faster than random
disk access, and the cat to /dev/null will push them into the
OS's file cache.
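
The same warm-up can be sketched in Python instead of cat. This is only an illustration under the assumption that a plain sequential read is enough to populate the OS file cache; the function name warm_cache and the chunk size are arbitrary choices, not anything Moses provides.

```python
def warm_cache(paths, chunk_size=1 << 20):
    """Read each file sequentially and discard the data.

    Rough equivalent of 'cat files > /dev/null'; returns the number of
    bytes read so the caller can compare it against available RAM.
    """
    total = 0
    for path in paths:
        with open(path, "rb") as handle:
            # Sequential reads in large chunks; the data itself is thrown away.
            while chunk := handle.read(chunk_size):
                total += len(chunk)
    return total
```

Calling warm_cache on the list of model files before starting moses would have the same effect as the cat command; it only helps if the files actually fit in free memory.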

Plus I was using dyers fast align ... so do I need to realign
the whole corpus with the modified version of giza++ ?

You need word alignments in the output format produced by symal
(i.e. row-column pairs 1-1 2-2 3-4 etc.). How these alignments are
produced doesn't matter for Mmsapt's ability to handle them. It
may, of course, affect the alignment quality, but that's
independent of which phrase table implementation you use.
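
To make the format concrete, here is a small Python sketch that reads one alignment line of this kind into index pairs. The function name parse_alignment is hypothetical, and the sketch assumes nothing beyond the whitespace-separated i-j pairs shown in the example above.

```python
def parse_alignment(line: str) -> list[tuple[int, int]]:
    """Turn a line such as '1-1 2-2 3-4' into a list of (row, column) pairs."""
    pairs = []
    for token in line.split():
        # Each token is 'row-column'; both parts are plain integers.
        row, column = token.split("-")
        pairs.append((int(row), int(column)))
    return pairs


if __name__ == "__main__":
    print(parse_alignment("1-1 2-2 3-4"))
```

A check like this can be run over the output of any aligner to confirm that every line has the expected shape before handing the file to Mmsapt.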

- Uli

For CBPT :
I would like to give the MT adaptive server a try but I
don't really understand how to adapt the given "adaptive
model" and "updater model"
in a context where my language pair is different. These
preliminary steps are not part of the tutorial (especially
the updater_models/alignment folders ...).

The only glitch I see in the CBPT is that adaptive changes
cannot be made permanent.




On 20/08/2015 16:17, Ulrich Germann wrote:

Memory-mapped phrase tables are an alternative to
conventional phrase tables. They are much, much faster to
build, only slightly slower than CompactPT at runtime, and
at the very least competitive in terms of BLEU performance.
I usually observe slightly higher BLEU scores, but for each
individual evaluation, the difference is usually not
significant. They support only phrase-based MT, but not
syntax-based MT.

Both Mmsapt and CBPT also cater to post-editing scenarios
(CBPT were specifically developed for this purpose). They
allow adding new material to the phrase tables at run time.
I can't say much about CBPT (apparently you add phrase table
entries, and there is a decay function that rewards more
recent choices approved by the translator), but in the case
of Mmsapt (since it samples at lookup time anyway), you can
add new word-aligned parallel text at run time to the
training data (or additional material at start-up; additions
are currently not stored on disk by the server (do NOT use
mosesserver, use moses --server --port ...) an

Re: [Moses-support] Decoding Speed perfomance - suggestion and question

2015-09-01 Thread Vincent Nguyen


I am not clear with the syntax of filter-model-given-input.pl
target-dir = where we want the filtered PT to go ?
moses.ini = if it is not in the above directory the script does accept it
input.txt = ? what is it in the case I just want to adjust the MinScore ?


On 31/08/2015 16:54, Philipp Koehn wrote:

Hi,

the filter script filters an existing phrase table. With the EMS 
settings, it would build another phrase table.


Don't worry about the reordering table. It will have excess entries, 
but they will not be used.
If you really care, you can use the script 
scripts/training/remove-orphan-phrase-pairs-from-reordering-table.perl


-phi

On Mon, Aug 31, 2015 at 10:50 AM, Vincent Nguyen wrote:



thanks, will try and post results.
just to be clear:
I can re-use the previous extract file
I have to rebuild the phrase-table with new min score (ie no way
to just filter the previous one ?)
do I have to rebuild the reordering table too ?

Vincent


On 31/08/2015 16:44, Philipp Koehn wrote:

Hi,

0.0001 should have no impact on translation quality,
0.001 will have some impact
0.01 is probably a bit too drastic.

But that's the range you should explore.

-phi

On Mon, Aug 31, 2015 at 10:33 AM, Vincent Nguyen <vngu...@neuf.fr> wrote:

is there any benchmark on what value / what impact ?
what should I start with as a test 0.001 ?

the standard value 0.0001 seems really really low to me 
maybe I am not getting what this probability exactly refers to.



where |FIELDn| is the position of the score (typically 2 for
the direct phrase probability p(e|f), or 0 for the indirect
phrase probability p(f|e)) and |THRESHOLD| the maximum
probability allowed. A good setting is |2:0.0001|, which
removes all rules, where the direct phrase translation
probability is below 0.0001.
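
To illustrate what such a FIELD:THRESHOLD setting does, here is a minimal Python sketch. It is not the code of threshold-filter.perl or of the scorer; it assumes the usual text phrase table layout "source ||| target ||| scores ...", and the function name keep_rule is hypothetical.

```python
def keep_rule(line: str, field: int = 2, threshold: float = 0.0001) -> bool:
    """Return True if the score at position 'field' reaches 'threshold'.

    Illustrative only: assumes 'source ||| target ||| s0 s1 s2 s3 ...'.
    """
    columns = line.rstrip("\n").split(" ||| ")
    scores = [float(value) for value in columns[2].split()]
    # With field=2 this is the direct phrase probability p(e|f).
    return scores[field] >= threshold


if __name__ == "__main__":
    table = [
        "maison ||| house ||| 0.6 0.5 0.7 0.4",
        "maison ||| the ||| 0.01 0.02 0.00001 0.03",
    ]
    for entry in table:
        print(keep_rule(entry), entry)
```

With the setting 2:0.0001, only the second toy entry would be dropped, because its third score lies below the threshold.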



On 31/08/2015 16:14, Philipp Koehn wrote:

Hi,

I would suspect that with beam sizes <500 the bulk of the time is
spent on translation option collection, not decoding. You could speed
that up with tighter threshold pruning of the phrase table.

See the script scripts/training/threshold-filter.perl or the setting
score-settings = "--MinScore 2:0.0001"
in EMS.

-phi

On Mon, Aug 31, 2015 at 3:03 AM, Vincent Nguyen <vngu...@neuf.fr> wrote:

Hi,

Here are some results for several values of the cube pruning pop limit:

(pop limit / decoding time for 3000 sentences / BLEU score)

5000 - 15m45 - 29.59
1000 - 4m27 - 29.59
500 - 3m35 - 29.59
200 - 3m15 - 29.51
100 - 3m00 - 29.40

Therefore I took 400 - 3m19 - 29.58
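
The timings above can be turned into throughput figures with a few lines of Python. This is just arithmetic on the numbers reported in this message (3000 sentences per run); the helper names are made up for the example.

```python
def to_seconds(duration: str) -> int:
    """Convert a duration written like '3m19' into seconds."""
    minutes, seconds = duration.split("m")
    return int(minutes) * 60 + int(seconds)


def segments_per_second(num_segments: int, duration: str) -> float:
    """Average number of segments decoded per second."""
    return num_segments / to_seconds(duration)


# Pop limit -> decoding time for 3000 sentences, as listed above.
timings = {5000: "15m45", 1000: "4m27", 500: "3m35",
           400: "3m19", 200: "3m15", 100: "3m00"}

if __name__ == "__main__":
    for pop_limit, duration in timings.items():
        rate = segments_per_second(3000, duration)
        print(f"{pop_limit:>5}  {rate:6.2f} segments/s")
```

For a pop limit of 400 this works out to roughly 15 segments per second.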

If I am not mistaken, the default value for Moses is 1000 [read in the
doc], but in the EMS it is 5000 right now, which makes the experiment
so long.
I suggest changing the EMS default value.

Is there a way to also use a cube pruning limit in the decoder at
tuning time?

Now with this optimized setting I get a ratio of 15 segments per
second on average.
What is the reason for online tools like Google / Bing being much,
much faster?
It's not a machine issue, is it?


Cheers
Vincent
