Re: [Moses-support] Issues running MGiza on AWS machine

2018-08-01 Thread James Baker
Thanks Tom - using the Moses version of symal rather than the MGiza version
fixed it (although I'm still not sure why it should be different from the one
I built on my desktop). I hadn't realised they were different, as the
instructions on the Moses website state you should copy all binaries from
MGiza into the Moses directory:
http://www.statmt.org/moses/?n=Moses.ExternalTools#ntoc3

Thanks for your help.

James
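
One way to avoid overwriting the Moses symal when copying the MGIZA tools is sketched below. It is only an illustration, not something posted in the thread: the function name copy_mgiza_tools is made up, and both directory arguments are assumptions about the local layout.

```shell
#!/bin/sh
# Sketch: copy MGIZA tools into the Moses bin directory without
# overwriting the symal that Moses built itself.
# copy_mgiza_tools is a hypothetical helper; both directory
# arguments are assumptions about where the builds live.
copy_mgiza_tools() {
    src=$1    # e.g. the MGIZA bin directory
    dst=$2    # e.g. the Moses bin directory
    for f in "$src"/*; do
        # Skip anything that is not a regular file.
        [ -f "$f" ] || continue
        name=$(basename "$f")
        if [ "$name" = "symal" ]; then
            echo "skipping $name (keeping the Moses build)" >&2
            continue
        fi
        cp "$f" "$dst/$name"
    done
}
```

With the paths used in this thread, the call would be something like `copy_mgiza_tools bin ../../mosesdecoder/bin`, run from mgiza/mgizapp in place of `cp bin/* ../../mosesdecoder/bin`.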


Re: [Moses-support] Issues running MGiza on AWS machine

2018-08-01 Thread Hieu Hoang
I'm out of ideas.

I've used AWS and Azure many times, so it should work.

Hieu Hoang
Sent while bumping into things


Re: [Moses-support] Issues running MGiza on AWS machine

2018-08-01 Thread James Baker
Alas, nothing erroneous that I can see in the logs (using
./train-model.perl > output.log 2>&1), and neither the memory usage nor the
used disk space went over 10% during the training.

James

___
Moses-support mailing list
Moses-support@mit.edu
http://mailman.mit.edu/mailman/listinfo/moses-support


Re: [Moses-support] Issues running MGiza on AWS machine

2018-08-01 Thread Hieu Hoang
redirect stdout and stderr into a file and grep for 'error'

that usually turns up something

Hieu Hoang
http://statmt.org/hieu
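
The suggestion above could look roughly like the sketch below. The train-model.perl options are not given in the thread, so they are left out, and scan_log is a hypothetical helper name used only for illustration.

```shell
#!/bin/sh
# Run training with both streams captured in one file, e.g.:
#   ./train-model.perl [your options] > output.log 2>&1
# (the options are not listed in the thread, so none are shown).

# scan_log is a hypothetical helper: print every line that mentions
# "error" in any letter case, prefixed with its line number.
scan_log() {
    grep -i -n 'error' "$1"
}
```

A call such as `scan_log output.log` prints nothing and returns a non-zero status when no line matches. The -i flag matters here because the messages may be written as ERROR in capitals.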



Re: [Moses-support] Issues running MGiza on AWS machine

2018-08-01 Thread James Baker
Thanks Hieu,

I'll give that a go this morning and keep an eye on the disk space and RAM,
although I would be surprised if that was the problem (I've got <3GB of
training data, 64GB of RAM, and 100GB of disk space). It also wouldn't
explain why binaries built on a different machine work, but binaries built
on the same machine don't.

Any other ideas for things I should be checking?

Cheers,
James



Re: [Moses-support] Issues running MGiza on AWS machine

2018-07-31 Thread Hieu Hoang
it's difficult to tell but I would say the mgiza executables aren't the
problem. It's probably to do with running out of disk space or memory.

the snt2cooc executable in mgiza uses a lot of memory so may have been
killed by the OS. The phrase table creation requires a lot of disk space to
sort intermediate files.

I would monitor those 2 things

Hieu Hoang
http://statmt.org/hieu
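
A minimal sketch of that kind of monitoring follows. The helper name snapshot_resources is invented for this example, and reading /proc/meminfo assumes a Linux machine such as the Ubuntu hosts discussed here.

```shell
#!/bin/sh
# snapshot_resources is a hypothetical helper: print one line with the
# current time, the free disk space (KiB) on the filesystem holding a
# directory, and, on Linux, the MemAvailable figure from /proc/meminfo.
snapshot_resources() {
    dir=${1:-.}
    # df -P gives the portable layout; column 4 of line 2 is "Available".
    disk_free=$(df -Pk "$dir" | awk 'NR == 2 { print $4 }')
    mem_avail=""
    if [ -r /proc/meminfo ]; then
        mem_avail=$(awk '/^MemAvailable:/ { print $2 }' /proc/meminfo)
    fi
    echo "$(date '+%H:%M:%S') disk_free_kb=$disk_free mem_avail_kb=${mem_avail:-n/a}"
}
```

It could be left running in the background during training, for example `while sleep 30; do snapshot_resources /opt/model-builder/training; done >> resources.log &`. If a process was killed for lack of memory, the kernel log usually says so as well.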



Re: [Moses-support] Issues running MGiza on AWS machine

2018-07-31 Thread Tom Hoar

Hi James,

Since train-model.perl fails at step 4 with the MGIZA binaries you 
built on your AWS machine, but succeeds when you copy MGIZA binaries 
that you built on your local Ubuntu 16.04 machine, do the build logs 
show a missing dependency?


My next question: why don't you just use the binaries that work? It 
seems like the AWS machine's Ubuntu distro is missing dependencies and 
the MGIZA++ build failed. Check those build logs.


If you want to troubleshoot deeper, you need to backtrack from step 4. 
The train-model.perl step 4 uses the output from step 3, i.e. the word 
alignment file. Check if that word alignment file is corrupted.
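
One possible check for a corrupted alignment file is sketched here. It assumes the symmetrised alignment file holds one line per sentence pair made up of "i-j" index pairs, which is consistent with the Perl warnings in the original post (a word such as "anna" where a number was expected). The helper name check_alignment is hypothetical.

```shell
#!/bin/sh
# check_alignment is a hypothetical helper: report the first few lines
# of an alignment file that contain anything other than "i-j" index
# pairs, and return a non-zero status if any such line exists.
check_alignment() {
    awk '
        {
            for (k = 1; k <= NF; k++) {
                if ($k !~ /^[0-9]+-[0-9]+$/) {
                    bad++
                    # Show at most five offending lines.
                    if (bad <= 5) print "line " NR ": unexpected token \"" $k "\""
                    next
                }
            }
        }
        END { exit (bad > 0) }
    ' "$1"
}
```

The file to pass in depends on the alignment heuristic in use. With the model directory from this thread it might be something like /opt/model-builder/training/model/aligned.grow-diag-final-and, but that name is an assumption. Empty lines are accepted, since a sentence pair may have no alignment points.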


Then check step 3. Its inputs are the GIZA alignment files output in 
step 2. This step uses the symal binary executable. Make sure you're 
using the Moses version of symal, not the one in the MGIZA library.

http://article.gmane.org/gmane.comp.nlp.moses.user/11544
http://moses-support.mit.narkive.com/KpKC2TQn/which-symal

Backtracking to step 2, log lines with the following text messages 
should cause the mgiza executable to terminate but it doesn't. The 
parallel forks in train-model.perl mask the failure, processing 
continues and you experience ambiguous failures downstream.


   ERROR: A SOURCE or TARGET sentence has a zero-length sentence.
   ERROR! DUPLICATED ENTRY
   WARNING: The following sentence pair has source/target sentence
   length ration more than
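
As a rough sketch, the three messages above can be searched for as literal strings in a captured training log. The helper name find_giza_failures is invented for this example, and the third pattern is cut short so that it still matches however the end of the message is worded in the log.

```shell
#!/bin/sh
# find_giza_failures is a hypothetical helper: count the log lines that
# contain one of the messages quoted above. -F treats each pattern as a
# literal string, so "!" and "/" need no escaping.
find_giza_failures() {
    grep -c -F \
        -e 'ERROR: A SOURCE or TARGET sentence has a zero-length sentence' \
        -e 'ERROR! DUPLICATED ENTRY' \
        -e 'WARNING: The following sentence pair has source/target sentence' \
        "$1"
}
```

Note that grep -c prints 0 and returns a non-zero status when nothing matches, so a non-zero status here is the good outcome.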

There are rarely errors in Step 1, but if you are experiencing a compile 
error on AWS, those MGIZA binaries in step 1 could be the cause.


Also, the C++ binary executables are not the only things that change 
when you use the alternate build. If you also copied the 
merge_alignment.py, this could be a problem in train-model.perl step 2. 
Make sure the AWS build has this in the right place and that it runs on 
the AWS distro's Python interpreter.
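
A quick way to look at the script is sketched below. It assumes the script is run directly (so it needs the executable bit and a usable interpreter line), which the thread does not confirm. The helper name check_script is hypothetical.

```shell
#!/bin/sh
# check_script is a hypothetical helper: confirm that a script exists
# and is executable, then show the interpreter line it asks for.
check_script() {
    script=$1
    if [ ! -f "$script" ]; then
        echo "missing: $script"
        return 1
    fi
    if [ ! -x "$script" ]; then
        echo "not executable: $script"
        return 1
    fi
    # The first line is normally the "#!" interpreter line.
    echo "interpreter line: $(head -n 1 "$script")"
}
```

For the layout in this thread the call would be along the lines of `check_script ../../mosesdecoder/bin/merge_alignment.py`. The interpreter named on that line can then be compared with what is installed on each machine.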


Tom




[Moses-support] Issues running MGiza on AWS machine

2018-07-31 Thread James Baker
Hi,

I'm having some peculiar issues with MGiza++. Using MGiza and Moses, I've
successfully built some translation models on my Ubuntu 16.04 desktop
machine. I'd now like to do the same thing, but on a machine hosted in AWS.

I'm using the same operating system, and as far as I can tell all my
versions are identical. The build of MGiza++ runs fine, reports no errors,
and produces output the same as on my desktop machine. However, when I try
to build the models, I get a whole load of errors and the resultant models
are empty (64 bytes for the reordering model, 0 bytes for the translation
model - the language model builds fine).

The first "errors" I can see in the log seem to occur on stage 4 of the
Moses training script (train-model.perl):

   (4) generate lexical translation table 0-0 @ Tue Jul 31 10:22:58 UTC 2018
   (/opt/model-builder/training/data.ru,/opt/model-builder/training/data.en,/opt/model-builder/training/model/lex)
   !Argument "anna" isn't numeric in numeric ge (>=) at /opt/model-builder/mosesdecoder/scripts/training/LexicalTranslationModel.pm line 112,  line 1.
   Use of uninitialized value $ei in numeric ge (>=) at /opt/model-builder/mosesdecoder/scripts/training/LexicalTranslationModel.pm line 112,  line 1.
   Use of uninitialized value $ei in hash element at /opt/model-builder/mosesdecoder/scripts/training/LexicalTranslationModel.pm line 118,  line 1.
   Use of uninitialized value $ei in array element at /opt/model-builder/mosesdecoder/scripts/training/LexicalTranslationModel.pm line 121,  line 1.
   Use of uninitialized value $ei in array element at /opt/model-builder/mosesdecoder/scripts/training/LexicalTranslationModel.pm line 123,  line 1.
   ...

There are a large number of errors of that nature, and following those
errors there are additional errors but I suspect these are caused by the
fact that this stage is failing.

It's possible that there are earlier problems, but I'm not really sure what
to be looking for in the logs (for instance - there are some lines warning
about alignments in Model2 being 0 - is that an issue?).

If I replace the MGiza binaries built on the AWS machine with the binaries
built on my desktop, it runs fine - so I know it's an issue with MGiza and
presumably something to do with my build. The commands I'm running to build
and install are as follows

   git clone https://github.com/moses-smt/mgiza.git
   cd mgiza/mgizapp
   cmake .
   make
   make install
   cp bin/* ../../mosesdecoder/bin
   cp scripts/merge_alignment.py ../../mosesdecoder/bin

As I mentioned previously, these commands work fine on my desktop machine
which should be a very similar (if not identical) set up.
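
To compare the two machines more systematically than by eye, a sketch like the following could be run on each and the outputs diffed. The helper name print_versions is made up, and the list of tools to pass in is an assumption about what the build depends on.

```shell
#!/bin/sh
# print_versions is a hypothetical helper: for each tool, print the
# first non-empty line of its --version output, or note that the tool
# is not on PATH. Run it on both machines and diff the results.
print_versions() {
    for tool in "$@"; do
        if command -v "$tool" >/dev/null 2>&1; then
            # stdin is closed off so a tool that ignores --version cannot hang.
            echo "$tool: $("$tool" --version </dev/null 2>&1 | awk 'NF { print; exit }')"
        else
            echo "$tool: not found"
        fi
    done
}
```

For example, `print_versions gcc g++ cmake make python perl > versions.txt` on each machine, followed by a diff of the two files.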

Does anyone have any ideas as to what might be causing the problem (or,
more importantly, what I can do to fix it)?

Thanks in advance,
James