Re: [Moses-support] Issues running MGiza on AWS machine
Thanks Tom - using the Moses version of symal rather than the MGiza version fixed it (although still not sure why it should be different to the one I built on my desktop).

I hadn't realised they were different, as the instructions on the Moses website state you should copy all binaries from MGiza into the Moses directory:
http://www.statmt.org/moses/?n=Moses.ExternalTools#ntoc3

Thanks for your help.

James

On Wed, 1 Aug 2018 at 02:14, Tom Hoar wrote:
> Then check step 3, its inputs are the GIZA alignment files output in step
> 2. This step uses the symal binary executable. Make sure you're using the
> Moses version of symal, not the one in the MGIZA library.
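A note on the copy step: if the MGiza bin directory ships its own symal, copying every file across would replace the Moses one, which would match the fix described above. The following is a minimal Python sketch of a copy that leaves named files alone. The function name `copy_binaries` and the default skip list are illustrative assumptions, not part of Moses or MGiza.

```python
import shutil
from pathlib import Path


def copy_binaries(src_dir, dst_dir, skip=("symal",)):
    """Copy regular files from src_dir to dst_dir, leaving skipped names alone.

    Returns the sorted list of file names that were actually copied.
    """
    src, dst = Path(src_dir), Path(dst_dir)
    dst.mkdir(parents=True, exist_ok=True)
    copied = []
    for item in sorted(src.iterdir()):
        # Skip sub-directories and any name Moses should keep its own copy of.
        if not item.is_file() or item.name in skip:
            continue
        shutil.copy2(item, dst / item.name)  # copy2 keeps the permission bits
        copied.append(item.name)
    return copied
```

With the directory layout from the build commands in this thread, a call might look like `copy_binaries("mgiza/mgizapp/bin", "mosesdecoder/bin")`. The paths are taken from the thread, and the skip list would need adjusting if other names clash.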
Re: [Moses-support] Issues running MGiza on AWS machine
I'm out of ideas. I've used aws and azure many times so it should work

Hieu Hoang
Sent while bumping into things

On Wed, 1 Aug 2018, 20:01 James Baker, wrote:
> Alas, nothing erroneous that I can see in the logs (using
> ./train-model.perl > output.log 2>&1), and neither the memory usage nor the
> used disk space went over 10% during the training.
Re: [Moses-support] Issues running MGiza on AWS machine
Alas, nothing erroneous that I can see in the logs (using ./train-model.perl > output.log 2>&1), and neither the memory usage nor the used disk space went over 10% during the training.

James

On Wed, 1 Aug 2018 at 08:56, Hieu Hoang wrote:
> redirect stdout and stderr into a file and grep for 'error'
>
> that usually turns up something

___
Moses-support mailing list
Moses-support@mit.edu
http://mailman.mit.edu/mailman/listinfo/moses-support
Re: [Moses-support] Issues running MGiza on AWS machine
redirect stdout and stderr into a file and grep for 'error'

that usually turns up something

Hieu Hoang
http://statmt.org/hieu

On 1 August 2018 at 17:38, James Baker wrote:
> Any other ideas for things I should be checking?
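The log search suggested here can be sketched in Python as below. Note that the Perl messages quoted in the original report ("Use of uninitialized value", "isn't numeric") do not contain the word "error", so a search for that word alone would not match them. The extra patterns and the name `find_suspect_lines` are assumptions chosen for illustration, not an exhaustive list of what the training script can print.

```python
import re

# Case-insensitive search. "error" is the term suggested in the thread;
# the other alternatives are assumptions based on messages quoted in it.
PATTERN = re.compile(
    r"error|warning|uninitialized value|isn't numeric", re.IGNORECASE
)


def find_suspect_lines(lines):
    """Return (line_number, text) pairs for lines matching PATTERN."""
    hits = []
    for number, line in enumerate(lines, start=1):
        if PATTERN.search(line):
            hits.append((number, line.rstrip("\n")))
    return hits
```

Run against the file produced by `./train-model.perl > output.log 2>&1`, usage could be as simple as opening `output.log` and printing each pair returned by `find_suspect_lines(log_file)`.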
Re: [Moses-support] Issues running MGiza on AWS machine
Thanks Hieu,

I'll give that a go this morning and keep an eye on the disk space and RAM, although I would be surprised if that was the problem (I've got <3GB of training data, 64GB of RAM, and 100GB of disk space). It also wouldn't explain why binaries built on a different machine work, but binaries built on the same machine don't.

Any other ideas for things I should be checking?

Cheers,
James

On Wed, 1 Aug 2018 at 03:03, Hieu Hoang wrote:
> It's probably to do with running out of disk space or memory.
Re: [Moses-support] Issues running MGiza on AWS machine
It's difficult to tell, but I would say the mgiza executables aren't the problem. It's probably to do with running out of disk space or memory.

The snt2cooc executable in mgiza uses a lot of memory, so it may have been killed by the OS. The phrase table creation requires a lot of disk space to sort intermediate files.

I would monitor those 2 things.

Hieu Hoang
http://statmt.org/hieu

On 31 July 2018 at 20:41, James Baker wrote:
> If I replace the MGiza binaries built on the AWS machine with the binaries
> built on my desktop, it runs fine - so I know it's an issue with MGiza and
> presumably something to do with my build.
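Monitoring the two resources mentioned here could be sketched as follows. This is a rough Python illustration, assuming a Linux machine where `/proc/meminfo` reports `MemTotal` and `MemAvailable`. The function names are made up for the example, and nothing here is specific to Moses or MGiza.

```python
import shutil


def disk_used_percent(path="."):
    """Percentage of the filesystem holding `path` that is in use."""
    usage = shutil.disk_usage(path)
    return 100.0 * usage.used / usage.total


def mem_used_percent(meminfo_text):
    """Percentage of RAM in use, computed from the text of /proc/meminfo."""
    values = {}
    for line in meminfo_text.splitlines():
        key, _, rest = line.partition(":")
        fields = rest.split()
        if fields:
            values[key.strip()] = int(fields[0])  # first field is the number
    total = values["MemTotal"]
    return 100.0 * (total - values["MemAvailable"]) / total
```

Calling both every few seconds while training runs, for example with `mem_used_percent(open("/proc/meminfo").read())`, and logging the results would show whether either figure climbs towards 100% before the failure.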
Re: [Moses-support] Issues running MGiza on AWS machine
Hi James,

Since train-model.perl fails at step 4 with the MGIZA binaries you built on your AWS machine, but succeeds when you copy MGIZA binaries that you built on your local Ubuntu 16.04 machine, do the build logs show a missing dependency?

My next question: why don't you just use the binaries that work? It seems like the AWS machine's Ubuntu distro is missing dependencies and the MGIZA++ build failed. Check those build logs.

If you want to troubleshoot deeper, you need to backtrack from step 4. The train-model.perl step 4 uses the output from step 3, i.e. the word alignment file. Check if that word alignment file is corrupted.

Then check step 3. Its inputs are the GIZA alignment files output in step 2. This step uses the symal binary executable. Make sure you're using the Moses version of symal, not the one in the MGIZA library.
http://article.gmane.org/gmane.comp.nlp.moses.user/11544
http://moses-support.mit.narkive.com/KpKC2TQn/which-symal

Backtracking to step 2, log lines with the following text messages should cause the mgiza executable to terminate but it doesn't. The parallel forks in train-model.perl mask the failure, processing continues and you experience ambiguous failures downstream.

    ERROR: A SOURCE or TARGET sentence has a zero-length sentence.
    ERROR! DUPLICATED ENTRY
    WARNING: The following sentence pair has source/target sentence length ration more than

There are rarely errors in step 1, but if you are experiencing a compile error on AWS, those MGIZA binaries in step 1 could be the cause.

Also, the C++ binary executables are not the only things that change when you use the alternate build. If you also copied merge_alignment.py, this could be a problem in train-model.perl step 2. Make sure the AWS build has this in the right place and that it runs on the AWS distro's Python interpreter.

Tom

On 7/31/2018 11:01 PM, moses-support-requ...@mit.edu wrote:
> Date: Tue, 31 Jul 2018 11:41:01 +0100
> From: James Baker
> Subject: [Moses-support] Issues running MGiza on AWS machine
> To: moses-support@mit.edu
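The check on the word alignment file can be sketched in Python. The sketch assumes the step 3 output holds one line per sentence pair, made of space-separated `i-j` index pairs such as `0-0 1-1 2-3`, and that the file sits in the model directory under a name like `aligned.<heuristic>`. Both are stated here as assumptions to verify against your own run. A token such as `anna` in that file would be consistent with the "isn't numeric" warning in the original report.

```python
import re

# One alignment point: source index, hyphen, target index (e.g. "0-0").
POINT = re.compile(r"^\d+-\d+$")


def find_bad_alignment_lines(lines, limit=10):
    """Return up to `limit` (line_number, token) pairs that are not index pairs."""
    bad = []
    for number, line in enumerate(lines, start=1):
        for token in line.split():
            if not POINT.match(token):
                bad.append((number, token))
                break  # one report per line is enough
        if len(bad) >= limit:
            break
    return bad
```

An empty result means every token looked like an index pair. It does not prove that the alignments themselves are sensible. Blank lines are accepted, since a sentence pair may have no alignment points.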
[Moses-support] Issues running MGiza on AWS machine
Hi,

I'm having some peculiar issues with MGiza++. Using MGiza and Moses, I've successfully built some translation models on my Ubuntu 16.04 desktop machine. I'd now like to do the same thing, but on a machine hosted in AWS.

I'm using the same operating system, and as far as I can tell all my versions are identical. The build of MGiza++ runs fine, reports no errors, and produces output the same as on my desktop machine. However, when I try to build the models, I get a whole load of errors and the resultant models are empty (64 bytes for the reordering model, 0 bytes for the translation model - the language model builds fine).

The first "errors" I can see in the log seem to occur on stage 4 of the Moses training script (train-model.perl):

    (4) generate lexical translation table 0-0 @ Tue Jul 31 10:22:58 UTC 2018
    (/opt/model-builder/training/data.ru,/opt/model-builder/training/data.en,/opt/model-builder/training/model/lex)
    !Argument "anna" isn't numeric in numeric ge (>=) at /opt/model-builder/mosesdecoder/scripts/training/LexicalTranslationModel.pm line 112, line 1.
    Use of uninitialized value $ei in numeric ge (>=) at /opt/model-builder/mosesdecoder/scripts/training/LexicalTranslationModel.pm line 112, line 1.
    Use of uninitialized value $ei in hash element at /opt/model-builder/mosesdecoder/scripts/training/LexicalTranslationModel.pm line 118, line 1.
    Use of uninitialized value $ei in array element at /opt/model-builder/mosesdecoder/scripts/training/LexicalTranslationModel.pm line 121, line 1.
    Use of uninitialized value $ei in array element at /opt/model-builder/mosesdecoder/scripts/training/LexicalTranslationModel.pm line 123, line 1.
    ...

There are a large number of errors of that nature, and following those errors there are additional errors but I suspect these are caused by the fact that this stage is failing.

It's possible that there are earlier problems, but I'm not really sure what to be looking for in the logs (for instance - there are some lines warning about alignments in Model2 being 0 - is that an issue?).

If I replace the MGiza binaries built on the AWS machine with the binaries built on my desktop, it runs fine - so I know it's an issue with MGiza and presumably something to do with my build. The commands I'm running to build and install are as follows

    git clone https://github.com/moses-smt/mgiza.git
    cd mgiza/mgizapp
    cmake .
    make
    make install
    cp bin/* ../../mosesdecoder/bin
    cp scripts/merge_alignment.py ../../mosesdecoder/bin

As I mentioned previously, these commands work fine on my desktop machine which should be a very similar (if not identical) set up.

Does anyone have any ideas as to what might be causing the problem (or, more importantly, what I can do to fix it)?

Thanks in advance,
James
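The symptom described in this message, model files of 0 or 64 bytes, can be caught right after training with a quick size check. This is a small Python sketch. The 1024-byte threshold and the name `find_small_files` are arbitrary assumptions for illustration, and the right cut-off depends on the corpus.

```python
from pathlib import Path


def find_small_files(directory, min_bytes=1024):
    """Return (relative_path, size) for files under `directory` below min_bytes."""
    root = Path(directory)
    small = []
    for path in sorted(root.rglob("*")):
        if path.is_file():
            size = path.stat().st_size
            if size < min_bytes:
                small.append((str(path.relative_to(root)), size))
    return small
```

Pointing it at the model directory from the log above, `/opt/model-builder/training/model`, would list any outputs that are suspiciously small, which could serve as an early signal to go back through the earlier training steps.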