[Moses-support] Moses Build with Boost 1.68.0 Fails
Hello, I'm currently trying to build Moses with Boost 1.68.0 (not the latest version, but this is for compatibility reasons with a project I'm using for my thesis). However, the build fails with the following error (full error log is attached): /home/users/abeaton/project/thesis/VTLM2/data/mosesdecoder2/opt/boost_1_68_0/tools/build/src/tools/types/docbook.jam:8: in load ERROR: rule "Copyright" unknown in module "docbook". And here is the exact command I used to build Moses: ./bjam --with-boost=opt/boost_1_68_0 Is this a known problem, and is there a fix besides using a more current version of Boost? I've also tried with the latest version (1.85.0), but then I receive another error since the boost_1_85_0/tools/build/src/bootstrap.jam file seems to be missing in the latest version. Best, Andrew notice: found boost-build.jam at /home/users/abeaton/project/thesis/VTLM2/data/mosesdecoder2/opt/boost_1_68_0/boost-build.jam notice: loading Boost.Build from /home/users/abeaton/project/thesis/VTLM2/data/mosesdecoder2/opt/boost_1_68_0/tools/build/src /home/users/abeaton/project/thesis/VTLM2/data/mosesdecoder2/opt/boost_1_68_0/tools/build/src/tools/types/docbook.jam:8: in load ERROR: rule "Copyright" unknown in module "docbook". 
/home/users/abeaton/project/thesis/VTLM2/data/mosesdecoder2/opt/boost_1_68_0/tools/build/src/kernel/modules.jam:295: in import /home/users/abeaton/project/thesis/VTLM2/data/mosesdecoder2/opt/boost_1_68_0/tools/build/src/tools/types/register.jam:36: in load /home/users/abeaton/project/thesis/VTLM2/data/mosesdecoder2/opt/boost_1_68_0/tools/build/src/kernel/modules.jam:295: in import /home/users/abeaton/project/thesis/VTLM2/data/mosesdecoder2/opt/boost_1_68_0/tools/build/src/tools/stage.jam:18: in load /home/users/abeaton/project/thesis/VTLM2/data/mosesdecoder2/opt/boost_1_68_0/tools/build/src/kernel/modules.jam:295: in import /home/users/abeaton/project/thesis/VTLM2/data/mosesdecoder2/opt/boost_1_68_0/tools/build/src/tools/builtin.jam:27: in load /home/users/abeaton/project/thesis/VTLM2/data/mosesdecoder2/opt/boost_1_68_0/tools/build/src/kernel/modules.jam:295: in import /home/users/abeaton/project/thesis/VTLM2/data/mosesdecoder2/opt/boost_1_68_0/tools/build/src/build-system.jam:12: in load /home/users/abeaton/project/thesis/VTLM2/data/mosesdecoder2/opt/boost_1_68_0/tools/build/src/kernel/modules.jam:295: in import /home/users/abeaton/project/thesis/VTLM2/data/mosesdecoder2/opt/boost_1_68_0/tools/build/src/kernel/bootstrap.jam:139: in boost-build /home/users/abeaton/project/thesis/VTLM2/data/mosesdecoder2/opt/boost_1_68_0/boost-build.jam:17: in module scope___ Moses-support mailing list Moses-support@mit.edu https://mailman.mit.edu/mailman/listinfo/moses-support
Re: [Moses-support] mgiza install
Hi Hieu, The symptom was an error message from tar (from memory now, sorry, I should've captured it) -- but something like "unexpected EOF in archive". This was from running "tar -zxf" I believe the underlying cause was .gz related because if I tried to gunzip first, the error message was that the file was not in gzip format, which suggested a download problem from Filezilla. But it downloaded and unpacked fine from Filezilla on my host machine, so given a few courses of action (diagnose download problem / update gzip / try Ubuntu VM) I chose to try Ubuntu and got much further. However, still some problem in training, including not finding the 'training' scripts dir, and not making a 'giza.[$TGT-$SRC|$SRC-$TGT]' dir (?) -- any guidance much appreciated. Andrew On 11 August 2016 at 09:06, Hieu Hoang wrote: > Thanks. What was the problem with gunzip on fedora? > > Prob a good idea to make future VM on Ubuntu since that's what everyone > seem to use these days > > On 9 Aug 2016 16:27, "Andrew Caines" wrote: > >> Hi Hieu, >> >> Thank you for suggesting the Linux VMs. I encountered an issue with the >> Fedora one with gunzip and so switched to the Ubuntu one, in which gunzip >> works fine (i.e. for unpacking downloaded corpora). Not your doing I know, >> but perhaps worth updating and repackaging the Fedora VM? >> >> Secondly there were a couple of edits needed to get the mgiza scripts >> running. Should I send them to you here or on GitHub? >> >> Finally, I encountered a bus error in plain2snt-hasvcb.py when training >> and some subsequent path errors -- specifically, could not find "giza. >> $TGT-$SRC" or "giza.$SRC-$TGT" (should this be 2 more mkdirs in >> force-align-moses.sh?) and could not find "$SCRIPT_DIR/training" which >> indeed seems to be missing from mgizapp unless I've not looked carefully >> enough. >> >> As always, happy to be corrected if I'm doing anything wrong. 
>> Andrew >> >> >> On 26 July 2016 at 22:42, Hieu Hoang wrote: >> >>> that's a weird error. This looks the same as this: >>> >>>https://gcc.gnu.org/bugzilla/show_bug.cgi?id=65306 >>> >>> sounds like you're digging yourself a bigger hole. >>> >>> my suggestion if you must use OSX, especially for teaching - use a Linux >>> virtual machine. We even make 1 available for download with everything >>> installed: >>> >>>http://www.statmt.org/moses/RELEASE-3.0/vm/fedora%2021%2064-bit.ova >>> >>> On 26/07/2016 16:15, Andrew Caines wrote: >>> >>> Hello, >>> I encountered an error in running 'make' with mgiza: >>> >>> [ 10%] Building CXX object src/CMakeFiles/mgiza_lib.dir/F >>> orwardBackward.cpp.o >>> >>> /var/folders/p8/8d4g5v_53t3glcsclzx5cc8hgn/T//cckNoVZn.s:989:suffix >>> or operands invalid for `movq' >>> >>> /var/folders/p8/8d4g5v_53t3glcsclzx5cc8hgn/T//cckNoVZn.s:993:suffix >>> or operands invalid for `movq' >>> >>> /var/folders/p8/8d4g5v_53t3glcsclzx5cc8hgn/T//cckNoVZn.s:1011:suffix >>> or operands invalid for `movq' >>> >>> /var/folders/p8/8d4g5v_53t3glcsclzx5cc8hgn/T//cckNoVZn.s:1467:suffix >>> or operands invalid for `movq' >>> >>> /var/folders/p8/8d4g5v_53t3glcsclzx5cc8hgn/T//cckNoVZn.s:2527:suffix >>> or operands invalid for `movq' >>> >>> make[2]: *** [src/CMakeFiles/mgiza_lib.dir/ForwardBackward.cpp.o] Error >>> 1 >>> I'm on Mac OS X 10.11.5 (El Capitan). I appreciate that OS X is untested >>> compared to Linux, but it'd be great to get this set up, for purposes of >>> teaching & collaboration in DH. >>> >>> thanks in advance, Andrew >>> >>> >>> ___ >>> Moses-support mailing >>> listMoses-support@mit.eduhttp://mailman.mit.edu/mailman/listinfo/moses-support >>> >>> >>> >> ___ Moses-support mailing list Moses-support@mit.edu http://mailman.mit.edu/mailman/listinfo/moses-support
Re: [Moses-support] mgiza install
Hi Hieu, Thank you for suggesting the Linux VMs. I encountered an issue with the Fedora one with gunzip and so switched to the Ubuntu one, in which gunzip works fine (i.e. for unpacking downloaded corpora). Not your doing I know, but perhaps worth updating and repackaging the Fedora VM? Secondly there were a couple of edits needed to get the mgiza scripts running. Should I send them to you here or on GitHub? Finally, I encountered a bus error in plain2snt-hasvcb.py when training and some subsequent path errors -- specifically, could not find "giza.$TGT-$SRC" or "giza.$SRC-$TGT" (should this be 2 more mkdirs in force-align-moses.sh?) and could not find "$SCRIPT_DIR/training" which indeed seems to be missing from mgizapp unless I've not looked carefully enough. As always, happy to be corrected if I'm doing anything wrong. Andrew On 26 July 2016 at 22:42, Hieu Hoang wrote: > that's a weird error. This looks the same as this: > >https://gcc.gnu.org/bugzilla/show_bug.cgi?id=65306 > > sounds like you're digging yourself a bigger hole. > > my suggestion if you must use OSX, especially for teaching - use a Linux > virtual machine. 
We even make 1 available for download with everything > installed: > >http://www.statmt.org/moses/RELEASE-3.0/vm/fedora%2021%2064-bit.ova > > On 26/07/2016 16:15, Andrew Caines wrote: > > Hello, > I encountered an error in running 'make' with mgiza: > > [ 10%] Building CXX object src/CMakeFiles/mgiza_lib.dir/F > orwardBackward.cpp.o > > /var/folders/p8/8d4g5v_53t3glcsclzx5cc8hgn/T//cckNoVZn.s:989:suffix > or operands invalid for `movq' > > /var/folders/p8/8d4g5v_53t3glcsclzx5cc8hgn/T//cckNoVZn.s:993:suffix > or operands invalid for `movq' > > /var/folders/p8/8d4g5v_53t3glcsclzx5cc8hgn/T//cckNoVZn.s:1011:suffix > or operands invalid for `movq' > > /var/folders/p8/8d4g5v_53t3glcsclzx5cc8hgn/T//cckNoVZn.s:1467:suffix > or operands invalid for `movq' > > /var/folders/p8/8d4g5v_53t3glcsclzx5cc8hgn/T//cckNoVZn.s:2527:suffix > or operands invalid for `movq' > > make[2]: *** [src/CMakeFiles/mgiza_lib.dir/ForwardBackward.cpp.o] Error 1 > I'm on Mac OS X 10.11.5 (El Capitan). I appreciate that OS X is untested > compared to Linux, but it'd be great to get this set up, for purposes of > teaching & collaboration in DH. > > thanks in advance, Andrew > > > ___ > Moses-support mailing > listMoses-support@mit.eduhttp://mailman.mit.edu/mailman/listinfo/moses-support > > > ___ Moses-support mailing list Moses-support@mit.edu http://mailman.mit.edu/mailman/listinfo/moses-support
[Moses-support] mgiza install
Hello, I encountered an error in running 'make' with mgiza: [ 10%] Building CXX object src/CMakeFiles/mgiza_lib.dir/ ForwardBackward.cpp.o /var/folders/p8/8d4g5v_53t3glcsclzx5cc8hgn/T//cckNoVZn.s:989:suffix or operands invalid for `movq' /var/folders/p8/8d4g5v_53t3glcsclzx5cc8hgn/T//cckNoVZn.s:993:suffix or operands invalid for `movq' /var/folders/p8/8d4g5v_53t3glcsclzx5cc8hgn/T//cckNoVZn.s:1011:suffix or operands invalid for `movq' /var/folders/p8/8d4g5v_53t3glcsclzx5cc8hgn/T//cckNoVZn.s:1467:suffix or operands invalid for `movq' /var/folders/p8/8d4g5v_53t3glcsclzx5cc8hgn/T//cckNoVZn.s:2527:suffix or operands invalid for `movq' make[2]: *** [src/CMakeFiles/mgiza_lib.dir/ForwardBackward.cpp.o] Error 1 I'm on Mac OS X 10.11.5 (El Capitan). I appreciate that OS X is untested compared to Linux, but it'd be great to get this set up, for purposes of teaching & collaboration in DH. thanks in advance, Andrew ___ Moses-support mailing list Moses-support@mit.edu http://mailman.mit.edu/mailman/listinfo/moses-support
Re: [Moses-support] mgiza install
Thank you Hieu, I will switch to Linux when I can. For the time being with OSX I removed "-lrt" from CMakeList.txt and encountered a new error in '$ make' -- [ 10%] Building CXX object src/CMakeFiles/mgiza_lib.dir/ForwardBackward.cpp.o /var/folders/p8/8d4g5v_53t3glcsclzx5cc8hgn/T//cckNoVZn.s:989:suffix or operands invalid for `movq' /var/folders/p8/8d4g5v_53t3glcsclzx5cc8hgn/T//cckNoVZn.s:993:suffix or operands invalid for `movq' /var/folders/p8/8d4g5v_53t3glcsclzx5cc8hgn/T//cckNoVZn.s:1011:suffix or operands invalid for `movq' /var/folders/p8/8d4g5v_53t3glcsclzx5cc8hgn/T//cckNoVZn.s:1467:suffix or operands invalid for `movq' /var/folders/p8/8d4g5v_53t3glcsclzx5cc8hgn/T//cckNoVZn.s:2527:suffix or operands invalid for `movq' make[2]: *** [src/CMakeFiles/mgiza_lib.dir/ForwardBackward.cpp.o] Error 1 On 20 July 2016 at 16:35, Hieu Hoang wrote: > In CMakeList.txt line 39, delete -lrt > > This is also a problem when compiling moses. I hack it by creating an > empty rt library with the code in >mosesdecoder/contrib/rt > and copying it into /usr/lib > > To be honest, there will be other problems when running moses on osx, not > just compiling. Moses & most MT programs are mostly developed Linux people > so many of the programs and scripts are untested on OSX. There are niggly > differences between OSX and Linux such as sort, split, zcat etc. > > You are better off getting some experience using them on Linux and then > switch to OSX once you know everything works. And I'm talking from my > experience as a Mac user > > > Hieu Hoang > http://www.hoang.co.uk/hieu > > On 20 July 2016 at 14:44, Andrew Caines wrote: > >> Hello, >> >> I've encountered a problem installing mgiza, downloaded from GitHub. >> https://github.com/moses-smt/mgiza >> >> 'cmake .' runs fine. 
>> >> 'make' exits with the following message: >> [ 51%] Linking CXX executable ../bin/d4norm >> ld: library not found for -lrt >> clang: error: linker command failed with exit code 1 (use -v to see >> invocation) >> make[2]: *** [bin/d4norm] Error 1 >> >> I'm on Mac OS X 10.11.5 (El Capitan). Any advice much appreciated. >> >> Andrew >> >> ___ >> Moses-support mailing list >> Moses-support@mit.edu >> http://mailman.mit.edu/mailman/listinfo/moses-support >> >> > ___ Moses-support mailing list Moses-support@mit.edu http://mailman.mit.edu/mailman/listinfo/moses-support
[Moses-support] mgiza install
Hello, I've encountered a problem installing mgiza, downloaded from GitHub: https://github.com/moses-smt/mgiza
'cmake .' runs fine. 'make' exits with the following message:
[ 51%] Linking CXX executable ../bin/d4norm
ld: library not found for -lrt
clang: error: linker command failed with exit code 1 (use -v to see invocation)
make[2]: *** [bin/d4norm] Error 1
I'm on Mac OS X 10.11.5 (El Capitan). Any advice much appreciated.
Andrew
[Moses-support] PhraseDictionaryCompact is not registered
I'm following the baseline system page step by step. I've binarized the phrase table and reordering table using processPhraseTableMin and processLexicalTableMin and edited moses.ini as instructed, but on execution Moses throws an exception with the message "PhraseDictionaryCompact is not registered". I've done some googling and tried running processLexicalTable (without "Min"), to no avail. I also tried changing the entry to PhraseDictionaryBinary and PhraseDictionaryOnDisk; with those the decoder starts, but it aborts as soon as I enter an input sentence. Is there any other workaround or fix for this?
[Moses-support] how to train with berkeley word aligner
I'm replicating the steps described on the Baseline System page and am about to run the following command, except that I want to run it with the pre-compiled Berkeley word aligner rather than mgiza or GIZA++ (since their installations have been troublesome). Should I change the command below, or the code somewhere? In either case, how should it be changed?
mkdir ~/working
cd ~/working
nohup nice ~/mosesdecoder/scripts/training/train-model.perl -root-dir train \
 -corpus ~/corpus/news-commentary-v8.fr-en.clean \
 -f fr -e en -alignment grow-diag-final-and -reordering msd-bidirectional-fe \
 -lm 0:3:$HOME/lm/news-commentary-v8.fr-en.blm.en:8 \
 -external-bin-dir ~/mosesdecoder/tools >& training.out &
[Moses-support] n-grams in source language
Hello, I know that the language model is built from n-grams of the target language, but where does the process that compares n-grams in the source language with n-grams in the target language take place? That is, at what stage?
[Moses-support] Difference between lexical reordering and distortion?
Hi, in the tutorial (http://www.statmt.org/moses/?n=Moses.Tutorial), the distortion model is said to be responsible for reordering the input, but in the moses.ini file there are separate weights for lexical reordering and the distortion model. How do the two differ? Thank you in advance for your help.
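For reference, the distance-based distortion model is usually described as a penalty on how far the decoder jumps in the source sentence between consecutive phrases, whereas a lexicalized reordering model scores orientations (for example monotone, swap, discontinuous) conditioned on the phrase pair itself. The following is a minimal sketch of the distance-based cost only. It assumes 0-based, inclusive source spans listed in the order they are translated; the function name is hypothetical and this is not Moses code.

```python
def distortion_cost(spans):
    """Sum of |start_i - end_(i-1) - 1| over source spans in translation order.

    spans: list of (start, end) source positions, 0-based and inclusive.
    A fully monotone translation has cost 0.
    """
    cost = 0
    prev_end = -1  # position just before the first source word
    for start, end in spans:
        cost += abs(start - prev_end - 1)
        prev_end = end
    return cost


# Monotone order: every phrase starts right after the previous one ends.
monotone = [(0, 1), (2, 2), (3, 4)]
# Swapped order: words 2-3 are translated before words 0-1.
swapped = [(2, 3), (0, 1)]
print(distortion_cost(monotone), distortion_cost(swapped))
```

The cost depends only on positions, not on which words are being moved; that is the main difference from a lexicalized model, which can learn that particular phrases tend to swap.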
[Moses-support] role of lexical-weighting
Hello, I am not sure what lexical weighting is about. Does it take semantic resemblance into consideration? If so, how? And if not, how does it differ from the usual alignment? How would using the no-lexical-weighting option in training affect the alignments? I apologize that the question is rather broad, but I would greatly appreciate your advice.
[Moses-support] Some of the confusing concepts
Hello, I'm trying to get a complete picture of how Moses works, and here are some of the parts I have not been able to understand fully. I apologize that this is a bit verbose, but I would greatly appreciate it if you could help me better understand Moses and SMT.
1) In GIZA++, what are the order and number of iterations for each model? The order seems to be Model 1 -> Model 2 -> HMM -> Model 3 -> Model 4 by default, but I'm not sure how many iterations of each run by default.
2) In GIZA++, is it right that a source word cannot be aligned to more than one word in the target language? What about the opposite? And can multiple source words be aligned to the same target word, and vice versa? What would happen in an extreme case where the source sentence is only one word and the target sentence is, say, 10 words?
3) From what I've read, it seems that all possible alignments are counted at first, and the alignment probability for each word is calculated from those counts. If so, when |source| < |target|, which source word is likely to get aligned to the empty word? My understanding is that it would be the word with the lowest alignment probability with respect to the target words, and a word with a high fertility probability for n=0.
4) If we opt not to use a reordering table in moses.ini, will the distortion limit be meaningless? Also, in that case, will grammaticality depend only on the language model?
5) If GIZA++ aligns words in both directions, why does it matter which side is source and which is target? Is there a difference in weights? Or is it because of the restriction that a source word can only be aligned to one target word?
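On question 5, a small sketch may help show why the two directional runs are not interchangeable: each run produces its own set of links, and the two sets are then combined. The code below only illustrates the two simplest combinations, intersection and union; heuristics such as grow-diag-final-and lie between them. It assumes each directional alignment is given as a set of index pairs, and the example links are invented for illustration.

```python
def symmetrize(src2tgt, tgt2src):
    """Combine two directional word alignments.

    src2tgt: set of (source_index, target_index) links from one run.
    tgt2src: set of (target_index, source_index) links from the reverse run.
    Returns (intersection, union), both as sets of (source_index, target_index).
    """
    forward = set(src2tgt)
    # Flip the reverse run so both sets use (source_index, target_index).
    backward = {(s, t) for (t, s) in tgt2src}
    return forward & backward, forward | backward


# Hypothetical links for a 3-word source and 3-word target sentence.
src2tgt = {(0, 0), (1, 1), (2, 1)}
tgt2src = {(0, 0), (1, 1), (2, 2)}
intersection, union = symmetrize(src2tgt, tgt2src)
print(sorted(intersection))
print(sorted(union))
```

Links that appear in only one direction survive in the union but not in the intersection, which is why the choice of combination heuristic changes the extracted phrases.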
Re: [Moses-support] using Moses in Monolingual dialogue setting
Thanks for the insights. I've already done approach 2, and the result didn't seem bad to me, so I became curious whether it would have made a significant difference had I chosen the first approach. I was worried that approach 2 might have resulted in over-training, but judging from your comments, I guess it's only a matter of having broader entries. (Or could it have been over-trained?)
> I suppose my main concern would be the inordinate amounts of training data > you would need to get something useful up and running.
This leads me to my next question. I trained my system with about 650k stimulus-response pairs collected from Twitter. Each pair is part of a conversation that consists of 3~10 utterances. For example, suppose we have a conversation with 4 utterances labeled A, B, C, D, where A is the "root" of the conversation, B is the response to A, C is the response to B, and D is the response to C. Following my second approach, A and B, B and C, and C and D are pairs, so the source file will contain A, B, C and the target file will contain B, C, D, making 3 pairs from 1 conversation. In this way, I have 650k pairs from about 80k conversations.
I've seen that when Moses is used for an actual translation task, say German to English, the amount of training data seems fairly low, somewhere around 50K. So my 650k is already much bigger than this. However, in the paper I mentioned, http://aritter.github.io/mt_chat.pdf, the author used about 1.3M pairs, which is twice the size of mine, and I've seen research in a similar setting, http://www.aclweb.org/anthology/P13-1095, which used 4M pairs (!). So, given the unpredictable nature of the monolingual conversation setting, what would you think is the appropriate, or minimum, amount of training data? And how much would the quality of the response-generation task depend on the amount of training data? I know this is an out-of-nowhere question that may be hard to answer, but even a rough guess would greatly assist me. Thank you very much in advance.
> From: jcr...@essex.ac.uk > To: kgim...@cs.cmu.edu > Date: Mon, 9 Dec 2013 17:33:00 + > CC: moses-support@mit.edu > Subject: Re: [Moses-support] using Moses in Monolingual dialogue setting > > I guess if you were to change the subject and ask a question from a list of > well formed common questions if the probability of the response is below some > sensible threshold then you could make a system which fools a user some of > the time. > > James > > > From: moses-support-boun...@mit.edu [moses-support-boun...@mit.edu] on behalf > of Read, James C [jcr...@essex.ac.uk] > Sent: 09 December 2013 17:14 > To: Kevin Gimpel > Cc: moses-support@mit.edu > Subject: Re: [Moses-support] using Moses in Monolingual dialogue setting > > I'm guessing he wants to make a conversational agent that produces a most > likely response based on the stimulus. > > In any case, the distinction between 1 and 2 is probably redundant if GIZA++ > is being used to train in both directions. The two phrase tables could be > merged I guess. I guess the advantage of 2 over 1 is that you don't need to > worry about the merging logic at the cost of more training time. > > I'm not sure I understand the question of A1~B3. Unless I'm reading his > question wrong I don't see how this could happen. > > I suppose my main concern would be the inordinate amounts of training data > you would need to get something useful up and running. > > James > > ____ > From: kgim...@gmail.com [kgim...@gmail.com] on behalf of Kevin Gimpel > [kgim...@cs.cmu.edu] > Sent: 09 December 2013 15:17 > To: Read, James C > Cc: Andrew; moses-support@mit.edu > Subject: Re: [Moses-support] using Moses in Monolingual dialogue setting > > Hi Andrew, it's an interesting idea.. I would guess that it would depend on > what the data look like. 
If the A's and B's are of fundamentally different > type (e.g., they are utterances in an automatic dialogue system, where A's > are always questions and B's are always responses), then approach 2 seems a > bit odd as it will conflate A's and B's utterances. However, if the A's and > B's are just part of a conversation, e.g., in IM chats, then they are of the > same "type" and approach 2 would make sense. In fact, I think approach 2 > would make more sense than approach 1 in that case. It also of course > depends on how you want to use the resulting translation system. > Kevin > > > > On Mon, Dec 9, 2013 at 5:18 AM, Read, James C > mailto:jcr...@essex.ac.uk>> wrote: > Are you trying to figure out the probability of a response given a stimulus? &g
[Moses-support] using Moses in Monolingual dialogue setting
Hi, I'm using Moses in a monolingual dialogue setting, as in http://aritter.github.io/mt_chat.pdf, where source and target are both English and the target is a response to the source. I'd like to propose a small thought experiment in this setting and hear what you think would happen.
Suppose we have a conversation with six utterances, A1, B1, A2, B2, A3, B3, where A and B indicate the speakers and the number indicates the n-th statement by that speaker. They all belong to one conversation on a continuous topic. Now suppose we train Moses on it in two different ways:
1) The source file contains A1, A2, A3 and the target contains B1, B2, B3, so that A1-B1 is a pair, and so on.
2) The source contains A1, B1, A2, B2, A3 and the target contains B1, A2, B2, A3, B3, taking advantage of the fact that each response is a stimulus for the next response.
How will the results differ, and why? Since GIZA++ computes alignments in both directions, will 2) result in any of A1~B3 being the translation of any other? This may be a strange question, but I would really like to get your insight.
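To make the two setups concrete, here is a minimal sketch of how the source/target pairs could be produced from one conversation. It assumes the utterances are stored in order and that the speakers strictly alternate, starting with A; both function names are hypothetical.

```python
def pairs_by_speaker(utterances):
    """Approach 1: only A -> B pairs (A1-B1, A2-B2, ...)."""
    return [
        (utterances[i], utterances[i + 1])
        for i in range(0, len(utterances) - 1, 2)
    ]


def pairs_chained(utterances):
    """Approach 2: every utterance is paired with the one that follows it."""
    return list(zip(utterances, utterances[1:]))


conversation = ["A1", "B1", "A2", "B2", "A3", "B3"]
print(pairs_by_speaker(conversation))
print(pairs_chained(conversation))
```

Writing the first element of each pair to the source file and the second to the target file, one per line, gives 3 training pairs under approach 1 and 5 under approach 2 for this conversation.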
[Moses-support] removing identical pairs in phrase table
Hi, I'm using Moses in a dialogue setting, where both source and target are English. This results in the input being parroted back, so I need to remove phrase pairs where one phrase is a substring of the other. That requires editing the phrase table after GIZA++ but before Moses. Is there a way to automate this? If not, how can I access the phrase table generated by GIZA++ and run Moses on it after editing? Thank you in advance for your time.
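One way such a filter could be sketched, assuming the plain-text phrase table has one entry per line with fields separated by " ||| ", source phrase first and target phrase second. The function names are hypothetical, and containment is checked on whole tokens so that "he" is not treated as part of "the". A compressed table would need to be decompressed first, and the filtered table would still have to be written back to wherever moses.ini points.

```python
def is_token_substring(short, long):
    """True if the token list `short` occurs contiguously inside `long`."""
    n = len(short)
    return any(long[i:i + n] == short for i in range(len(long) - n + 1))


def keep_pair(line):
    """Keep an entry unless one side is contained in the other."""
    fields = line.split(" ||| ")
    src = fields[0].split()
    tgt = fields[1].split()
    return not (is_token_substring(src, tgt) or is_token_substring(tgt, src))


def filter_phrase_table(lines):
    """Return only the entries that pass keep_pair, in their original order."""
    return [line for line in lines if keep_pair(line)]


# Invented entries for illustration; scores are placeholders.
table = [
    "how are you ||| how are you ||| 0.5 0.5 0.5 0.5",
    "how are you ||| fine thanks ||| 0.2 0.1 0.3 0.1",
    "are you ||| how are you ok ||| 0.1 0.1 0.1 0.1",
]
for entry in filter_phrase_table(table):
    print(entry)
```

Identical pairs are removed as a special case of containment, since a phrase is trivially contained in itself.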
[Moses-support] (unsolved) Moses eating up too much memory
I apologize that my previous question on the same topic was not specific enough. I'm following the steps in http://www.statmt.org/moses/?n=Moses.Baseline exactly, and when I run Moses with the following command:
~/mosesdecoder/bin/moses -f ~/working/mert-work/moses.ini
(often with the -output-hypo-score option) it starts eating up RAM until none is left and then starts using hard disk space as swap, while it says it is reading from the phrase table. It went up to 17GB of swap and pretty much stopped. I binarised the phrase table and reordering table, and these are the sizes of the binarised files:
-rw-r--r-- 1 ipadmin staff 856K Oct 28 10:45 phrase-table.binphr.idx
-rw-r--r-- 1 ipadmin staff 140M Oct 28 10:45 phrase-table.binphr.srctree.wa
-rw-r--r-- 1 ipadmin staff 1.6M Oct 28 10:45 phrase-table.binphr.srcvoc
-rw-r--r-- 1 ipadmin staff 1.2G Oct 28 10:45 phrase-table.binphr.tgtdata.wa
-rw-r--r-- 1 ipadmin staff 1.5M Oct 28 10:45 phrase-table.binphr.tgtvoc
-rw-r--r-- 1 ipadmin staff 856K Oct 28 11:00 reordering-table.binlexr.idx
-rw-r--r-- 1 ipadmin staff 1.3G Oct 28 11:00 reordering-table.binlexr.srctree
-rw-r--r-- 1 ipadmin staff 916M Oct 28 11:00 reordering-table.binlexr.tgtdata
-rw-r--r-- 1 ipadmin staff 1.6M Oct 28 11:00 reordering-table.binlexr.voc0
-rw-r--r-- 1 ipadmin staff 1.5M Oct 28 11:00 reordering-table.binlexr.voc1
I have also attached the moses.ini that I tried to run. I'm quite sure this is not how it should behave, but I can't figure out what is wrong, since I followed the tutorial above (and it has worked before). Please let me know if there is any additional information I should provide. Thank you very much for your time.
[Moses-support] multi-threaded tuning not working
Hi, I've run tuning without the multi-core option, and it took about 20 hours for input files of 650,000 lines and a tuning reference of 5,000 lines. I then tried tuning with the multi-core option as
--decoder-flags="-threads 4"
but it doesn't start anything. I have a MacBook Pro with a 2.4 GHz dual-core i5. Is there any known issue with this option? Thank you in advance for your help.
[Moses-support] getting WER metrics
Hello, sorry to ask another question. I've computed BLEU scores in the past following the baseline tutorial, but is there a way to also get WER given a reference text? Thank you very much for your help.
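Independently of any Moses tooling, WER for a single sentence can be computed directly as the word-level edit distance between hypothesis and reference, divided by the reference length. The following is a minimal sketch under that definition, assuming whitespace tokenization; the function name is hypothetical and this is not a Moses script.

```python
def word_error_rate(reference, hypothesis):
    """Word-level Levenshtein distance divided by the reference length."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")
    # prev[j] holds the edit distance between the reference words processed
    # so far (before the current one) and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]  # distance from i reference words to an empty hypothesis
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1      # reference word missing in hypothesis
            insertion = curr[j - 1] + 1  # extra word in hypothesis
            curr.append(min(substitution, deletion, insertion))
        prev = curr
    return prev[-1] / len(ref)


print(word_error_rate("the cat sat on the mat", "the cat sat on mat"))
```

For a whole test set, the usual convention is to sum the edit distances over all sentences and divide by the total number of reference words, rather than averaging per-sentence rates.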
[Moses-support] Running moses fills up hard disk space..
Hello, I'm running Moses after binarising the phrase table, and while it is reading from the phrase table it rapidly fills up hard disk space until none is left. I had 21GB free and it wasn't enough. Once I quit the terminal, the space is freed again. I have run Moses successfully before, so I don't understand why this happens. Is this normal? If so, how much free space do I need? If not, what could have gone wrong? Thank you very much in advance for your help.
[Moses-support] a few supposedly simple questions..
Dear support team, thank you for your previous reply, which worked for me. I have a few questions that I think should be simple, but I couldn't find the relevant information on the website.
1) When you run Moses and type in a sentence, is there any way to get the translation together with its probability?
2) Also, when you run Moses and type in a sentence, is there a way to get not just one translation but the N-best candidates (preferably with their probabilities, as in the first question)?
3) I've computed a BLEU score using Moses, but is there a way to also get the word error rate against a reference?
4) After the cleaning step, Moses shows the number of lines in the input and output text files, and I noticed that the number of lines in the output file decreased by about 5%, so the input and output line counts no longer match. Judging by the translation results it seems to have worked fine, but it concerns me. Why does this happen, and does it affect the line alignment between input and output and the training process?
I truly appreciate your help in advance.
Best, Andrew
[Moses-support] question
Hello, I'm following the baseline system page (http://www.statmt.org/moses/?n=Moses.Baseline) and am currently at the "Training the Translation System" part. I executed the command as in the tutorial, but the process finished instantly instead of taking the expected 1~2 hours, and instead of a working/train/model folder with a moses.ini file in it, I have a working/train/corpus folder with files named en.vcb.classes, fr.vcb.classes, en.vcb.classes.cats, fr.vcb.classes.cats, en.vcb, fr.vcb, en-fr-int-train.snt, fr-en-int-train.snt. (By the way, I had to repeat the command a few times to get these files. On the first run there were only two files, the next time two more, and so on.) So something is being done, but I'm not getting a moses.ini file. I've attached my training.out in case it helps. This might be a tedious question to look through, but please help me out. Thank you very much.
Sincerely, Andrew
[Moses-support] error on training
Hello, I'm a novice to Moses, so I'm following the Baseline System page (http://www.statmt.org/moses/?n=Moses.Baseline) and got stuck at the "Training the Translation System" part. When I enter the command starting with "nohup nice " as in the instructions, I only get a file named training.out and no new directory. The training.out file says:
Using SCRIPTS_ROOTDIR: /Users/ipadmin/Desktop/mosesdecoder/scripts
Using single-thread GIZA
ERROR: use --f to specify foreign language at /Users/ipadmin/Desktop/mosesdecoder/scripts/training/train-model.perl line 368.
So I then tried the approach on the Training page (http://www.statmt.org/moses/?n=FactoredTraining.HomePage), adjusting the corpus name (euro) and the languages:
train-model.perl -root-dir . --corpus corpus/euro --f de --e en
and I get the following errors:
Using SCRIPTS_ROOTDIR: /Users/ipadmin/Desktop/mosesdecoder/scripts
Use of uninitialized value $_EXTERNAL_BINDIR in concatenation (.) or string at /Users/ipadmin/Desktop/mosesdecoder/scripts/training/train-model.perl line 236.
Use of uninitialized value $_EXTERNAL_BINDIR in concatenation (.) or string at /Users/ipadmin/Desktop/mosesdecoder/scripts/training/train-model.perl line 237.
Use of uninitialized value $_EXTERNAL_BINDIR in concatenation (.) or string at /Users/ipadmin/Desktop/mosesdecoder/scripts/training/train-model.perl line 244.
Use of uninitialized value $_EXTERNAL_BINDIR in concatenation (.) or string at /Users/ipadmin/Desktop/mosesdecoder/scripts/training/train-model.perl line 245.
Use of uninitialized value $_EXTERNAL_BINDIR in concatenation (.) or string at /Users/ipadmin/Desktop/mosesdecoder/scripts/training/train-model.perl line 247.
Using single-thread GIZA
Use of uninitialized value $_EXTERNAL_BINDIR in concatenation (.) or string at /Users/ipadmin/Desktop/mosesdecoder/scripts/training/train-model.perl line 335.
ERROR: Cannot find mkcls, GIZA++/mgiza, & snt2cooc.out/snt2cooc in .
You MUST specify the parameter -external-bin-dir at /Users/ipadmin/Desktop/mosesdecoder/scripts/training/train-model.perl line 335.
I'm working on Mac OS X 10.7 and have successfully installed Moses, its prerequisites, and GIZA++. I know this might be a stupid-user problem, but I would greatly appreciate any help. Thank you for your time.
Sincerely, Andrew
Re: [Moses-support] pruning phrase tables that have more than one factor
Hi,

I have had some success pruning phrase-table.0-0, which consists of just one factor. When I prune phrase-table.0,1,2-0,1,2, which consists of three factors (surface, lemma and POS), my pruned output is empty.

Example of an entry in phrase-table.0,1,2-0,1,2:

    Council|council|NN or|or|CC ||| consejo|consejo|NCMS000 y|y|CC que|que|CS ||| 0.17 0.00308919 0.5 0.00180623 2.718 ||| 0-0 1-1 ||| 62 1

The source and target files that I currently use as input to IndexSA.O32 are the text of the parallel corpus from which these phrase tables were built. Example from the source file:

    Please let this not be yet another sector where we subsequently have to lament the lack of enforcement.

Does anyone know what my source and target input should look like to prune a table like phrase-table.0,1,2-0,1,2?

Thanks
Andrew

On Thu, Jul 25, 2013 at 11:35 AM, Andrew Vine wrote:
> Thanks, will do
>
> On Wed, Jul 24, 2013 at 3:46 PM, Hieu Hoang wrote:
>> Probably
>>
>> Try it and let us know
>>
>> On 23 July 2013 21:44, Andrew Vine wrote:
>>> Hi,
>>>
>>> I would like to prune some phrase tables following the method described
>>> here..
>>> http://www.statmt.org/moses/?n=Moses.AdvancedFeatures#ntoc19
>>>
>>> Could anyone tell me if I could still use filter-pt to prune even if my
>>> phrase table has more than one factor?
>>>
>>> Many thanks
>>> Andrew
>>
>> --
>> Hieu Hoang
>> Research Associate
>> University of Edinburgh
>> http://www.hoang.co.uk/hieu
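For readers following this thread, here is a minimal C++ sketch of how an entry like the one above breaks down into columns and factors. It is not taken from Moses or from filter-pt; the helper names SplitOn and ParseFactoredPhrase are hypothetical, and the sketch only assumes what the example entry shows: columns separated by "|||", tokens separated by whitespace, and factors separated by "|". Whether the suffix-array input should carry all factors or only one of them is not settled in this thread.

```cpp
#include <cassert>
#include <sstream>
#include <string>
#include <vector>

// Split `s` on every occurrence of the (possibly multi-character) delimiter.
// Hypothetical helper, not part of Moses.
std::vector<std::string> SplitOn(const std::string& s, const std::string& delim) {
  std::vector<std::string> parts;
  std::size_t start = 0;
  while (true) {
    const std::size_t pos = s.find(delim, start);
    if (pos == std::string::npos) {
      parts.push_back(s.substr(start));
      break;
    }
    parts.push_back(s.substr(start, pos - start));
    start = pos + delim.size();
  }
  return parts;
}

// Turn one phrase column into tokens, and each token into its factors.
// For "Council|council|NN or|or|CC" this yields two tokens of three factors.
std::vector<std::vector<std::string> > ParseFactoredPhrase(const std::string& phrase) {
  std::vector<std::vector<std::string> > tokens;
  std::istringstream in(phrase);
  std::string token;
  while (in >> token) {  // operator>> skips the padding around each column
    tokens.push_back(SplitOn(token, "|"));
  }
  return tokens;
}
```

Calling SplitOn(line, "|||") on a phrase-table line gives the source phrase, target phrase, scores, alignment and counts as separate columns; ParseFactoredPhrase can then be applied to the first two columns to compare their token shape with the corpus text.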
Re: [Moses-support] pruning phrase tables that have more than one factor
Thanks, will do

On Wed, Jul 24, 2013 at 3:46 PM, Hieu Hoang wrote:
> Probably
>
> Try it and let us know
>
> On 23 July 2013 21:44, Andrew Vine wrote:
>> Hi,
>>
>> I would like to prune some phrase tables following the method described
>> here..
>> http://www.statmt.org/moses/?n=Moses.AdvancedFeatures#ntoc19
>>
>> Could anyone tell me if I could still use filter-pt to prune even if my
>> phrase table has more than one factor?
>>
>> Many thanks
>> Andrew
>
> --
> Hieu Hoang
> Research Associate
> University of Edinburgh
> http://www.hoang.co.uk/hieu
[Moses-support] pruning phrase tables that have more than one factor
Hi,

I would like to prune some phrase tables following the method described here:
http://www.statmt.org/moses/?n=Moses.AdvancedFeatures#ntoc19

Could anyone tell me whether I can still use filter-pt to prune if my phrase table has more than one factor?

Many thanks
Andrew
[Moses-support] syntax based parsing of dependency tree
Hi,

I would like to use a dependency tree as input when training. For example, the dependency tree that I get from freeling for English looks like this:

input: What is the time?

output:

    wh-pro/top/(What what WP -) [
      claus/modnorule/(is be VBZ -) [
        sn-chunk/dobj/(time time NN -) [
          DT/det/(the the DT -)
        ]
        in-brk/ta/(? ? Fit -)
      ]
    ]

Previously I passed a shallow parsed tree to freeling. All the tokens were terminals, so I had no problem converting a shallow tree to the XML format stipulated in http://www.statmt.org/moses/?n=Moses.SyntaxTutorial#ntoc22

However, how do I send a tree like the one above to Moses, where tokens are attached to parent nodes? Currently I am using the span attribute, so I am trying:

    What is time the ?

But I am unsure if this is the way to go. If anyone has some suggestions, as always it would be much appreciated.

Regards
Andrew Vine
Re: [Moses-support] multiple factors in tree-to-tree models
Thanks for your reply.

I know you said that you have not tried this before, but could you suggest what additional command line options I should use for tree-to-tree with factors?

At the moment I am using --source-syntax, --target-syntax and --glue-grammar.

Would I now need to add the following parameters, with their arguments depending on my factors?

    --translation-factors
    --generation-factors
    --decoding-steps

Regards
Andrew

On Tue, May 28, 2013 at 6:14 AM, Hieu Hoang wrote:
> i believe you can. I haven't tried it but if you run into any
> difficulties, let me know.
>
> You should specify factors in the normal way, for terminals and
> non-terminals, eg.
>    the|DET
>
> On 27 May 2013 20:38, Andrew Vine wrote:
>> Hi,
>>
>> Is it possible to train a tree-to-tree based model with factors?
>> Eg.. Say I wanted to train using the lemma as an additional factor..
>>
>> Would this work?
>> the | the
>> cats | cat
>>
>> Regards
>> Andrew
>
> --
> Hieu Hoang
> Research Associate
> University of Edinburgh
> http://www.hoang.co.uk/hieu
[Moses-support] multiple factors in tree-to-tree models
Hi,

Is it possible to train a tree-to-tree based model with factors? E.g., say I wanted to train using the lemma as an additional factor.

Would this work?

    the | the
    cats | cat

Regards
Andrew
Re: [Moses-support] StaticData distortion model format check
> that's an annoying problem. I'm thinking of transitioning the ini file
> to xml, then these issues will disappear. However, other things need
> to be done before that can happen.

Let me know when you do. I will be looking forward to it.

> If you really need to use spaces, you might want to delimit lines in
> the ini file with tabs instead of spaces. then change the call to
> Tokenize to:
>
> vector<string> spec = Tokenize(fileStr[f], "\t");

That will work! I considered a few other changes, but this is much simpler.

Andrew
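To illustrate why the tab delimiter helps, here is a small self-contained C++ sketch. TokenizeSketch is a hypothetical stand-in written for this example, not the Moses Tokenize function, and the sample specification line (including the path with spaces) is made up. The sketch only assumes that a tokenizer splits on any character in the delimiter string and skips empty fields.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Stand-in tokenizer (NOT the Moses implementation): splits `str` on any
// character found in `delims` and drops empty tokens.
std::vector<std::string> TokenizeSketch(const std::string& str,
                                        const std::string& delims) {
  std::vector<std::string> tokens;
  std::size_t start = str.find_first_not_of(delims);
  while (start != std::string::npos) {
    const std::size_t end = str.find_first_of(delims, start);
    // When `end` is npos, substr simply takes the rest of the string.
    tokens.push_back(str.substr(start, end - start));
    start = str.find_first_not_of(delims, end);
  }
  return tokens;
}
```

With a tab-separated line such as "0-0\tmsd-bidirectional-fe\t6\t/sand box/my model/reordering-table", TokenizeSketch(line, "\t") yields the expected four fields with the path intact, whereas splitting on spaces as well would break the path into several pieces and trip the size check.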
[Moses-support] StaticData distortion model format check
StaticData checks the number of entries under the [distortion-file] parameter by splitting each line on spaces:

    vector<string> spec = Tokenize(fileStr[f], " ");
    ++f; //mark file as consumed
    if(4 != spec.size()){
      //wrong file specification string...
      std::cerr << "Wrong Lexical Reordering Model Specification for model " << i << "!\n";
      return false;
    }

My file is located in a directory whose path contains spaces. I can't just change it, because the directory lives in a sandbox whose location I have no control over.

Any suggestions on how to format the ini file so that it contains no spaces but still points to the correct directory?

Thanks.
Andrew
[Moses-support] Error in Parameter
The function ReadConfigFile will always return true, so this code will never be called:

    if (!ReadConfigFile(configPath)) {
      UserMessage::Add("Could not read "+configPath);
      return false;
    }

Andrew
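One way the check above could become meaningful is for the reader function to report failure when the file cannot be opened. The following C++ sketch shows the idea only; ReadConfigFileSketch and its signature are hypothetical and are not the actual Moses Parameter::ReadConfigFile.

```cpp
#include <cassert>
#include <fstream>
#include <string>
#include <vector>

// Hypothetical reader: appends every line of the file to `lines`.
// Returns false if the file cannot be opened, so that a caller's
// "if (!ReadConfigFile(...))" branch is actually reachable.
bool ReadConfigFileSketch(const std::string& path,
                          std::vector<std::string>& lines) {
  std::ifstream in(path.c_str());
  if (!in.is_open()) {
    return false;  // report the failure instead of silently succeeding
  }
  std::string line;
  while (std::getline(in, line)) {
    lines.push_back(line);
  }
  return true;
}
```

A caller can then keep the existing pattern: test the return value, add a user message naming the path, and return false.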
Re: [Moses-support] Moses on the iPhone
As some of you may remember, I started this thread about a month or so ago. I managed to cross compile Moses (i386, armv6), and I will be happy to start a page on the wiki in the near future. I am now working on an Objective-C++ implementation of moses-cmd for use in my application.

I am trying to get libmoses.a into an Xcode project and I am having all sorts of trouble. What linker flags are needed for libmoses? I looked at the moses-cmd build to see what it was using, but some of the flags appeared to be extraneous.

Also, does anyone have a background in Objective-C++? It can be kind of gross.
Re: [Moses-support] Moses on the iPhone
It appears the svn has moved. Should be:

    svn co http://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk mosesdecoder

--
Sláinte
Re: [Moses-support] Moses on the iPhone
Thanks Hieu,

I will look into all of the advice you have given. After looking at your page for the OLPC, it looks like you have chosen to offload the work to the server. I believe this was something we were trying to avoid, but I may have to reconsider. It would simplify some things (while making other things more interesting, of course...).

Thanks again for your input.

Andrew

On Tue, 12 Jan 2010 12:37:53 +, Hieu Hoang wrote:
> hi andrew
>
> some of us have been working on putting moses onto the OLPC
> http://wiki.laptop.org/go/Projects/Automatic_translation_software
> which has roughly the same resources as an iphone. We've got it working
> for reasonable size models
>
> my advice would be:
>
> 1. The moses-cmd shows you how to interact with the moses library. For
>    normal decoding, it's quite simple. To make it even more simple for the
>    gui developers, I would create a static library as a replacement for
>    moses-cmd. Call the static library functions from your gui, rather than
>    the moses functions directly
> 2. from what i know of ARM development, there are compiler switches to
>    enable fast floating point operations. Make sure these are enabled.
> 3. the moses library assumes lots of memory so caches certain objects.
>    Look through this mailing list to see how to turn caching off.
> 4. Iphone apps can't run in the background so it would be best to have
>    instant loading. This is not the case with any of our models, which can
>    take some time to initialize. Specifically the phrase table and language
>    models. You may have to write new implementations for them.
> 5. There may be little-endian/big-endian issues with the binary phrase
>    tables & language models. i.e you may not be able to create a binary
>    phrase table/LM on your desktop and expect it to work on the iphone.
>
> i think it's definitely doable, but don't expect just to be able to
> compile & go
>
> sounds like a fun project, let us know how it goes.
>
> On 11/01/2010 17:57, Andrew W. Haddad wrote:
>> Hello,
>>
>> My name is Andrew Haddad. I am a Graduate Research Assistant at Purdue
>> University. I have been given the task of getting moses working on the
>> iphone. The moses package, which we have successfully installed and have
>> running in simulation on the iphone will of course not work due to some
>> limitations put forth by Apple. I am going to be forced to cross compile
>> the moses static library, used in moses-cmd, for the arm and i386
>> architecture. And then rewrite the functionality of moses-cmd to be used
>> in our application. Do you know of anyone who has attempted something
>> similar, that might be able to explain the process?
>>
>> --
>> Sláinte
>> Andrew W. Haddad

--
Sláinte
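On point 5 of the quoted advice, a quick way to find out whether two platforms disagree on byte order is to check it at run time, and a portable way to sidestep the question is to read binary integers byte by byte. The C++ sketch below shows both ideas under the assumption that a binary file stores 32-bit values in little-endian order; the function names are hypothetical, and nothing here reflects the actual Moses binary phrase table or language model formats.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

// True if the machine running this code stores the least significant
// byte of an integer first.
bool IsLittleEndian() {
  const std::uint32_t probe = 1;
  unsigned char first = 0;
  std::memcpy(&first, &probe, 1);  // inspect the first byte in memory
  return first == 1;
}

// Reverse the byte order of a 32-bit value.
std::uint32_t ByteSwap32(std::uint32_t v) {
  return ((v & 0x000000FFu) << 24) | ((v & 0x0000FF00u) << 8) |
         ((v & 0x00FF0000u) >> 8) | ((v & 0xFF000000u) >> 24);
}

// Read a 32-bit value stored as little-endian bytes. This gives the same
// result on any host, because it never reinterprets memory as an integer.
std::uint32_t ReadLittleEndian32(const unsigned char* bytes) {
  return static_cast<std::uint32_t>(bytes[0]) |
         (static_cast<std::uint32_t>(bytes[1]) << 8) |
         (static_cast<std::uint32_t>(bytes[2]) << 16) |
         (static_cast<std::uint32_t>(bytes[3]) << 24);
}
```

Running IsLittleEndian() on both the desktop and the device shows whether a file written by dumping raw integers on one could be misread on the other; if the two agree, byte order is unlikely to be the cause of a loading problem.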
[Moses-support] Moses on the iPhone
Hello,

My name is Andrew Haddad. I am a Graduate Research Assistant at Purdue University. I have been given the task of getting Moses working on the iPhone. The Moses package, which we have successfully installed and have running in simulation on the iPhone, will of course not work on the device due to some limitations put forth by Apple. I am going to be forced to cross compile the Moses static library, used in moses-cmd, for the ARM and i386 architectures, and then rewrite the functionality of moses-cmd for use in our application. Do you know of anyone who has attempted something similar and might be able to explain the process?

--
Sláinte
Andrew W. Haddad