moses-parallel.pl and mert-moses.pl have been changed. They now work well with lattice inputs, too.
Notice that you do NOT need to specify -decoder-flags "-inputtype 2"; the parameter --inputtype 2 of mert-moses.pl is passed to the decoder automatically.

best,
Nicola

________________________________________
From: [EMAIL PROTECTED] [EMAIL PROTECTED] On Behalf Of ThuyLinh Nguyen [EMAIL PROTECTED]
Sent: Wednesday, February 27, 2008 5:16 PM
To: moses-support@mit.edu
Subject: [Moses-support] [Fwd: Run mert-moses.pl with confusion network]

Hello,
just another mistake, mert-moses.pl can't find the phrasetable in binary format but if run translation without mert, it works
here is the error:

perl mert-moses.pl ../../sstmorph/dev.ar.lattice ../../dev.en.process ../../../moses-cmd/src/moses ./moses.ini --decoder-flags "-inputtype 2" --inputtype 2 --rootdir /nfs/islpc3_13/linh/Programs/mosesdecoder/scripts --no-filter-phrase-table
After default: -l mem_free=0.5G -hard
Using SCRIPTS_ROOTDIR: /nfs/islpc3_13/linh/Programs/mosesdecoder/scripts
checking weight-count for ttable-file
moses.ini:15:File does not exist or empty: /SMT/Workplace/Linh/IWSLT_0802/gold_morph_space/model/phrase-table.bin
checking weight-count for lmodel-file
SYNC distortionExit 1

but if I run without mer, it works

head -2 ../../sstmorph/dev.ar.lattice | ../../../moses-cmd/src/moses -f ./moses.ini -inputtype 2

Thanks
Linh

Chris Dyer wrote:
> I'll update that- the inputtype should be "2" for lattices...
> Chris
>
> On Wed, Feb 27, 2008 at 4:39 AM, ThuyLinh Nguyen <[EMAIL PROTECTED]> wrote:
>
>> Hi Chris,
>> Thanks for clarification, so the lattice format is different with confusion network format
>> but in moses binary, there are only two options for inputtype: -inputtype: text (0) or confusion network (1)
>>
>> It doesn't recognize the lattice format input.
>> This is an example of lattice translation error:
>>
>> echo "((('A',1.0,1),),(('B',1.0,1),),)" | moses -f moses.ini -inputtype 1
>> Defined parameters (per moses.ini or switch):
>> config: moses.ini
>> distortion-limit: 6
>> input-factors: 0
>> inputtype: 1
>> lmodel-file: 0 0 3 /SMT/Workplace/Linh/IWSLT_0802/train.en.srilm
>> mapping: 0 T 0
>> ttable-file: 0 0 5 /SMT/Workplace/Linh/IWSLT_0802/gold_morph_space/model/phrase-table.bin
>> ttable-limit: 20 0
>> weight-d: 0.6
>> weight-l: 0.5000
>> weight-t: 0.2 0.2 0.2 0.2 0.2
>> weight-w: -1
>> Loading lexical distortion models...
>> have 0 models
>> Start loading LanguageModel /SMT/Workplace/Linh/IWSLT_0802/train.en.srilm : [0.000] seconds
>> Finished loading LanguageModels : [0.000] seconds
>> Start loading PhraseTable /SMT/Workplace/Linh/IWSLT_0802/gold_morph_space/model/phrase-table.bin : [0.000] seconds
>> using binary phrase tables for idx 0
>> reading bin ttable
>> size of OFF_T 8
>> binary phrasefile loaded, default OFF_T: -1
>> Finished loading phrase tables : [0.000] seconds
>> IO from STDOUT/STDIN
>> Created input-output object : [0.000] seconds
>> read confusion net with format 0
>> End. : [0.000] seconds
>> confusion net statistics:
>> created: 1
>> destroyed: 1
>> succ. read: 0
>> columns: 0
>> words: 0
>> avg. word/column: nan
>> avg. cols/sent: nan
>>
>> Let me know if I made mistake somewhere.
>>
>> Thanks
>> Linh
>>
>> Chris Dyer wrote:
>>
>> I am still confused about the lattice format,
>> In your examples:
>>
>> 1 ((('A',1.0,1),),(('B',1.0,1),),)
>> 2 ((('A',1.0,1),('Z',1.0,2),),(('B',1.0,1),),(('C',1.0,1),),)
>>
>> Can I interpret it as:
>> from node 0 to node 1 there are 2 lattices: (('A',1.0,1),) and (('B',1.0,1),)
>>
>> Each entire lattice is encoded on a single line. In line 1, there are
>> two arcs from node 0 to node 1, 'A' and 'B'. The 1.0 is the cost of
>> the arc and the "1" is the length of the arc (measured in nodes).
>> In line two, node 0 has two arcs, arc 'A' that goes to node 1 and arc 'Z'
>> that goes to node 2. Node 1 has a single arc, 'B', that goes to node
>> 2. Node 2 has a single arc 'C' that goes to 3.
>>
>> And also what are the meaning of number 1.0 and 1, 2 there? where can I put the lattice probabilities?
>> Is it possible to add an empty lattice (so that the decoder skip a word)?
>>
>> Currently, moses only lets you specify a single cost for an arc, and
>> it is actually treated as a probability (the decoder sees it as
>> -log(p) -- you can change this in WordLattice.cpp if you want to deal
>> with more conventional costs, but the rest of the inputs to the
>> decoder are given as probabilities so I wanted to be consistent). If
>> you want a null transition, set the arc label to '*eps*' and the
>> decoder will treat this as a null.
>>
>> --Chris
>>
>> Linh
>>
>> Chris Dyer wrote:
>>
>> >Also, if you are using general lattices (as opposed to regular
>> >confusion networks) as input, you should update to the latest version
>> >of the decoder from Subversion, since I checked in a fairly crucial
>> >bug fix yesterday.
>> >
>> >Chris
>> >
>> >On Wed, Feb 20, 2008 at 4:37 PM, Chris Dyer <[EMAIL PROTECTED]> wrote:
>> >
>> >>The lattice format isn't documented yet on the webpage, but you can
>> >> see some examples of it in the lattice-distortion test directory Hieu
>> >> mentions. It should be fairly straightforward to decipher. Since
>> >> this format encodes a single lattice/CN per line of text, it can be
>> >> used easily with MER training.
>> >>
>> >> Chris
>> >>
>> >> On Wed, Feb 20, 2008 at 4:30 PM, Hieu Hoang <[EMAIL PROTECTED]> wrote:
>> >> > chao anh/chi linh
>> >> >
>> >> > i'm not sure if anyone answered your question and i'm prob not the best
>> >> > person to answer question on lattice/confusion net input. to my knowledge,
>> >> > mert should run fine with these input types.
>> >> >
>> >> > perhaps you can find an example of the lattice input format from the
>> >> > regression test :
>> >> >
>> >> > http://mosesdecoder.svn.sourceforge.net/viewvc/mosesdecoder/trunk/regression-testing/tests/
>> >> >
>> >> > ThuyLinh Nguyen <[EMAIL PROTECTED]> wrote:
>> >> >
>> >> > -------- Original Message --------
>> >> > Subject: Run mert-moses.pl with confusion network
>> >> > Date: Sat, 16 Feb 2008 21:33:44 -0500
>> >> > From: ThuyLinh Nguyen
>> >> > To: moses-support@mit.edu
>> >> >
>> >> > Hello,
>> >> > I want to run mer for a development set which is the output of other
>> >> > translation job.
>> >> > therefore the development input is a set of lattices. Are there anyway
>> >> > to run MER with lattice input and if so how can i represent the lattice
>> >> > of multiple sentences?
>> >> > Thank you
>> >> > Linh
>> >> >
>> >> > _______________________________________________
>> >> > Moses-support mailing list
>> >> > Moses-support@mit.edu
>> >> > http://mailman.mit.edu/mailman/listinfo/moses-support
>> >> >
>> >> > Hieu Hoang
>> >> > http//www.hoang.co.uk/hieu
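
A note on the lattice lines quoted in this thread: each one happens to be a valid Python tuple literal, so a few lines of Python can list the arcs. This is only a minimal sketch, assuming the reading Chris gives for line two (each top-level element holds the outgoing arcs of one node, and each arc is label, cost, distance in nodes). The function name plf_arcs is hypothetical and is not part of Moses.

```python
import ast


def plf_arcs(line):
    """Return (from_node, to_node, label, cost) for every arc on one lattice line.

    Assumption: top-level element i lists the arcs leaving node i, and the
    third field of an arc is how many nodes ahead its target lies.
    """
    lattice = ast.literal_eval(line.strip())  # parses literals only, runs no code
    arcs = []
    for node, outgoing in enumerate(lattice):
        for label, cost, distance in outgoing:
            arcs.append((node, node + distance, label, cost))
    return arcs


if __name__ == "__main__":
    # Line two from the examples quoted in the thread.
    example = "((('A',1.0,1),('Z',1.0,2),),(('B',1.0,1),),(('C',1.0,1),),)"
    for arc in plf_arcs(example):
        print(arc)
```

Under this reading, line one of the examples comes out as a two-arc chain, 'A' from node 0 to node 1 and 'B' from node 1 to node 2. An arc labelled '*eps*' would be listed like any other arc; the sketch does not treat it specially.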