Re: [Moses-support] [Fwd: Run mert-moses.pl with confusion network]

ThuyLinh Nguyen Wed, 27 Feb 2008 01:39:42 -0800


Hi Chris,

Thanks for the clarification. So the lattice format is different from the confusion network format, but in the moses binary there are only two options for -inputtype: text (0) or confusion network (1).


It doesn't recognize the lattice format input.
This is an example of lattice translation error:

echo "((('A',1.0,1),),(('B',1.0,1),),)" | moses -f moses.ini -inputtype 1
Defined parameters (per moses.ini or switch):
       config: moses.ini
       distortion-limit: 6
       input-factors: 0
       inputtype: 1
       lmodel-file: 0 0 3 /SMT/Workplace/Linh/IWSLT_0802/train.en.srilm
       mapping: 0 T 0
       ttable-file: 0 0 5 /SMT/Workplace/Linh/IWSLT_0802/gold_morph_space/model/phrase-table.bin
       ttable-limit: 20 0
       weight-d: 0.6
       weight-l: 0.5000
       weight-t: 0.2 0.2 0.2 0.2 0.2
       weight-w: -1
Loading lexical distortion models...
have 0 models
Start loading LanguageModel /SMT/Workplace/Linh/IWSLT_0802/train.en.srilm : [0.000] seconds
Finished loading LanguageModels : [0.000] seconds
Start loading PhraseTable /SMT/Workplace/Linh/IWSLT_0802/gold_morph_space/model/phrase-table.bin : [0.000] seconds
using binary phrase tables for idx 0
reading bin ttable
size of OFF_T 8
binary phrasefile loaded, default OFF_T: -1
Finished loading phrase tables : [0.000] seconds
IO from STDOUT/STDIN
Created input-output object : [0.000] seconds
read confusion net with format 0
End. : [0.000] seconds
confusion net statistics:
created:       1
destroyed:     1
succ. read:    0
columns:       0
words: 0
avg. word/column:      nan
avg. cols/sent:        nan


Let me know if I made a mistake somewhere.

Thanks
Linh



Chris Dyer wrote:

 I am still confused about the lattice format,
 In your examples:

 1 ((('A',1.0,1),),(('B',1.0,1),),)
 2 ((('A',1.0,1),('Z',1.0,2),),(('B',1.0,1),),(('C',1.0,1),),)

 Can I interpret it as:
 from node 0 to node 1 there are 2 lattices: (('A',1.0,1),) and
 (('B',1.0,1),)

Each entire lattice is encoded on a single line.  In line 1, node 0 has
a single arc, 'A', that goes to node 1, and node 1 has a single arc,
'B', that goes to node 2.  The 1.0 is the cost of the arc and the "1" is
the length of the arc (measured in nodes).  In line 2, node 0 has two
arcs: arc 'A', which goes to node 1, and arc 'Z', which goes to node 2.
Node 1 has a single arc, 'B', that goes to node 2.  Node 2 has a single
arc, 'C', that goes to node 3.
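
To make the node/arc reading concrete, here is a minimal Python sketch that
lists the arcs of the second example lattice. It assumes a lattice line can be
read as a nested Python tuple literal, which holds for the examples in this
thread but is not stated as a guarantee of the format. The function name
lattice_arcs is hypothetical and is not part of moses.

```python
import ast


def lattice_arcs(line):
    """Return (from_node, to_node, label, weight) for each arc on one lattice line.

    Assumption: the line is a nested tuple literal in which each element of
    the outer tuple holds the outgoing arcs of one node, and each arc is
    (label, weight, length), with length measured in nodes.
    """
    nodes = ast.literal_eval(line.strip())
    arcs = []
    for node_index, outgoing in enumerate(nodes):
        for label, weight, length in outgoing:
            arcs.append((node_index, node_index + length, label, weight))
    return arcs


example = "((('A',1.0,1),('Z',1.0,2),),(('B',1.0,1),),(('C',1.0,1),),)"
for arc in lattice_arcs(example):
    print(arc)
# (0, 1, 'A', 1.0)
# (0, 2, 'Z', 1.0)
# (1, 2, 'B', 1.0)
# (2, 3, 'C', 1.0)
```

The printed arcs match the description of line 2: 'Z' skips from node 0
directly to node 2, and every other arc advances by one node.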

 Also, what do the numbers 1.0 and 1, 2 mean there? Where can I put
 the lattice probabilities?
 Is it possible to add an empty lattice (so that the decoder skips a word)?

Currently, moses only lets you specify a single cost for an arc, and
it is actually treated as a probability (the decoder sees it as
-log(p) -- you can change this in WordLattice.cpp if you want to deal
with more conventional costs, but the rest of the inputs to the
decoder are given as probabilities so I wanted to be consistent).  If
you want a null transition, set the arc label to '*eps*' and the
decoder will treat this as a null.
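
As a rough illustration of the two points above, the following Python sketch
writes a lattice line that contains a null arc and shows how a probability maps
to a -log(p) cost. The helper names format_lattice and arc_cost are
hypothetical, the probabilities 0.7 and 0.3 are made up for the example, and
the use of the natural logarithm is an assumption (the base is not stated
here). Only the '*eps*' label comes from the message itself.

```python
import math


def arc_cost(probability):
    """Cost of an arc weight as -log(p). Natural log is an assumption."""
    return -math.log(probability)


def format_lattice(nodes):
    """Serialize a list of nodes into the one-line lattice format.

    Each node is a list of (label, probability, length) arcs.
    Assumption: labels contain no quote characters.
    """
    node_strings = []
    for outgoing in nodes:
        arc_strings = "".join(
            "('%s',%r,%d)," % (label, float(prob), length)
            for label, prob, length in outgoing
        )
        node_strings.append("(" + arc_strings + "),")
    return "(" + "".join(node_strings) + ")"


# Node 0 either emits 'A' or takes a null ('*eps*') transition, so the
# word 'A' is optional; node 1 always emits 'B'.
nodes = [
    [("A", 0.7, 1), ("*eps*", 0.3, 1)],
    [("B", 1.0, 1)],
]
print(format_lattice(nodes))
# ((('A',0.7,1),('*eps*',0.3,1),),(('B',1.0,1),),)
```

Under this reading, an arc with weight 1.0 has cost arc_cost(1.0) == 0, and
lower probabilities give higher costs.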

--Chris


 Linh




 Chris Dyer wrote:

 >Also, if you are using general lattices (as opposed to regular
 >confusion networks) as input, you should update to the latest version
 >of the decoder from Subversion, since I checked in a fairly crucial
 >bug fix yesterday.
 >
 >Chris
 >
 >On Wed, Feb 20, 2008 at 4:37 PM, Chris Dyer <[EMAIL PROTECTED]> wrote:
 >
 >
 >>The lattice format isn't documented yet on the webpage, but you can
 >> see some examples of it in the lattice-distortion test directory Hieu
 >> mentions.  It should be fairly straightforward to decipher.  Since
 >> this format encodes a single lattice/CN per line of text, it can be
 >> used easily with MER training.
 >>
 >> Chris
 >>
 >>
 >>
 >> On Wed, Feb 20, 2008 at 4:30 PM, Hieu Hoang <[EMAIL PROTECTED]> wrote:
 >> > hello Linh
 >> >
 >> > i'm not sure if anyone answered your question and i'm prob not the best
 >> > person to answer questions on lattice/confusion net input. to my knowledge,
 >> > mert should run fine with these input types.
 >> >
 >> > perhaps you can find an example of the lattice input format from the
 >> > regression test :
 >> >
 >> > http://mosesdecoder.svn.sourceforge.net/viewvc/mosesdecoder/trunk/regression-testing/tests/
 >> >
 >> >
 >> >
 >> > ThuyLinh Nguyen <[EMAIL PROTECTED]> wrote:
 >> >
 >> >
 >> > -------- Original Message --------
 >> > Subject: Run mert-moses.pl with confusion network
 >> > Date: Sat, 16 Feb 2008 21:33:44 -0500
 >> > From: ThuyLinh Nguyen
 >> > To: moses-support@mit.edu
 >> >
 >> >
 >> >
 >> > Hello,
 >> > I want to run MER for a development set that is the output of another
 >> > translation job, so the development input is a set of lattices. Is
 >> > there any way to run MER with lattice input and, if so, how can I
 >> > represent the lattices of multiple sentences?
 >> > Thank you
 >> > Linh
 >> >
 >> >
 >> > _______________________________________________
 >> > Moses-support mailing list
 >> > Moses-support@mit.edu
 >> > http://mailman.mit.edu/mailman/listinfo/moses-support
 >> >
 >> >
 >> >
 >> >
 >> > Hieu Hoang
 >> > http//www.hoang.co.uk/hieu
 >> >

