Hi Chris,
Thanks for the clarification. So the lattice format is different from
the confusion network format, but the moses binary offers only two
options for inputtype:
-inputtype: text (0) or confusion network (1)
It doesn't recognize lattice-format input.
This is an example of a lattice translation error:
echo "((('A',1.0,1),),(('B',1.0,1),),)" | moses -f moses.ini -inputtype 1
Defined parameters (per moses.ini or switch):
config: moses.ini
distortion-limit: 6
input-factors: 0
inputtype: 1
lmodel-file: 0 0 3 /SMT/Workplace/Linh/IWSLT_0802/train.en.srilm
mapping: 0 T 0
ttable-file: 0 0 5 /SMT/Workplace/Linh/IWSLT_0802/gold_morph_space/model/phrase-table.bin
ttable-limit: 20 0
weight-d: 0.6
weight-l: 0.5000
weight-t: 0.2 0.2 0.2 0.2 0.2
weight-w: -1
Loading lexical distortion models...
have 0 models
Start loading LanguageModel
/SMT/Workplace/Linh/IWSLT_0802/train.en.srilm : [0.000] seconds
Finished loading LanguageModels : [0.000] seconds
Start loading PhraseTable
/SMT/Workplace/Linh/IWSLT_0802/gold_morph_space/model/phrase-table.bin : [0.000] seconds
using binary phrase tables for idx 0
reading bin ttable
size of OFF_T 8
binary phrasefile loaded, default OFF_T: -1
Finished loading phrase tables : [0.000] seconds
IO from STDOUT/STDIN
Created input-output object : [0.000] seconds
read confusion net with format 0
End. : [0.000] seconds
confusion net statistics:
created: 1
destroyed: 1
succ. read: 0
columns: 0
words: 0
avg. word/column: nan
avg. cols/sent: nan
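The statistics above (succ. read: 0, columns: 0) suggest the line was handed to the confusion network reader and never parsed as a lattice. To rule out a malformed input line first, a quick structural check can be sketched in Python. This is only a sketch: it assumes each lattice line is a Python-style nested tuple of (label, score, distance) arcs, as described further down in this thread, and check_lattice_line is a hypothetical helper, not part of Moses.

```python
import ast

def check_lattice_line(line):
    """Return a list of problems found in one lattice line (empty if none).

    Assumed layout: a tuple of nodes, each node a tuple of
    (label, score, distance) arcs. Hypothetical helper, not part of Moses.
    """
    try:
        nodes = ast.literal_eval(line.strip())
    except (ValueError, SyntaxError) as err:
        return ["not a valid nested tuple: %s" % err]
    if not isinstance(nodes, tuple) or not nodes:
        return ["expected a non-empty tuple of nodes"]

    problems = []
    for i, arcs in enumerate(nodes):
        if not isinstance(arcs, tuple) or not arcs:
            problems.append("node %d has no arcs" % i)
            continue
        for arc in arcs:
            if not (isinstance(arc, tuple) and len(arc) == 3):
                problems.append("node %d: bad arc %r" % (i, arc))
                continue
            label, _score, distance = arc
            if not isinstance(distance, int) or distance < 1:
                problems.append("node %d: arc %r must move forward" % (i, label))
            elif i + distance > len(nodes):
                # The final node has index len(nodes); nothing may jump past it.
                problems.append("node %d: arc %r jumps past the final node" % (i, label))
    return problems

print(check_lattice_line("((('A',1.0,1),),(('B',1.0,1),),)"))
```

An empty list means the line is structurally sound under these assumptions, so a failure to read it would then point at the decoder version or input type rather than at the input itself.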
Let me know if I made a mistake somewhere.
Thanks
Linh
Chris Dyer wrote:
> I am still confused about the lattice format.
> In your examples:
> 1 ((('A',1.0,1),),(('B',1.0,1),),)
> 2 ((('A',1.0,1),('Z',1.0,2),),(('B',1.0,1),),(('C',1.0,1),),)
> Can I interpret it as follows: from node 0 to node 1 there are two
> lattices, (('A',1.0,1),) and (('B',1.0,1),)?
Each entire lattice is encoded on a single line. In line 1, node 0 has
a single arc, 'A', that goes to node 1, and node 1 has a single arc,
'B', that goes to node 2. The 1.0 is the cost of the arc and the "1" is
the length of the arc (measured in nodes). In line 2, node 0 has two
arcs: arc 'A', which goes to node 1, and arc 'Z', which goes to node 2.
Node 1 has a single arc, 'B', that goes to node 2. Node 2 has a single
arc, 'C', that goes to node 3.
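The reading of example line 2 can be checked mechanically, since the format happens to be valid Python tuple syntax. The following is a minimal sketch, assuming each arc is (label, score, distance) and that the target node is the source node index plus the distance; lattice_arcs is a hypothetical helper, not a Moses function.

```python
import ast

def lattice_arcs(line):
    """List (from_node, to_node, label, score) for every arc on one lattice line."""
    nodes = ast.literal_eval(line.strip())
    arcs = []
    for start, outgoing in enumerate(nodes):
        for label, score, distance in outgoing:
            # The third field is the arc length in nodes, so it gives the target.
            arcs.append((start, start + distance, label, score))
    return arcs

line2 = "((('A',1.0,1),('Z',1.0,2),),(('B',1.0,1),),(('C',1.0,1),),)"
for arc in lattice_arcs(line2):
    print(arc)
# (0, 1, 'A', 1.0)
# (0, 2, 'Z', 1.0)
# (1, 2, 'B', 1.0)
# (2, 3, 'C', 1.0)
```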
> Also, what do the numbers 1.0 and 1, 2 mean there? Where can I put
> the lattice probabilities?
> Is it possible to add an empty lattice (so that the decoder skips a word)?
Currently, moses only lets you specify a single cost for an arc, and
it is actually treated as a probability (the decoder sees it as
-log(p) -- you can change this in WordLattice.cpp if you want to deal
with more conventional costs, but the rest of the inputs to the
decoder are given as probabilities so I wanted to be consistent). If
you want a null transition, set the arc label to '*eps*' and the
decoder will treat this as a null.
--Chris
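To make the scoring convention and the null arc concrete, here is a small Python sketch that enumerates the paths through a lattice line, sums -log(p) over the arcs, and drops '*eps*' labels from the resulting word sequence. It illustrates the convention described above and is not how Moses itself searches the lattice; lattice_paths and the example lattice are made up for illustration, and the input is assumed to be well-formed.

```python
import ast
import math

EPSILON = "*eps*"  # null-arc label mentioned in the thread

def lattice_paths(line):
    """Yield (words, cost) per path, with cost = sum of -log(p) over its arcs."""
    nodes = ast.literal_eval(line.strip())
    final = len(nodes)  # index of the final node

    def walk(node, words, cost):
        if node == final:
            yield words, cost
            return
        for label, prob, distance in nodes[node]:
            # A null arc advances through the lattice without adding a word.
            step = [] if label == EPSILON else [label]
            yield from walk(node + distance, words + step, cost - math.log(prob))

    yield from walk(0, [], 0.0)

# Illustrative lattice: the first word 'A' is optional.
example = "((('A',0.5,1),('*eps*',0.5,1),),(('B',1.0,1),),)"
for words, cost in lattice_paths(example):
    print(words, round(cost, 4))
```

Here the two paths yield the word sequences ['A', 'B'] and ['B'], each with cost -log(0.5), since the arc with probability 1.0 contributes nothing.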
Linh
Chris Dyer wrote:
>Also, if you are using general lattices (as opposed to regular
>confusion networks) as input, you should update to the latest version
>of the decoder from Subversion, since I checked in a fairly crucial
>bug fix yesterday.
>
>Chris
>
>On Wed, Feb 20, 2008 at 4:37 PM, Chris Dyer <[EMAIL PROTECTED]> wrote:
>
>
>>The lattice format isn't documented yet on the webpage, but you can
>> see some examples of it in the lattice-distortion test directory Hieu
>> mentions. It should be fairly straightforward to decipher. Since
>> this format encodes a single lattice/CN per line of text, it can be
>> used easily with MER training.
>>
>> Chris
>>
>>
>>
>> On Wed, Feb 20, 2008 at 4:30 PM, Hieu Hoang <[EMAIL PROTECTED]> wrote:
>> > Hello Linh,
>> >
>> > I'm not sure if anyone has answered your question, and I'm probably not
>> > the best person to answer questions on lattice/confusion net input. To my
>> > knowledge, MERT should run fine with these input types.
>> >
>> > perhaps you can find an example of the lattice input format from the
>> > regression test :
>> >
>> >
>> > http://mosesdecoder.svn.sourceforge.net/viewvc/mosesdecoder/trunk/regression-testing/tests/
>> >
>> >
>> >
>> > ThuyLinh Nguyen <[EMAIL PROTECTED]> wrote:
>> >
>> >
>> > -------- Original Message --------
>> > Subject: Run mert-moses.pl with confusion network
>> > Date: Sat, 16 Feb 2008 21:33:44 -0500
>> > From: ThuyLinh Nguyen
>> > To: moses-support@mit.edu
>> >
>> >
>> >
>> > Hello,
>> > I want to run MER on a development set that is the output of another
>> > translation job, so the development input is a set of lattices. Is there
>> > any way to run MER with lattice input and, if so, how can I represent
>> > the lattices of multiple sentences?
>> > Thank you
>> > Linh
>> >
>> >
>> > _______________________________________________
>> > Moses-support mailing list
>> > Moses-support@mit.edu
>> > http://mailman.mit.edu/mailman/listinfo/moses-support
>> >
>> >
>> >
>> >
>> > Hieu Hoang
>> > http://www.hoang.co.uk/hieu
>> >