Dear Christian,

Since the binary phrase table (PT) is generated from the textual one,
we assumed that the textual PT always exists,
so the check was done only on the textual PT.

When I needed to save space, I deleted the textual PT (but not the binaries)
and recreated an almost empty PT with the same name
(containing only one line like "EMPTY PHRASE TABLE").


Your solution is definitely smarter.
I will update the repository.

Thanks for the suggestion.

best,
Nicola



________________________________________
From: Christian Hardmeier [EMAIL PROTECTED]
Sent: Thursday, February 28, 2008 11:20 AM
To: Nicola Bertoldi
Subject: Re: [Moses-support] [Fwd: Run mert-moses.pl with confusion network]

Hi Nicola,

I think one of the problems mentioned below is still there:
mert-moses.pl will complain about finding no phrasetable if the
phrasetable is in binary format. Changing line 1084 to

if (! -s $fn && ! -s "$fn.gz" && ! -s "$fn.binphr.tgtdata") {

(or something similar) would fix that, I think.
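
For readers who want to try the same fallback check outside Perl, here is a minimal Python sketch. It is not code from mert-moses.pl: the function name phrase_table_exists is hypothetical, and the suffix list simply mirrors the three paths tested in the line above. A binary phrase table may consist of further files that this sketch does not look at.

```python
import os

# Suffixes tried in order: plain text, gzipped text, binary table data file.
# ".binphr.tgtdata" is taken from the suggested Perl line; other binary
# phrase table files are not checked here (assumption: one file is enough).
SUFFIXES = ("", ".gz", ".binphr.tgtdata")


def phrase_table_exists(fn):
    """Return True if any variant of the phrase table exists and is non-empty."""
    for suffix in SUFFIXES:
        path = fn + suffix
        # Roughly mirrors Perl's -s test: file exists and has non-zero size.
        if os.path.isfile(path) and os.path.getsize(path) > 0:
            return True
    return False
```

Note that with this check an empty textual table no longer counts as present, so a placeholder file is only needed if no binary files exist either.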

Best,
Christian

On Thu, 28 Feb 2008, Nicola Bertoldi wrote:

> moses-parallel.pl and mert-moses.pl have been changed.
> Now they work well with lattice inputs, too.
>
> Note that you do NOT need to specify
> -decoder-flags "-inputtype 2";
> the parameter
> --inputtype 2
> of mert-moses.pl is passed to the decoder automatically.
>
>
> best,
> Nicola
>
>
> ________________________________________
> From: [EMAIL PROTECTED] [EMAIL PROTECTED] On Behalf Of ThuyLinh Nguyen [EMAIL PROTECTED]
> Sent: Wednesday, February 27, 2008 5:16 PM
> To: moses-support@mit.edu
> Subject: [Moses-support] [Fwd: Run mert-moses.pl with confusion network]
>
> Hello,
> One more problem: mert-moses.pl can't find the phrase table when it is in binary
> format, but translation without mert works.
> Here is the error:
> perl mert-moses.pl ../../sstmorph/dev.ar.lattice ../../dev.en.process
> ../../../moses-cmd/src/moses ./moses.ini --decoder-flags "-inputtype 2"
> --inputtype 2 --rootdir
> /nfs/islpc3_13/linh/Programs/mosesdecoder/scripts --no-filter-phrase-table
> After default: -l mem_free=0.5G -hard
> Using SCRIPTS_ROOTDIR: /nfs/islpc3_13/linh/Programs/mosesdecoder/scripts
> checking weight-count for ttable-file
> moses.ini:15:File does not exist or empty:
> /SMT/Workplace/Linh/IWSLT_0802/gold_morph_space/model/phrase-table.bin
> checking weight-count for lmodel-file
> SYNC distortionExit 1
>
> But if I run without MER, it works:
> head -2 ../../sstmorph/dev.ar.lattice | ../../../moses-cmd/src/moses -f
> ./moses.ini -inputtype 2
>
> Thanks
> Linh
>
> Chris Dyer wrote:
> > I'll update that- the inputtype should be "2" for lattices...
> > Chris
> >
> > On Wed, Feb 27, 2008 at 4:39 AM, ThuyLinh Nguyen <[EMAIL PROTECTED]> wrote:
> >
> >>  Hi Chris,
> >>  Thanks for clarification, so the lattice format is different with 
> >> confusion
> >> network format
> >>  but in moses binary, there are only two options for  inputtype: 
> >> -inputtype:
> >> text (0) or confusion network (1)
> >>
> >>  It does n't recognize the lattice format input.
> >>  This is an example of lattice translation error:
> >>
> >>  echo "((('A',1.0,1),),(('B',1.0,1),),)" | moses -f moses.ini -inputtype 1
> >>  Defined parameters (per moses.ini or switch):
> >>          config: moses.ini
> >>          distortion-limit: 6
> >>          input-factors: 0
> >>          inputtype: 1
> >>          lmodel-file: 0 0 3 /SMT/Workplace/Linh/IWSLT_0802/train.en.srilm
> >>          mapping: 0 T 0
> >>          ttable-file: 0 0 5
> >> /SMT/Workplace/Linh/IWSLT_0802/gold_morph_space/model/phrase-table.bin
> >>          ttable-limit: 20 0
> >>          weight-d: 0.6
> >>          weight-l: 0.5000
> >>          weight-t: 0.2 0.2 0.2 0.2 0.2
> >>          weight-w: -1
> >>  Loading lexical distortion models...
> >>  have 0 models
> >>  Start loading LanguageModel /SMT/Workplace/Linh/IWSLT_0802/train.en.srilm 
> >> :
> >> [0.000] seconds
> >>  Finished loading LanguageModels : [0.000] seconds
> >>  Start loading PhraseTable
> >> /SMT/Workplace/Linh/IWSLT_0802/gold_morph_space/model/phrase-table.bin :
> >> [0.000] seconds
> >>  using binary phrase tables for idx 0
> >>  reading bin ttable
> >>  size of OFF_T 8
> >>  binary phrasefile loaded, default OFF_T: -1
> >>  Finished loading phrase tables : [0.000] seconds
> >>  IO from STDOUT/STDIN
> >>  Created input-output object : [0.000] seconds
> >>  read confusion net with format 0
> >>  End. : [0.000] seconds
> >>  confusion net statistics:
> >>   created:       1
> >>   destroyed:     1
> >>   succ. read:    0
> >>   columns:       0
> >>   words: 0
> >>   avg. word/column:      nan
> >>   avg. cols/sent:        nan
> >>
> >>
> >>  Let me know if I made mistake somewhere.
> >>
> >>  Thanks
> >>  Linh
> >>
> >>
> >>
> >>
> >>
> >>  Chris Dyer wrote:
> >>
> >>  I am still confused about the lattice format,
> >>  In your examples:
> >>
> >>  1 ((('A',1.0,1),),(('B',1.0,1),),)
> >>  2 ((('A',1.0,1),('Z',1.0,2),),(('B',1.0,1),),(('C',1.0,1),),)
> >>
> >>  Can I interpret it as:
> >>  from node 0 to node 1 there are 2 lattices: (('A',1.0,1),) and
> >>  (('B',1.0,1),)
> >>
> >>  Each entire lattice is encoded on a single line. In line 1, there are
> >> two arcs from node 0 to node 1, 'A' and 'B'. The 1.0 is the cost of
> >> the arc and the "1" is the length of the arc (measured in nodes). In
> >> line two, node 0 has two arcs, arc 'A' that goes to node 1 and arc 'Z'
> >> that goes to node 2. Node 1 has a single arc, 'B', that goes to node
> >> 2. Node 2 has a single arc 'C' that goes to 3.
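
The node and arc reading described above can be checked with a short Python sketch. It relies on an assumption: the example lines happen to be valid Python tuple literals, so ast.literal_eval can read them. This is not how the decoder parses lattices, and the function name parse_lattice is hypothetical. The input is the second example line from this message.

```python
import ast


def parse_lattice(line):
    """Parse one lattice line into (from_node, to_node, label, score) arcs."""
    # Outer tuple: one entry per node; each entry is a tuple of outgoing arcs.
    nodes = ast.literal_eval(line.strip())
    arcs = []
    for node_index, outgoing in enumerate(nodes):
        for label, score, length in outgoing:
            # "length" is measured in nodes, so the arc ends at index + length.
            arcs.append((node_index, node_index + length, label, score))
    return arcs


line = "((('A',1.0,1),('Z',1.0,2),),(('B',1.0,1),),(('C',1.0,1),),)"
for arc in parse_lattice(line):
    print(arc)
# (0, 1, 'A', 1.0)
# (0, 2, 'Z', 1.0)
# (1, 2, 'B', 1.0)
# (2, 3, 'C', 1.0)
```

Because each lattice sits on one line, a development set of several sentences is simply a file with one such line per sentence.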
> >>
> >>
> >>
> >>
> >>  And also what are the meaning of number 1.0 and 1, 2 there? where can I 
> >> put
> >> the lattice probabilities?
> >>  Is it possible to add an empty lattice (so that the decoder skip a word)?
> >>
> >>  Currently, moses only lets you specify a single cost for an arc, and
> >> it is actually treated as a probability (the decoder sees it as
> >> -log(p) -- you can change this in WordLattice.cpp if you want to deal
> >> with more conventional costs, but the rest of the inputs to the
> >> decoder are given as probabilities so I wanted to be consistent). If
> >> you want a null transition, set the arc label to '*eps*' and the
> >> decoder will treat this as a null.
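
As a rough illustration of the two points above (arc values read as probabilities and seen by the decoder as -log(p), and '*eps*' as a null label), here is a small Python sketch. It does not reproduce WordLattice.cpp. The use of the natural logarithm is an assumption, and the helper names arc_cost and path_words are hypothetical.

```python
import math

# Label that marks a null transition, as described in the message above.
EPSILON_LABEL = "*eps*"


def arc_cost(probability):
    """Return -log(p) for an arc probability (natural log is an assumption)."""
    return -math.log(probability)


def path_words(path):
    """Return the words along a path of (label, probability) arcs, skipping nulls."""
    return [label for label, _ in path if label != EPSILON_LABEL]


path = [("A", 0.5), (EPSILON_LABEL, 0.5), ("B", 1.0)]
total_cost = sum(arc_cost(p) for _, p in path)
print(path_words(path), total_cost)
```

An arc with probability 1.0, as in the examples earlier in the thread, contributes a cost of zero under this reading.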
> >>
> >> --Chris
> >>
> >>
> >>
> >>
> >>  Linh
> >>
> >>
> >>
> >>
> >>  Chris Dyer wrote:
> >>
> >>  >Also, if you are using general lattices (as opposed to regular
> >>  >confusion networks) as input, you should update to the latest version
> >>  >of the decoder from Subversion, since I checked in a fairly crucial
> >>  >bug fix yesterday.
> >>  >
> >>  >Chris
> >>  >
> >>  >On Wed, Feb 20, 2008 at 4:37 PM, Chris Dyer <[EMAIL PROTECTED]> wrote:
> >>  >
> >>  >
> >>  >>The lattice format isn't documented yet on the webpage, but you can
> >>  >> see some examples of it in the lattice-distortion test directory Hieu
> >>  >> mentions. It should be fairly straightforward to decipher. Since
> >>  >> this format encodes a single lattice/CN per line of text, it can be
> >>  >> used easily with MER training.
> >>  >>
> >>  >> Chris
> >>  >>
> >>  >>
> >>  >>
> > >>  >> On Wed, Feb 20, 2008 at 4:30 PM, Hieu Hoang <[EMAIL PROTECTED]> wrote:
> > >>  >> > Hello Linh,
> > >>  >> >
> > >>  >> > I'm not sure if anyone answered your question, and I'm probably not the
> > >>  >> > best person to answer questions on lattice/confusion net input. To my
> > >>  >> > knowledge, mert should run fine with these input types.
> > >>  >> >
> > >>  >> > Perhaps you can find an example of the lattice input format in the
> > >>  >> > regression tests:
> > >>  >> >
> > >>  >> > http://mosesdecoder.svn.sourceforge.net/viewvc/mosesdecoder/trunk/regression-testing/tests/
> >>  >> >
> >>  >> >
> >>  >> >
> >>  >> > ThuyLinh Nguyen <[EMAIL PROTECTED]> wrote:
> >>  >> >
> >>  >> >
> >>  >> > -------- Original Message --------
> >>  >> > Subject: Run mert-moses.pl with confusion network
> >>  >> > Date: Sat, 16 Feb 2008 21:33:44 -0500
> >>  >> > From: ThuyLinh Nguyen
> >>  >> > To: moses-support@mit.edu
> >>  >> >
> >>  >> >
> >>  >> >
> >>  >> > Hello,
> > >>  >> > I want to run MER on a development set that is the output of another
> > >>  >> > translation job, so the development input is a set of lattices. Is there
> > >>  >> > any way to run MER with lattice input, and if so, how can I represent the
> > >>  >> > lattices of multiple sentences?
> >>  >> > Thank you
> >>  >> > Linh
> >>  >> >
> >>  >> >
> >>  >> > _______________________________________________
> >>  >> > Moses-support mailing list
> >>  >> > Moses-support@mit.edu
> >>  >> > http://mailman.mit.edu/mailman/listinfo/moses-support
> >>  >> >
> >>  >> >
> >>  >> >
> >>  >> >
> >>  >> > Hieu Hoang
> >>  >> > http//www.hoang.co.uk/hieu
> >>  >> >
> >>  >> > ________________________________
> >>  >> >
> >>  >> > Sent from Yahoo! - a smarter inbox.
> >>  >> > _______________________________________________
> >>  >> > Moses-support mailing list
> >>  >> > Moses-support@mit.edu
> >>  >> > http://mailman.mit.edu/mailman/listinfo/moses-support
> >>  >> >
> >>  >> >
> >>  >>
> >>  >>
> >>  >>
> >>  >
> >>  >
> >>  >
> >>
> >>
> >>
> >>
> >>
> >>
> >
> >
>
>
> _______________________________________________
> Moses-support mailing list
> Moses-support@mit.edu
> http://mailman.mit.edu/mailman/listinfo/moses-support
>
> _______________________________________________
> Moses-support mailing list
> Moses-support@mit.edu
> http://mailman.mit.edu/mailman/listinfo/moses-support

_______________________________________________
Moses-support mailing list
Moses-support@mit.edu
http://mailman.mit.edu/mailman/listinfo/moses-support

Reply via email to