On Friday 06 August 2010 11:44:25 Sylvain Raybaud wrote:
> Hello
> 
>   A few days ago I asked whether anyone knew of a script to convert word
> lattices from SLF (HTK format) to PLF (Moses input format). It seems that no
> such script exists yet, so here is my proposal (Python file attached).
> 
> I tried to include all the necessary information in the help message and in
> the file header.
> 
> basic use :
> slf2plf input > output
> slf2plf < input > output
> slf2plf -h for help
> 
> It works on small "toy" lattices; I checked the results. I have run it on
> real-world lattices but have not checked the results yet.
> 
> I tried to keep as much of the flexibility of the SLF format as possible, but
> sublattices won't work yet (they will be parsed, but in the resulting graph
> they will appear as a single node; I haven't checked this, though).
> 
> Also, for the moment the script requires that words are specified on
> links (not nodes) and that acoustic and language model scores are present.
> The resulting score in the output graph is:
> acoustic+language*lmscale
> 
> All testing, remarks and requests are welcome (of course). I can make the
> script available in a Mercurial repository if you want.
> 
> Hope this helps
> 
> regards,
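
To make the score formula quoted above concrete, here is a minimal sketch (not
the attached script itself) of how one SLF link line could be split into
fields and its two scores combined. The function names parse_slf_link and
link_score and the sample score values are made up for illustration, and the
naive whitespace split assumes that no field value contains spaces.

```python
def parse_slf_link(line):
    """Split one SLF link line (e.g. 'J=0 S=0 E=1 W=bonjour a=0 l=0') into a dict."""
    fields = {}
    for token in line.split():
        # Each token looks like KEY=VALUE; split on the first '=' only.
        key, _, value = token.partition("=")
        fields[key] = value
    return fields


def link_score(fields, lmscale=1.0):
    """Combine the scores as acoustic + language * lmscale."""
    return float(fields["a"]) + float(fields["l"]) * lmscale


# Hypothetical link line; the score values are invented for this example.
link = parse_slf_link("J=1     S=1    E=2    W=monsieur        a=-120.5     l=-3.2")
print(link["W"], link_score(link, lmscale=2.0))
```

A link that lacks the a= or l= field raises KeyError here, which mirrors the
stated requirement that both scores be present.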

Some more testing:

  * Moses is able to translate simple toy lattices generated with this tool
    (see the attached 'example.*' files).
  * I added a feature for removing noise information from the input graph (it
    can be disabled with -C, see help).
  * Moses cannot translate "real world" graphs generated with this tool.
See for example: http://perso.crans.org/raybaud/realworld.slf.gz
Here is the output:

sylv...@markov /global/markov/raybauds/XP/S2T $ zcat realworld.slf.gz | ~/loria/sphinx2plf/slf2plf.py | nice moses -inputtype 2 -weight-i 1 -f working-dir/filtered-OlliRehn-1/moses.ini > out
Defined parameters (per moses.ini or switch):
        config: working-dir/filtered-OlliRehn-1/moses.ini 
        distortion-file: 0-0 msd-bidirectional-fe 6 /global/markov/raybauds/XP/S2T/working-dir/filtered-OlliRehn-1/reordering-table
        distortion-limit: 6 
        input-factors: 0 
        inputtype: 2 
        lmodel-file: 0 0 5 /home/sylvain/softs/moses/working-dir/lm/europarl.lm 
        mapping: 0 T 0 
        ttable-file: 0 0 5 /global/markov/raybauds/XP/S2T/working-dir/filtered-OlliRehn-1/phrase-table
        ttable-limit: 20 
        weight-d: 0.054012 0.196064 -0.003181 0.081071 0.086735 0.062332 0.090617
        weight-i: 1 
        weight-l: 0.064254 
        weight-t: 0.013876 0.081935 0.059305 0.013773 0.145934 
        weight-w: 0.046909 
Loading lexical distortion models...
have 1 models
Creating lexical reordering...
weights: 0.196 -0.003 0.081 0.087 0.062 0.091 
binary file loaded, default OFF_T: -1
Created lexical orientation reordering
Start loading LanguageModel /home/sylvain/softs/moses/working-dir/lm/europarl.lm : [0.000] seconds
Finished loading LanguageModels : [13.000] seconds
specified equal numbers of link parameters and insertion weights, not using non-epsilon 'real' word link count.
Start loading PhraseTable /global/markov/raybauds/XP/S2T/working-dir/filtered-OlliRehn-1/phrase-table : [13.000] seconds
filePath: /global/markov/raybauds/XP/S2T/working-dir/filtered-OlliRehn-1/phrase-table
using binary phrase tables for idx 0
reading bin ttable
size of OFF_T 8
binary phrasefile loaded, default OFF_T: -1
Finished loading phrase tables : [13.000] seconds
IO from STDOUT/STDIN
Created input-output object : [13.000] seconds
PCN/PLF parse error: expected , after string
PCN/PLF parse error: expected ( at start of cn alt block
PCN/PLF parse error: expected ( at start of cn alt block
PCN/PLF parse error: expected ( at start of cn alt block
PCN/PLF parse error: expected ( at start of cn alt block
PCN/PLF parse error: expected ( at start of cn alt block
PCN/PLF parse error: expected ( at start of cn alt block
PCN/PLF parse error: expected ( at start of cn alt block
PCN/PLF parse error: expected ( at start of cn alt block
PCN/PLF parse error: expected ( at start of cn alt block
PCN/PLF parse error: expected ( at start of cn alt block
PCN/PLF parse error: expected ( at start of cn alt block
PCN/PLF parse error: expected ( at start of cn alt block
End. : [14.000] seconds
confusion net statistics:
 created:       1
 destroyed:     1
 succ. read:    0
 columns:       0
 words: 0
 avg. word/column:      -nan
 avg. cols/sent:        -nan



I'm investigating, but since I cannot reproduce the problem on small graphs,
the bug is difficult to track down.
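
One possible cause worth ruling out, purely as an unverified guess: the first
error ("expected , after string") is the kind of message one would expect if a
word containing a single quote (a French apostrophe, for instance) ended a
quoted PLF string early. The sketch below lists such words and
backslash-escapes them. The helper names are hypothetical, and whether the
Moses PLF reader accepts backslash escapes is an assumption that needs to be
checked against its source.

```python
import re

# Assumption: PLF words are written between single quotes, so a bare quote or
# backslash inside a word could confuse a parser that reads up to the next quote.
SUSPICIOUS = re.compile(r"['\\]")


def suspicious_words(words):
    """Return the words that contain a single quote or a backslash."""
    return [w for w in words if SUSPICIOUS.search(w)]


def escape_plf_word(word):
    """Backslash-escape backslashes first, then single quotes (hypothetical fix)."""
    return word.replace("\\", "\\\\").replace("'", "\\'")


# Invented sample words, not taken from realworld.slf.gz.
sample = ["bonjour", "aujourd'hui", "le"]
for w in suspicious_words(sample):
    print(w, "->", escape_plf_word(w))
```

Running suspicious_words over the W= fields of the real-world lattice would at
least narrow down which links to inspect, even if the guess turns out wrong.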

regards,

-- 
Sylvain
VERSION=1.1
UTTERANCE=1
lmscale=1
wdpenalty=0.0
#
# Size line
#
N=5   L=5
#
# Node definitions
#
I=0
I=1
I=2
I=3
I=4

#
# Link definitions.
#
J=0     S=0    E=1    W=bonjour a=0     l=0
J=1     S=1    E=2    W=monsieur        a=0     l=0
J=2     S=2    E=3    W=le      a=0     l=0
J=3     S=3    E=4    W=président       a=0     l=0
J=4     S=0    E=2    W=monsieur        a=0     l=0
mr the president 
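
For reference, the mapping from the toy SLF links above to a PLF string can be
sketched as follows. This is a simplified illustration, not the attached
slf2plf script: the function name links_to_plf is made up, scores are assumed
to be already combined, nodes are assumed to be numbered in topological order
(as they are in the example), and words are not escaped.

```python
from collections import defaultdict


def links_to_plf(links, num_nodes):
    """Build a PLF string from (start, end, word, score) links.

    Assumes nodes 0..num_nodes-1 are topologically ordered, so the PLF jump
    distance of a link is simply end - start.
    """
    outgoing = defaultdict(list)
    for start, end, word, score in links:
        if end <= start:
            raise ValueError("nodes are not in topological order")
        outgoing[start].append((word, score, end - start))

    columns = []
    for node in range(num_nodes - 1):  # the final node has no outgoing links
        alts = "".join("('%s',%s,%d)," % (w, s, j) for w, s, j in outgoing[node])
        columns.append("(%s)," % alts)
    return "(%s)" % "".join(columns)


# The five links of the toy lattice, with score = a + l * lmscale = 0.0.
toy_links = [
    (0, 1, "bonjour", 0.0),
    (1, 2, "monsieur", 0.0),
    (2, 3, "le", 0.0),
    (3, 4, "président", 0.0),
    (0, 2, "monsieur", 0.0),
]
print(links_to_plf(toy_links, 5))
```

For the toy lattice this prints
((('bonjour',0.0,1),('monsieur',0.0,2),),(('monsieur',0.0,1),),(('le',0.0,1),),(('président',0.0,1),),)
Note that an intermediate node without outgoing links would yield an empty
column "()," here, which a PLF reader is unlikely to accept.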
_______________________________________________
Moses-support mailing list
Moses-support@mit.edu
http://mailman.mit.edu/mailman/listinfo/moses-support
