On Friday 06 August 2010 11:44:25 Sylvain Raybaud wrote:
> Hello
>
> A few days ago I came to you asking if someone knew of a script to turn
> a word lattice from SLF (HTK format) into PLF (Moses input format). It seems
> that one doesn't exist yet, so here is my proposal (Python file attached).
>
> I tried to include all necessary information in the help message and in the
> file header.
>
> basic use:
>   slf2plf input > output
>   slf2plf < input > output
>   slf2plf -h for help
>
> It works on small "toy" lattices; I checked the results. I've run it on
> real-world lattices but I have not checked the results yet.
>
> I tried to keep as much of the flexibility of the SLF format as possible, but
> sublattices won't work yet (they will be parsed, but in the resulting graph
> they will appear as a single node. I haven't checked this though).
>
> Also, for the moment the software requires that words are specified on
> links (not nodes) and that acoustic and language model scores are present.
> The resulting score in the output graph is:
>   acoustic+language*lmscale
>
> All testing, remarks and requests welcome (of course). I can make the
> script accessible on a mercurial repository if you want.
>
> Hope this helps
>
> regards,
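For readers who want the score combination from the quoted message spelled out, here is a minimal Python sketch. Only the formula acoustic+language*lmscale comes from the message; the function name `combined_score` and its default value are illustrative assumptions and are not taken from the attached script.

```python
def combined_score(acoustic, language, lmscale=1.0):
    """Combine the scores of one SLF link as described in the message.

    acoustic -- value of the link's a= field
    language -- value of the link's l= field
    lmscale  -- value of the lmscale= header field (assumed default: 1.0)
    """
    # Hypothetical helper: only the formula itself is from the message.
    return acoustic + language * lmscale
```

With the toy lattice attached to this thread every link has a=0 and l=0, so every combined score is 0 whatever lmscale is.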
some more testing:
 * moses is able to translate simple toy lattices generated with this tool
   (see 'example.*' files attached)
 * added a feature for removing noise information from the input graph (can be
   disabled with -C, see help)
 * moses cannot translate "real world" graphs generated with this tool. See for
   example:
   http://perso.crans.org/raybaud/realworld.slf.gz

Here is the output:

sylv...@markov /global/markov/raybauds/XP/S2T $ zcat realworld.slf.gz | ~/loria/sphinx2plf/slf2plf.py | nice moses -inputtype 2 -weight-i 1 -f working-dir/filtered-OlliRehn-1/moses.ini > out
Defined parameters (per moses.ini or switch):
        config: working-dir/filtered-OlliRehn-1/moses.ini
        distortion-file: 0-0 msd-bidirectional-fe 6 /global/markov/raybauds/XP/S2T/working-dir/filtered-OlliRehn-1/reordering-table
        distortion-limit: 6
        input-factors: 0
        inputtype: 2
        lmodel-file: 0 0 5 /home/sylvain/softs/moses/working-dir/lm/europarl.lm
        mapping: 0 T 0
        ttable-file: 0 0 5 /global/markov/raybauds/XP/S2T/working-dir/filtered-OlliRehn-1/phrase-table
        ttable-limit: 20
        weight-d: 0.054012 0.196064 -0.003181 0.081071 0.086735 0.062332 0.090617
        weight-i: 1
        weight-l: 0.064254
        weight-t: 0.013876 0.081935 0.059305 0.013773 0.145934
        weight-w: 0.046909
Loading lexical distortion models...
have 1 models
Creating lexical reordering...
weights: 0.196 -0.003 0.081 0.087 0.062 0.091
binary file loaded, default OFF_T: -1
Created lexical orientation reordering
Start loading LanguageModel /home/sylvain/softs/moses/working-dir/lm/europarl.lm : [0.000] seconds
Finished loading LanguageModels : [13.000] seconds
specified equal numbers of link parameters and insertion weights, not using non-epsilon 'real' word link count.
Start loading PhraseTable /global/markov/raybauds/XP/S2T/working-dir/filtered-OlliRehn-1/phrase-table : [13.000] seconds
filePath: /global/markov/raybauds/XP/S2T/working-dir/filtered-OlliRehn-1/phrase-table
using binary phrase tables for idx 0
reading bin ttable
size of OFF_T 8
binary phrasefile loaded, default OFF_T: -1
Finished loading phrase tables : [13.000] seconds
IO from STDOUT/STDIN
Created input-output object : [13.000] seconds
PCN/PLF parse error: expected , after string
PCN/PLF parse error: expected ( at start of cn alt block
PCN/PLF parse error: expected ( at start of cn alt block
PCN/PLF parse error: expected ( at start of cn alt block
PCN/PLF parse error: expected ( at start of cn alt block
PCN/PLF parse error: expected ( at start of cn alt block
PCN/PLF parse error: expected ( at start of cn alt block
PCN/PLF parse error: expected ( at start of cn alt block
PCN/PLF parse error: expected ( at start of cn alt block
PCN/PLF parse error: expected ( at start of cn alt block
PCN/PLF parse error: expected ( at start of cn alt block
PCN/PLF parse error: expected ( at start of cn alt block
PCN/PLF parse error: expected ( at start of cn alt block
End. : [14.000] seconds
confusion net statistics:
 created: 1
 destroyed: 1
 succ. read: 0
 columns: 0
 words: 0
 avg. word/column: -nan
 avg. cols/sent: -nan

I'm investigating it, but being unable to reproduce on small graphs makes it
difficult to track the bug.

regards,

-- 
Sylvain
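One possible way to narrow down where a large PLF line goes wrong, offered only as a sketch: PLF is written as a nested tuple that Python can read, so a generated line can be checked outside Moses first. The helper name `check_plf` is hypothetical, and this check rests on the assumption that Python's literal syntax is close enough to what the Moses PCN/PLF parser accepts; the two may differ in details such as escape handling, so a clean result here does not guarantee that Moses will accept the line.

```python
import ast


def check_plf(line):
    """Return a list of problems found in one PLF line (empty if none found).

    Hypothetical helper, not part of Moses or of slf2plf.
    """
    try:
        lattice = ast.literal_eval(line.strip())
    except (SyntaxError, ValueError) as exc:
        return ["not a valid nested tuple literal: %s" % exc]
    if not isinstance(lattice, tuple):
        return ["top level is not a tuple of columns"]
    problems = []
    for col_idx, column in enumerate(lattice):
        if not isinstance(column, tuple) or not column:
            problems.append("column %d is empty or not a tuple" % col_idx)
            continue
        for alt in column:
            # Each alternative is expected to be (word, score, distance).
            well_formed = (
                isinstance(alt, tuple) and len(alt) == 3
                and isinstance(alt[0], str)
                and isinstance(alt[1], (int, float))
                and isinstance(alt[2], int) and alt[2] >= 1
            )
            if not well_formed:
                problems.append("column %d: bad alternative %r" % (col_idx, alt))
            elif col_idx + alt[2] > len(lattice):
                # The link would end beyond the final node of the lattice.
                problems.append("column %d: link %r overshoots the end" % (col_idx, alt))
    return problems
```

A word containing an unescaped quote character is one example of input that this check flags as a syntax problem. Whether that is what happens in realworld.slf.gz has not been established here.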
VERSION=1.1
UTTERANCE=1
lmscale=1
wdpenalty=0.0
#
# Size line
#
N=5 L=5
#
# Node definitions
#
I=0
I=1
I=2
I=3
I=4
#
# Link definitions.
#
J=0 S=0 E=1 W=bonjour a=0 l=0
J=1 S=1 E=2 W=monsieur a=0 l=0
J=2 S=2 E=3 W=le a=0 l=0
J=3 S=3 E=4 W=président a=0 l=0
J=4 S=0 E=2 W=monsieur a=0 l=0
mr the president
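To make the SLF-to-PLF mapping concrete for the toy lattice above, here is a minimal Python sketch. It is not the attached slf2plf script. It assumes that words and scores are on links, that node numbers are already in topological order, that every node except the last has an outgoing link, and that there are no sublattices. The names `slf_to_plf` and `EXAMPLE_SLF` are illustrative, and the combined score is written out unchanged.

```python
EXAMPLE_SLF = """\
VERSION=1.1
lmscale=1
N=5 L=5
I=0
I=1
I=2
I=3
I=4
J=0 S=0 E=1 W=bonjour a=0 l=0
J=1 S=1 E=2 W=monsieur a=0 l=0
J=2 S=2 E=3 W=le a=0 l=0
J=3 S=3 E=4 W=président a=0 l=0
J=4 S=0 E=2 W=monsieur a=0 l=0
"""


def slf_to_plf(slf_text):
    """Convert a simple SLF lattice to one PLF line (sketch, see assumptions)."""
    lmscale = 1.0
    links = []  # (start node, end node, word, acoustic, language)
    for line in slf_text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        fields = dict(tok.split("=", 1) for tok in line.split() if "=" in tok)
        if "lmscale" in fields:
            lmscale = float(fields["lmscale"])
        if "J" in fields:  # a link definition
            links.append((int(fields["S"]), int(fields["E"]), fields["W"],
                          float(fields["a"]), float(fields["l"])))
    # Group links by start node; each start node becomes one PLF column.
    columns = {}
    for start, end, word, acoustic, language in links:
        score = acoustic + language * lmscale
        # The third PLF field is the distance to the target node.
        columns.setdefault(start, []).append((word, score, end - start))
    parts = []
    for node in sorted(columns):
        # Note: quote characters inside words are NOT escaped in this sketch.
        alts = "".join("('%s',%s,%d)," % alt for alt in columns[node])
        parts.append("(%s)," % alts)
    return "(%s)" % "".join(parts)
```

Calling `slf_to_plf(EXAMPLE_SLF)` gives a lattice with four columns, in which the first column holds both 'bonjour' (distance 1) and 'monsieur' (distance 2).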
_______________________________________________
Moses-support mailing list
Moses-support@mit.edu
http://mailman.mit.edu/mailman/listinfo/moses-support