Hi,

I have trained an SMT model using Moses on my own data. My goal is to build an incremental model so that I can add more data later. I followed the instructions on the Moses web page about incremental training, and my data is preprocessed and prepared as described there. However, when I try to update and compute the new alignments, I get the following error, which I can't really understand:

    [sent:2900000]
    Reading more sentence pairs into memory ...
    Reading more sentence pairs into memory ...
    [sent:3000000]
    Reading more sentence pairs into memory ...
    Reading more sentence pairs into memory ...
    [sent:3100000]
    Reading more sentence pairs into memory ...
    Reading more sentence pairs into memory ...
    Reading more sentence pairs into memory ...
    Model1: (1) TRAIN CROSS-ENTROPY 15.768 PERPLEXITY 55801.4
    Model1: (1) VITERBI TRAIN CROSS-ENTROPY 19.1387 PERPLEXITY 577188
    Model 1 Iteration: 1 took: 811 seconds
    Entire Model1 Training took: 811 seconds
    Loading HMM alignments from file.
    *** Error in `/opt/inc-giza-pp/GIZA++-v2/GIZA++': malloc(): memory corruption: 0x0000000089e29700 ***
    ======= Backtrace: =========
    [0x5bbe01]
    [0x5c605a]
    [0x5c7fe1]
    [0x4e3288]
    [0x4a0dad]
    [0x4a3816]
    [0x4a430c]
    [0x49890e]
    [0x436bb4]
    [0x40396b]
    [0x598f56]
    [0x59914a]
    [0x404ad9]
    ======= Memory map: ========
    00400000-0072d000 r-xp 00000000 08:02 3278867 /opt/inc-giza-pp/GIZA++-v2/GIZA++
    0092c000-00936000 rw-p 0032c000 08:02 3278867 /opt/inc-giza-pp/GIZA++-v2/GIZA++
    00936000-00940000 rw-p 00000000 00:00 0
    01cfe000-17a52d000 rw-p 00000000 00:00 0 [heap]
    7f0894000000-7f089402d000 rw-p 00000000 00:00 0
    7f089402d000-7f0898000000 ---p 00000000 00:00 0
    7f089a237000-7f08a21f0000 rw-p 00000000 00:00 0
    7f08a3bdf000-7f08a5dbc000 rw-p 00000000 00:00 0
    7ffdad243000-7ffdad264000 rw-p 00000000 00:00 0 [stack]
    7ffdad3b9000-7ffdad3bc000 r--p 00000000 00:00 0 [vvar]
    7ffdad3bc000-7ffdad3be000 r-xp 00000000 00:00 0 [vdso]
    ffffffffff600000-ffffffffff601000 r-xp 00000000 00:00 0 [vsyscall]
    3-update-alingments.sh: line 2: 3236 Aborted (core dumped) /opt/inc-giza-pp/GIZA++-v2/GIZA++ giza.conf.2

I don't know whether this is a GIZA++ issue (it's the incremental GIZA adaptation) or something related to the previous data preparation steps. The following instructions can be found on the web page regarding data preparation.

    snt2cooc

    $ $INC_GIZA_PP/bin/snt2cooc.out <new-source-vcb> <new-target-vcb> <new-source_target.snt> \
        <previous-source-target.cooc > new.source-target.cooc
    $ $INC_GIZA_PP/bin/snt2cooc.out <new-target-vcb> <new-source-vcb> <new-target_source.snt> \
        <previous-target-source.cooc > new.target-source.cooc

    This commands is run once in the source-target direction, and once in the target-source direction. The previous cooccurrence files can be found in <experiment-dir>/training/giza.<run>/<target-lang>-<source-lang>.cooc and <experiment-dir>/training/giza-inverse.<run>/<source-lang>-<target-lang>.cooc.

However, it is not clear to me whether the two files mentioned in the last paragraph of those instructions are listed in the correct order. Should I use the first file for the first command and the second file for the second command, or do I need to take the source-target order in the file names into account? Maybe this is related to the error shown above.

Thank you in advance.

Ander Corral Naves
ITZULPENGINTZARAKO TEKNOLOGIAK
www.elhuyar.eus
_______________________________________________
Moses-support mailing list
Moses-support@mit.edu
http://mailman.mit.edu/mailman/listinfo/moses-support