Dear Tom,

Thank you for sharing your finding. It does not apply in this case, since I re-compiled the code to build the initial Moses 4.0 model. The moses binary has not changed since then, and although I am observing different scores, they are better when the alignment flags are included. I am still waiting for the de-en results with the "-print-alignment-info" flag.
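
To localize which sentences actually change between the two runs, the decoder outputs could be compared line by line. The following is only a minimal sketch under stated assumptions: the function names are hypothetical, the two inputs are the outputs of the run without and with the flags, and any alignment information printed inline is assumed to follow a "|||" separator (an assumption about the output format, adjust if it differs).

```python
from itertools import zip_longest


def strip_alignment(line, separator="|||"):
    """Drop anything after the separator (ASSUMED to introduce alignment info)."""
    return line.split(separator, 1)[0].strip()


def differing_lines(baseline, with_flags, separator="|||"):
    """Return (line_number, baseline_text, flagged_text) for each mismatch."""
    diffs = []
    # zip_longest pads the shorter output with "" so missing lines show up too.
    pairs = zip_longest(baseline, with_flags, fillvalue="")
    for number, (plain, flagged) in enumerate(pairs, start=1):
        plain = strip_alignment(plain, separator)
        flagged = strip_alignment(flagged, separator)
        if plain != flagged:
            diffs.append((number, plain, flagged))
    return diffs


def compare_files(baseline_path, with_flags_path):
    """Compare two decoder output files (hypothetical paths) line by line."""
    with open(baseline_path, encoding="utf-8") as f_plain, \
         open(with_flags_path, encoding="utf-8") as f_flags:
        return differing_lines(f_plain.read().splitlines(),
                               f_flags.read().splitlines())
```

An entry whose flagged text is empty would correspond to a lost translation (blank line), which is the symptom to look for first.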

I previously tried to debug a decentralized Moses server-client setup that showed similar symptoms, where the errors could also come from additional sources such as network interruptions or buffer synchronization issues. With a binarized version you get a translation, but the translation options are somewhat fixed. Could Moses provide a better translation? It turns out, for instance, that truecasing before detruecasing improves the scores by 0.002 BLEU on average over 8 translation directions in WMT18.

Regards,
Ergun
bicici.github.com

On Fri, Aug 24, 2018 at 5:55 PM Tom Hoar <tahoar@slate.rocks> wrote:

> I remember 3 years ago, I reported a similar (same?) problem with the
> --print-alignment-inf flag, without EMS. At the time, I was using the legacy
> binarized translation and reordering table and everything was great. Then,
> I started testing the compact binarized format. The flag caused
> translations to change and some were even lost (blank lines). No one on the
> support list knew of any reason and I didn't have bandwidth to
> troubleshoot. Instead, I continued using the legacy binarized files. Maybe
> try changing to the legacy binarized files and see if the problem
> disappears. This could help you narrow down where to look.
>
> Best regards,
> Tom Hoar
> *Slate Rocks, LLC*
> Web: https://www.slate.rocks
> Thailand Mobile: +66 87 345-1875 <+66873451875>
> Skype: tahoar
>
> On 8/24/2018 9:31 PM, moses-support-requ...@mit.edu wrote:
>
> Date: Fri, 24 Aug 2018 15:31:14 +0100
> From: Hieu Hoang <hieuho...@gmail.com>
> Subject: Re: [Moses-support] Fwd: Different translations are obtained
> from the same decoder without alignment information
> To: Ergun Bicici <bic...@gmail.com>
> Cc: moses-support <moses-support@mit.edu>
> Message-ID:
> <caekmkbhwykypzsqdsl-wcglqwjsydeaxbgvntkbpc17e7zu...@mail.gmail.com>
> Content-Type: text/plain; charset="utf-8"
>
> could you run with alignments, but WITHOUT -unknown-word-prefix UNK.
>
> alignments shouldn't change the translation but the OOV prefix may do
>
> Hieu Hoang
> http://statmt.org/hieu
>
> On Fri, 24 Aug 2018 at 15:29, Ergun Bicici <bic...@gmail.com> wrote:
>
> ok, thank you. I'll upload and send you a link.
>
> On Fri, Aug 24, 2018 at 5:27 PM Hieu Hoang <hieuho...@gmail.com> wrote:
>
> that would be a bug.
>
> could you please make the model and input files available for download.
> I'll check it out
>
> Hieu Hoang
> http://statmt.org/hieu
>
> On Fri, 24 Aug 2018 at 15:15, Ergun Bicici <bic...@gmail.com> wrote:
>
> only the evaluation decoding steps are repeated, which are steps 10, 9,
> and 7 in the following EMS output:
> 48 TRAINING:consolidate -> re-using (1)
> 47 TRAINING:prepare-data -> re-using (1)
> 46 TRAINING:run-giza -> re-using (1)
> 45 TRAINING:run-giza-inverse -> re-using (1)
> 44 TRAINING:symmetrize-giza -> re-using (1)
> 43 TRAINING:build-lex-trans -> re-using (1)
> 40 TRAINING:build-osm -> re-using (1)
> 39 TRAINING:extract-phrases -> re-using (1)
> 38 TRAINING:build-reordering -> re-using (1)
> 37 TRAINING:build-ttable -> re-using (1)
> 34 TRAINING:create-config -> re-using (1)
> 28 TUNING:truecase-input -> re-using (1)
> 24 TUNING:truecase-reference -> re-using (1)
> 21 TUNING:filter -> re-using (1)
> 20 TUNING:apply-filter -> re-using (1)
> 19 TUNING:tune -> re-using (1)
> 18 TUNING:apply-weights -> re-using (1)
> 15 EVALUATION:test:truecase-input -> re-using (1)
> 12 EVALUATION:test:filter -> re-using (1)
> 11 EVALUATION:test:apply-filter -> re-using (1)
> *10 EVALUATION:test:decode -> run*
> *9 EVALUATION:test:remove-markup -> run*
> *7 EVALUATION:test:detruecase-output -> run*
> 3 EVALUATION:test:multi-bleu-c -> run
> 2 EVALUATION:test:analysis-coverage -> re-using (1)
> 1 EVALUATION:test:analysis-precision -> run
>
> On Fri, Aug 24, 2018 at 4:39 PM Hieu Hoang <hieuho...@gmail.com> wrote:
>
> are you rerunning tuning for each case? Or are you using exactly the
> same moses.ini file for the with and without alignment experiments?
>
> Hieu Hoang
> http://statmt.org/hieu
>
> On Fri, 24 Aug 2018 at 14:34, Ergun Bicici <bic...@gmail.com> wrote:
>
> Dear Moses maintainers,
>
> I discovered that the translations obtained differ when alignment
> flags (--mark-unknown --unknown-word-prefix UNK --print-alignment-inf)
> are used. Comparison table is attached (en-ru and ru-en are being
> recomputed). We expect them to be the same since alignment flags only print
> additional information and they are not supposed to alter decoding. In
> both cases, the same EMS system was re-run, with or without the alignment
> information flags.
>
> - Average of the absolute difference is 0.0094 BLEU (about 1 BLEU
> point).
> - Average of the difference is 0.0051 BLEU (about 0.5 BLEU points,
> results are better with alignment flags).
>
> /opt/Programs/SMT/moses/mosesdecoder/bin/moses --version
>
> Moses code version (git tag or commit hash):
> mmt-mvp-v0.12.1-2775-g65c75ff07-dirty
> Libraries used:
> Boost version 1.62.0
>
> git status
> On branch RELEASE-4.0
> Your branch is up to date with 'origin/RELEASE-4.0'.
>
> Note: Using alignment information to recase tokens was tried in [1]
> for en-fi and en-tr to claim positive results. We tried this method in all
> translation directions we considered and, as can be seen in the align row,
> it only improves the performance for tr-en and en-tr; for tr-en, Moses
> provides better translations without the alignment flags.
>
> [1] The JHU Machine Translation Systems for WMT 2016.
> Shuoyang Ding, Kevin Duh, Huda Khayrallah, Philipp Koehn and Matt Post.
> http://www.statmt.org/wmt16/pdf/W16-2310.pdf
>
> Best Regards,
> Ergun
>
> Ergun Biçici
> http://bicici.github.com/ <http://ergunbicici.blogspot.com/>
>
> _______________________________________________
> Moses-support mailing list
> Moses-support@mit.edu
> http://mailman.mit.edu/mailman/listinfo/moses-support
>
> --
> Regards,
> Ergun
>
> -------------- next part --------------
> An HTML attachment was scrubbed...
> URL:
> http://mailman.mit.edu/mailman/private/moses-support/attachments/20180824/2bd1c008/attachment.html
> -------------- next part --------------
> A non-text attachment was scrubbed...
> Name: image.png
> Type: image/png
> Size: 59618 bytes
> Desc: not available
> Url :
> http://mailman.mit.edu/mailman/private/moses-support/attachments/20180824/2bd1c008/attachment.png

--
Regards,
Ergun