Re: [Moses-support] Fwd: Different translations are obtained from the same decoder without alignment information

Tom Hoar Fri, 24 Aug 2018 07:58:46 -0700

I remember that 3 years ago, I reported a similar (same?) problem with the --print-alignment-inf flag, without EMS. At the time, I was using the legacy binarized translation and reordering tables and everything was great. Then, I started testing the compact binarized format. The flag caused translations to change and some were even lost (blank lines). No one on the support list knew of any reason and I didn't have bandwidth to troubleshoot. Instead, I continued using the legacy binarized files.

Maybe try changing to the legacy binarized files and see if the problem disappears. This could help you narrow down where to look.


Best regards,
Tom Hoar
*Slate Rocks, LLC*
Web: https://www.slate.rocks
Thailand Mobile: +66 87 345-1875
Skype: tahoar

On 8/24/2018 9:31 PM, moses-support-requ...@mit.edu wrote:

Date: Fri, 24 Aug 2018 15:31:14 +0100
From: Hieu Hoang<hieuho...@gmail.com>
Subject: Re: [Moses-support] Fwd: Different translations are obtained
        from the same decoder without alignment information
To: Ergun Bicici<bic...@gmail.com>
Cc: moses-support<moses-support@mit.edu>
Message-ID:
        <caekmkbhwykypzsqdsl-wcglqwjsydeaxbgvntkbpc17e7zu...@mail.gmail.com>
Content-Type: text/plain; charset="utf-8"

could you run with alignments, but WITHOUT -unknown-word-prefix UNK.

alignments shouldn't change the translation but the OOV prefix may do
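
One way to check this is to compare the two decoder outputs line by line. The following is only a minimal sketch: the function names are hypothetical, and it assumes that the run with alignment output appends the alignment points after a " ||| " separator on each line, which should be verified against the actual output files.

```python
def strip_alignment(line, sep=" ||| "):
    """Drop everything after the first separator (assumed alignment field)."""
    return line.split(sep, 1)[0].strip()


def diff_outputs(with_flags, without_flags):
    """Compare two decoder outputs given as lists of lines.

    Returns (differing, blank): 1-based line numbers where the translations
    differ, and where the run with flags produced an empty translation.
    """
    differing, blank = [], []
    for i, (a, b) in enumerate(zip(with_flags, without_flags), start=1):
        a = strip_alignment(a)
        b = b.strip()
        if not a:
            blank.append(i)
        if a != b:
            differing.append(i)
    return differing, blank
```

Reading each output file into a list of lines and passing both lists to diff_outputs gives the sentences to inspect first. If the outputs only differ on lines containing OOV words, that would point at the prefix rather than the alignment flag.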

Hieu Hoang
http://statmt.org/hieu


On Fri, 24 Aug 2018 at 15:29, Ergun Bicici<bic...@gmail.com>  wrote:

ok, thank you. I'll upload and send you a link.

On Fri, Aug 24, 2018 at 5:27 PM Hieu Hoang<hieuho...@gmail.com>  wrote:

that would be a bug.

could you please make the model and input files available for download.
I'll check it out

Hieu Hoang
http://statmt.org/hieu


On Fri, 24 Aug 2018 at 15:15, Ergun Bicici<bic...@gmail.com>  wrote:

Only the evaluation decoding steps are repeated, i.e. steps 10, 9,
and 7 in the following EMS output:
48 TRAINING:consolidate ->      re-using (1)
47 TRAINING:prepare-data ->     re-using (1)
46 TRAINING:run-giza -> re-using (1)
45 TRAINING:run-giza-inverse -> re-using (1)
44 TRAINING:symmetrize-giza ->  re-using (1)
43 TRAINING:build-lex-trans ->  re-using (1)
40 TRAINING:build-osm ->        re-using (1)
39 TRAINING:extract-phrases ->  re-using (1)
38 TRAINING:build-reordering -> re-using (1)
37 TRAINING:build-ttable ->     re-using (1)
34 TRAINING:create-config ->    re-using (1)
28 TUNING:truecase-input ->     re-using (1)
24 TUNING:truecase-reference -> re-using (1)
21 TUNING:filter ->     re-using (1)
20 TUNING:apply-filter ->       re-using (1)
19 TUNING:tune ->       re-using (1)
18 TUNING:apply-weights ->      re-using (1)
15 EVALUATION:test:truecase-input ->    re-using (1)
12 EVALUATION:test:filter ->    re-using (1)
11 EVALUATION:test:apply-filter ->      re-using (1)



*10 EVALUATION:test:decode ->    run*
*9 EVALUATION:test:remove-markup ->      run*
*7 EVALUATION:test:detruecase-output ->  run*
3 EVALUATION:test:multi-bleu-c ->       run
2 EVALUATION:test:analysis-coverage ->  re-using (1)
1 EVALUATION:test:analysis-precision -> run


On Fri, Aug 24, 2018 at 4:39 PM Hieu Hoang<hieuho...@gmail.com>  wrote:

are you rerunning tuning for each case? Or are you using exactly the
same moses.ini file for the with and without alignment experiments?

Hieu Hoang
http://statmt.org/hieu


On Fri, 24 Aug 2018 at 14:34, Ergun Bicici<bic...@gmail.com>  wrote:

Dear Moses maintainers,

I discovered that the translations obtained differ when alignment
flags (--mark-unknown --unknown-word-prefix UNK --print-alignment-inf)
are used. A comparison table is attached (en-ru and ru-en are being
recomputed). We expect them to be the same since alignment flags only print
additional information and are not supposed to alter decoding. In
both cases, the same EMS system was re-run, once with the alignment
information flags and once without.

    - The average absolute difference is 0.0094 BLEU (about 1 BLEU
    point on the 0-100 scale).
    - The average signed difference is 0.0051 BLEU (about 0.5 BLEU points
    on the 0-100 scale; results are better with alignment flags).
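
For clarity, the two averages above can be computed as in the following sketch. The function name is hypothetical and the scores in the usage note are placeholders, not values from the attached table; BLEU is assumed to be on the 0-1 scale, so multiplying by 100 gives BLEU points.

```python
def bleu_differences(with_flags, without_flags):
    """Return (mean absolute difference, mean signed difference).

    Both arguments map a translation direction to a BLEU score on the
    0-1 scale. A positive signed difference means the run with alignment
    flags scored higher on average.
    """
    diffs = [with_flags[k] - without_flags[k] for k in with_flags]
    n = len(diffs)
    mean_abs = sum(abs(d) for d in diffs) / n
    mean_signed = sum(diffs) / n
    return mean_abs, mean_signed
```

For example, with placeholder scores {"xx-yy": 0.25, "yy-xx": 0.20} with flags and {"xx-yy": 0.24, "yy-xx": 0.21} without, the absolute differences do not cancel while the signed differences do, which is why the two averages reported above differ.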


/opt/Programs/SMT/moses/mosesdecoder/bin/moses --version

Moses code version (git tag or commit hash):
   mmt-mvp-v0.12.1-2775-g65c75ff07-dirty
Libraries used:
      Boost  version 1.62.0

git status
On branch RELEASE-4.0
Your branch is up to date with 'origin/RELEASE-4.0'.


Note: Using alignment information to recase tokens was tried in [1]
for en-fi and en-tr, with positive results claimed. We tried this method in
all translation directions we considered and, as can be seen in the align
row, it only improves the performance for tr-en and en-tr; for tr-en, Moses
provides better translations without the alignment flags.
[1] The JHU Machine Translation Systems for WMT 2016
Shuoyang Ding, Kevin Duh, Huda Khayrallah, Philipp Koehn and Matt Post
http://www.statmt.org/wmt16/pdf/W16-2310.pdf


Best Regards,
Ergun

Ergun Biçici
http://bicici.github.com/  <http://ergunbicici.blogspot.com/>

_______________________________________________
Moses-support mailing list
Moses-support@mit.edu
http://mailman.mit.edu/mailman/listinfo/moses-support

-------------- next part --------------
A non-text attachment was scrubbed...
Name: image.png
Type: image/png
Size: 59618 bytes
Desc: not available
Url: http://mailman.mit.edu/mailman/private/moses-support/attachments/20180824/2bd1c008/attachment.png
