OK. I can't find the input sentence you gave as an example.

However, I think I know the issue. It's not a bug, but we haven't made clear that the input and output sentences in the phrase-based and chart decoders are slightly different.

In the chart decoder, there are implied <s> and </s> at the beginning and end of each input and output sentence. They are not displayed, but the alignment still refers to them. So the input sentence
   "darya alexandrovna went alone to her room ."
has 8 words, but in the decoder, it's actually
   "<s> darya alexandrovna went alone to her room . </s>"
which has 10 tokens, indexed 0 to 9. That is why the highest source index in your alignment is 9: it points at </s>.
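If you need indices that match the visible words, a minimal sketch of the conversion could look like the following. It assumes the alignment line uses source-target pairs, that index 0 is <s> and the last index is </s> on both sides, and that the output sentence here has 7 visible words (inferred from the highest target index, 8, not checked against the actual translation). The function name is made up for illustration and is not part of Moses.

```python
def strip_boundary_alignment(alignment_line, src_len, tgt_len):
    """Map chart-decoder alignment points to 0-based indices of visible words.

    Assumption: index 0 is the implied <s> and index len + 1 is the implied
    </s> on both the source and the target side.
    """
    pairs = []
    for point in alignment_line.split():
        src, tgt = (int(i) for i in point.split("-"))
        # Drop points that touch the implied boundary markers.
        if src in (0, src_len + 1) or tgt in (0, tgt_len + 1):
            continue
        # Shift by one so that the first visible word gets index 0.
        pairs.append((src - 1, tgt - 1))
    return pairs


line = "0-0 1-1 2-1 3-6 4-3 5-2 6-5 7-4 8-7 9-8"
print(strip_boundary_alignment(line, src_len=8, tgt_len=7))
# [(0, 0), (1, 0), (2, 5), (3, 2), (4, 1), (5, 4), (6, 3), (7, 6)]
```

After the shift, the highest source index is 7, which matches the 8-word input sentence.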

Does that explain your problem?

On 02/11/2015 17:19, Arefeh Kazemi wrote:

Hi Hieu


I have not tuned the system, so I use the initial moses.ini file, which is attached.

This is the command:


 nohup nice ~/mosesdecoder/bin/moses_chart \
   -f ~/moses.ini -alignment-output-file ~/align.txt \
   < ~/toyCorpus/mizan-test-toy.en \
   > ~/mizan-translated.fa \
   2> ~/mizan-test.out

 ~/mosesdecoder/scripts/generic/multi-bleu.perl \
   -lc ~/toyCorpus/mizan-test-toy.fa \
   < ~/mizan-translated.fa


Thanks again

Arefeh


On 2 November 2015 at 20:07, Hieu Hoang <hieuho...@gmail.com> wrote:

    Err. What exactly do I run to reproduce the problem? What is the
    input? Which ini file? I don't need the extract file or the corpus.


    On 01/11/2015 11:41, Arefeh Kazemi wrote:

    extract.inv.sorted.gz
    <https://drive.google.com/a/dcu.ie/file/d/0B37sY2C6IhcjLVp6X2ZQNm5TUHM/view?usp=drive_web>
    extract.sorted.gz
    <https://drive.google.com/a/dcu.ie/file/d/0B37sY2C6IhcjaHF6Nms5dEtJWFE/view?usp=drive_web>
    other.zip
    <https://drive.google.com/a/dcu.ie/file/d/0B37sY2C6IhcjdFh2NkhIaVFEQVU/view?usp=drive_web>

    Hi Hieu
    Thanks for the reply.
    My original files are huge, so I attached a toy model for which the
    mismatch also happens.

    Thanks again.
    Arefeh

    toyCorpus.zip
    <https://drive.google.com/a/dcu.ie/file/d/0B37sY2C6IhcjS0dSSnA2NjVBTnc/view?usp=drive_web>
    toyLM.zip
    <https://drive.google.com/a/dcu.ie/file/d/0B37sY2C6IhcjM2xiVkVYaDY3ZFk/view?usp=drive_web>

    On 1 November 2015 at 01:10, Hieu Hoang <hieuho...@gmail.com> wrote:

        That should never happen. Can you please make the model and
        input files available for download so I can check it?


        On 31/10/2015 10:30, Arefeh Kazemi wrote:
        Hi

        I needed the word alignment between the source and the
        output translation, so I used the -alignment-output-file
        parameter. It gives me an alignment file, but there are some
        mismatches between the source sentence length and the
        alignment: the highest index in the alignment is
        greater than the sentence length.
        For example, for the source sentence
        "darya alexandrovna went alone to her room ."
        the alignment is:
        0-0 1-1 2-1 3-6 4-3 5-2 6-5 7-4 8-7 9-8

        I checked the sentences but there is no strange string in them.

        Does anyone know why this happens?!

        Regards
        Arefeh

        /

        *Email Disclaimer*

        /"This e-mail and any files transmitted with it are
        confidential and are intended solely for use by the
        addressee. Any unauthorised dissemination, distribution or
        copying of this message and any attachments is strictly
        prohibited. If you have received this e-mail in error,
        please notify the sender and delete the message. Any views
        or opinions presented in this e-mail may solely be the views
        of the author and cannot be relied upon as being those of
        Dublin City University. E-mail communications such as this
        cannot be guaranteed to be virus-free, timely, secure or
        error-free and Dublin City University does not accept
        liability for any such matters or their consequences.
        Please consider the environment before printing this e-mail."/

        *Séanadh Ríomhphoist*

        /"Tá an ríomhphost seo agus aon chomhad a sheoltar leis faoi
        rún agus is lena úsáid ag an seolaí agus sin amháin é. Tá
        cosc iomlán ar scaipeadh, dháileadh nó chóipeáil
        neamhúdaraithe ar an teachtaireacht seo agus ar aon
        cheangaltán atá ag dul leis. Má tá an ríomhphost seo faighte
        agat trí dhearmad cuir sin in iúl le do thoil don seoltóir
        agus scrios an teachtaireacht. D’fhéadfadh sé gurb iad
        tuairimí an údair agus sin amháin atá in aon tuairimí no
        dearcthaí atá curtha i láthair sa ríomhphost seo agus níor
        chóir glacadh leo mar thuairimí nó dhearcthaí Ollscoil
        Chathair Bhaile Átha Cliath. Ní ghlactar leis go bhfuil
        cumarsáid ríomhphoist den sórt seo saor ó víreas, in am,
        slán, nó saor ó earráid agus ní ghlacann Ollscoil Chathair
        Bhaile Átha Cliath le dliteanas in aon chás den sórt sin ná
        as aon iarmhairt a d’eascródh astu. Cuimhnigh ar an
        timpeallacht le do thoil sula gcuireann tú an ríomhphost seo
        i gcló."/

        /


        _______________________________________________
        Moses-support mailing list
        Moses-support@mit.edu
        http://mailman.mit.edu/mailman/listinfo/moses-support

-- Hieu Hoang
        http://www.hoang.co.uk/hieu








--
Hieu Hoang
http://www.hoang.co.uk/hieu

