I have used EMS, with these settings:

training-options = "-mgiza -mgiza-cpus 16 -sort-buffer-size 10G -sort-compress gzip -cores 16 -alt-direct-rule-score-2 -score-command score-stsg"

extract-settings = "--T2S --STSG --AllowUnary --MaxScope 1000 --MaxNodes 30 --MaxRuleDepth 7 --MaxRuleSize 7"

score-settings = " --GoodTuring --LowCountFeature --MinCountHierarchical 2"

decoder-settings = "-search-algorithm 7 -feature-overwrite 'TranslationModel0 table-limit=200' -threads 8"

Am I missing something?

Cheers, Annette

On 04/20/2016 12:20 PM, Philip Williams wrote:
Yes, that sounds like the problem. For tree-to-string, you should give the decoder the option -search-algorithm 7.
Phil
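
For anyone checking this on their own setup, the following is a minimal sketch of reading the search algorithm out of a moses.ini. It assumes the usual layout in which a [section] header is followed by its values on separate lines. The function name read_moses_ini and the sample file content are made up for illustration, and options passed only on the decoder command line will not show up in the file.

```python
def read_moses_ini(text):
    """Group the value lines of a moses.ini-style file under their [section] header."""
    sections = {}
    current = None
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        if line.startswith("[") and line.endswith("]"):
            current = line[1:-1]  # section name without the brackets
            sections.setdefault(current, [])
        elif current is not None:
            sections[current].append(line)
    return sections


# Hypothetical file content, not taken from the thread.
sample = """
# decoder configuration
[search-algorithm]
7

[threads]
8
"""

config = read_moses_ini(sample)
print(config.get("search-algorithm"))
```

If the section is missing, config.get returns None, which would suggest the decoder falls back to its default unless the option is supplied on the command line.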

On 20 Apr 2016, at 10:29, Hieu Hoang <hieuho...@gmail.com> wrote:

Ah, what was the exact command you used to do the extraction and the decoding? Can you also provide the moses.ini file you're using?

You might have stumbled upon an STSG extraction algorithm. That will require telling the decoder that it's STSG rather than SCFG.
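
The difference between the two rule formats can be illustrated with a small check on a rule-table line. This is only a sketch that approximates the condition named in the error message further down the thread (a token that starts with "[" and ends with "]" but has no second bracketed label); it is not the Moses loader itself, and the helper names split_rule and unpaired_nonterminals are hypothetical.

```python
def split_rule(line):
    """Split a rule-table line on the ||| field separator."""
    return [field.strip() for field in line.split("|||")]


def unpaired_nonterminals(side):
    """Return tokens shaped like [LABEL] that lack a second bracketed label.

    Assumption: a token counts as a non-terminal if it starts with '['
    and ends with ']'; it is 'unpaired' if no further '[' follows.
    """
    return [
        tok for tok in side.split()
        if len(tok) > 2 and tok[0] == "[" and tok[-1] == "]" and "[" not in tok[1:]
    ]


# First rule quoted in the thread, joined back onto one line.
rule = (
    "[S [AQ asumidos] [cag [sp [SP con] [sn [NP áfrica]]] [conj [CC y]] "
    "[SP] [sn [NP áfrica]]]] ||| und [X][X] Afrika [X] ||| "
    "0.0874939 0.69856 0.174988 0.364444 0.606531 ||| 3-0 4-1 5-2 ||| 4 2 2 ||| |||"
)
source, target = split_rule(rule)[:2]
print(unpaired_nonterminals(source))
```

On the source side of this rule the only token flagged is [SP], the same string the exception reports. The trailing [X] on the target side is the rule's left-hand-side label and would have to be set aside before applying the same check there.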

Hieu Hoang
http://www.hoang.co.uk/hieu

On 20 Apr 2016 10:57 am, "Annette Rios" <ar...@ifi.uzh.ch> wrote:

    The training data is in the right format, and the rule extraction
    works fine for most of it. There is a problem with this
    particular structure (coordinated preposition). The part of the
    tree that is relevant looks like this:

    <tree label="S"><tree label="AQ">asumidos</tree><tree
    label="cag"><tree label="sp"><tree label="SP">con</tree><tree
    label="sn"><tree label="NP">&#xE1;frica</tree></tree></tree><tree
    label="conj"><tree label="CC">y</tree></tree><tree
    label="SP">por</tree><tree label="sn"><tree
    label="NP">&#xE1;frica</tree></tree></tree>

    for which these rules are extracted (among others):

    [S [AQ asumidos] [cag [sp [SP con] [sn [NP áfrica]]] [conj [CC
    y]] [SP] [sn [NP áfrica]]]] ||| und [X][X] Afrika [X] |||
    0.0874939 0.69856 0.174988 0.364444 0.606531 ||| 3-0 4-1 5-2 |||
    4 2 2 ||| |||

    [S [AQ asumidos] [cag [sp [SP con] [sn [NP áfrica]]] [conj [CC
    y]] [SP] [sn [NP]]]] ||| und [X][X] [X][X] [X] ||| 0.00185172
    0.838272 0.174988 0.865553 0.606531 ||| 3-0 4-1 5-2 ||| 189 2 2
    ||| |||

    [S [AQ asumidos] [cag [sp [SP con] [sn [NP áfrica]]] [conj [CC
    y]] [SP] [sn]]] ||| und [X][X] [X][X] [X] ||| 0.00185172 0.838272
    0.174988 0.865553 0.606531 ||| 3-0 4-1 5-2 ||| 189 2 2 ||| |||


    These rules give me the following error when reading the phrase
    table:

    Exception: moses/Phrase.cpp:214 in void
    Moses::Phrase::CreateFromString(Moses::FactorDirection, const
    std::vector<long unsigned int>&, const StringPiece&,
    Moses::Word**) threw util::Exception because `nextPos ==
    string::npos'.
    Incorrect formatting of non-terminal. Should have 2 non-terms,
    eg. [X][X]. Current string: [SP]

    Thanks for the help.

    On 04/20/2016 08:17 AM, Hieu Hoang wrote:

        Your training data should be in a format that Moses
        understands, e.g.
            <tree label="NP"> <tree label="DET"> the </tree> <tree label="NN"> cat </tree> </tree>
        Currently, it looks like the training data is whatever came
        out of the parser.

        The syntax tutorial has a bit more information
        http://www.statmt.org/moses/?n=Moses.SyntaxTutorial

        On 18/04/2016 14:07, Annette Rios wrote:

            Hi all

            I'm trying to build a tree-to-string system, and I get
            this error from
            moses_chart:

            Exception: moses/Phrase.cpp:214 in void
            Moses::Phrase::CreateFromString(Moses::FactorDirection, const
            std::vector<long unsigned int>&, const StringPiece&,
            Moses::Word**)
            threw util::Exception because `nextPos == string::npos'.
            Incorrect formatting of non-terminal. Should have 2
            non-terms, eg.
            [X][X]. Current string: [SP]

            The corresponding lines in the phrase table look like this:

            [S [AQ asumidos] [cag [sp [SP con] [sn [NP áfrica]]] [conj [CC y]] [SP] [sn [NP áfrica]]]] ||| und [X][X] Afrika [X] ||| 0.0874939 0.69856 0.174988 0.364444 0.606531 ||| 3-0 4-1 5-2 ||| 4 2 2 ||| |||

            [S [AQ asumidos] [cag [sp [SP con] [sn [NP áfrica]]] [conj [CC y]] [SP] [sn [NP]]]] ||| und [X][X] [X][X] [X] ||| 0.00185172 0.838272 0.174988 0.865553 0.606531 ||| 3-0 4-1 5-2 ||| 189 2 2 ||| |||

            [S [AQ asumidos] [cag [sp [SP con] [sn [NP áfrica]]] [conj [CC y]] [SP] [sn]]] ||| und [X][X] [X][X] [X] ||| 0.00185172 0.838272 0.174988 0.865553 0.606531 ||| 3-0 4-1 5-2 ||| 189 2 2 ||| |||


            extracted from this parse:

            4    asumidos    asumido    a    AQ    gen=m|num=p|postype=qualificative|eagles=AQ0MPP    3    S    _    _
            5    con    con    s    SP    postype=preposition|eagles=SPS00    8    sp    _    _
            6    áfrica    áfrica    n    NP    postype=proper||eagles=NP00000    5    sn    _    _
            7    y    y    c    CC    postype=coordinating|eagles=CC    8    conj    _    _
            8    por    por    s    SP    postype=preposition|eagles=SPS00    4    cag    _    _
            9    áfrica    áfrica    n    NP    postype=proper||eagles=NP00000    8    sn    _    _

            converted to xml with conll2mosesxml.py:

                        <tree label="S">
                          <tree label="AQ">asumidos</tree>
                          <tree label="cag">
                            <tree label="sp">
                              <tree label="SP">con</tree>
                              <tree label="sn">
                                <tree label="NP">&#xE1;frica</tree>
                              </tree>
                            </tree>
                            <tree label="conj">
                              <tree label="CC">y</tree>
                            </tree>
                            <tree label="SP">por</tree>
                            <tree label="sn">
                              <tree label="NP">&#xE1;frica</tree>
                            </tree>
                          </tree>
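
To compare the tree above with the source side of the extracted rules, it can help to print it in bracketed form. The sketch below uses Python's standard xml.etree.ElementTree; the helper name to_brackets is hypothetical, and a closing </tree> for the S node has been added as an assumption, since the message only shows the relevant part of the sentence.

```python
import xml.etree.ElementTree as ET


def to_brackets(node):
    """Render a <tree label="..."> element as a bracketed string."""
    label = node.get("label")
    children = list(node)
    if not children:
        # Leaf: the label directly dominates a word.
        return f"[{label} {(node.text or '').strip()}]"
    return f"[{label} " + " ".join(to_brackets(child) for child in children) + "]"


# Fragment from the message; the final </tree> closing S is an added assumption.
xml = (
    '<tree label="S"><tree label="AQ">asumidos</tree>'
    '<tree label="cag"><tree label="sp"><tree label="SP">con</tree>'
    '<tree label="sn"><tree label="NP">&#xE1;frica</tree></tree></tree>'
    '<tree label="conj"><tree label="CC">y</tree></tree>'
    '<tree label="SP">por</tree>'
    '<tree label="sn"><tree label="NP">&#xE1;frica</tree></tree></tree>'
    '</tree>'
)
print(to_brackets(ET.fromstring(xml)))
```

The printed string matches the source side of the first rule except that the rule has [SP] where the tree has [SP por], i.e. the second preposition is a direct child of cag and appears in the rule as a bare node label.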


            Is there something wrong in my parse trees that causes this?

            Best regards

            Annette

            _______________________________________________
            Moses-support mailing list
            Moses-support@mit.edu <mailto:Moses-support@mit.edu>
            http://mailman.mit.edu/mailman/listinfo/moses-support


