The training data is in the right format, and rule extraction works 
fine for most of it. The problem is limited to one particular structure, 
a coordinated preposition. The relevant part of the tree looks 
like this:

<tree label="S"><tree label="AQ">asumidos</tree><tree label="cag">
<tree label="sp"><tree label="SP">con</tree><tree label="sn">
<tree label="NP">&#xE1;frica</tree></tree></tree><tree label="conj">
<tree label="CC">y</tree></tree><tree label="SP">por</tree>
<tree label="sn"><tree label="NP">&#xE1;frica</tree></tree></tree>

for which these rules are extracted (among others):

[S [AQ asumidos] [cag [sp [SP con] [sn [NP áfrica]]] [conj [CC y]] [SP] 
[sn [NP áfrica]]]] ||| und [X][X] Afrika [X] ||| 0.0874939 0.69856 
0.174988 0.364444 0.606531 ||| 3-0 4-1 5-2 ||| 4 2 2 ||| |||

[S [AQ asumidos] [cag [sp [SP con] [sn [NP áfrica]]] [conj [CC y]] [SP] 
[sn [NP]]]] ||| und [X][X] [X][X] [X] ||| 0.00185172 0.838272 0.174988 
0.865553 0.606531 ||| 3-0 4-1 5-2 ||| 189 2 2 ||| |||

[S [AQ asumidos] [cag [sp [SP con] [sn [NP áfrica]]] [conj [CC y]] [SP] 
[sn]]] ||| und [X][X] [X][X] [X] ||| 0.00185172 0.838272 0.174988 
0.865553 0.606531 ||| 3-0 4-1 5-2 ||| 189 2 2 ||| |||


These rules give me the following error when reading the phrase table:

Exception: moses/Phrase.cpp:214 in void 
Moses::Phrase::CreateFromString(Moses::FactorDirection, const 
std::vector<long unsigned int>&, const StringPiece&, Moses::Word**) 
threw util::Exception because `nextPos == string::npos'.
Incorrect formatting of non-terminal. Should have 2 non-terms, eg. 
[X][X]. Current string: [SP]
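
To narrow down which rules trigger this, the phrase table can be scanned for source-side labels that have no children, such as the [SP] named in the error. Below is a minimal Python sketch of that check. It rests on two assumptions of mine: that a bracketed label with nothing inside it marks a gap on the source side, and that target-side gaps are written as label pairs like [X][X], which is what the error message suggests the loader expects. The names `inspect_rule`, `BARE_LABEL` and `TARGET_GAP` are hypothetical helpers, not part of Moses.

```python
import re

# Assumption: a bracketed label with no children, e.g. "[SP]", is a gap
# (non-terminal leaf) on the source side of a rule.
BARE_LABEL = re.compile(r"\[([^\[\]\s]+)\]")
# Assumption: a gap on the target side is written as a label pair, e.g. "[X][X]".
TARGET_GAP = re.compile(r"\[[^\[\]\s]+\]\[[^\[\]\s]+\]")


def inspect_rule(line):
    """Return (bare source labels, number of target gaps) for one rule line."""
    fields = [field.strip() for field in line.split("|||")]
    source, target = fields[0], fields[1]
    return BARE_LABEL.findall(source), len(TARGET_GAP.findall(target))


# The three rules from this message, each joined back onto a single line.
PREFIX = "[S [AQ asumidos] [cag [sp [SP con] [sn [NP áfrica]]] [conj [CC y]] [SP] "
RULES = [
    PREFIX + "[sn [NP áfrica]]]] ||| und [X][X] Afrika [X] ||| 0.0874939 0.69856 "
    "0.174988 0.364444 0.606531 ||| 3-0 4-1 5-2 ||| 4 2 2 ||| |||",
    PREFIX + "[sn [NP]]]] ||| und [X][X] [X][X] [X] ||| 0.00185172 0.838272 "
    "0.174988 0.865553 0.606531 ||| 3-0 4-1 5-2 ||| 189 2 2 ||| |||",
    PREFIX + "[sn]]] ||| und [X][X] [X][X] [X] ||| 0.00185172 0.838272 "
    "0.174988 0.865553 0.606531 ||| 3-0 4-1 5-2 ||| 189 2 2 ||| |||",
]

if __name__ == "__main__":
    for rule in RULES:
        bare, target_gaps = inspect_rule(rule)
        print("bare source labels:", bare, "| target gaps:", target_gaps)
```

Run on the three rules above, this reports a bare [SP] on the source side of every rule, so the same loop over the whole phrase table should show how many rules are affected.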

Thanks for the help.

On 04/20/2016 08:17 AM, Hieu Hoang wrote:
> your training data should be in a format that Moses understands, e.g.
>     <tree label="NP"> <tree label="DET"> the </tree> <tree label="NN"> 
> cat </tree> </tree>
> Currently, it looks like the training data is whatever came out of the 
> parser.
>
> The syntax tutorial has a bit more information
>    http://www.statmt.org/moses/?n=Moses.SyntaxTutorial
>
> On 18/04/2016 14:07, Annette Rios wrote:
>> Hi all
>>
>> I'm trying to build a tree-to-string system, and I get this error from
>> moses_chart:
>>
>> Exception: moses/Phrase.cpp:214 in void
>> Moses::Phrase::CreateFromString(Moses::FactorDirection, const
>> std::vector<long unsigned int>&, const StringPiece&, Moses::Word**)
>> threw util::Exception because `nextPos == string::npos'.
>> Incorrect formatting of non-terminal. Should have 2 non-terms, eg.
>> [X][X]. Current string: [SP]
>>
>> The corresponding lines in the phrase table look like this:
>>
>> [S [AQ asumidos] [cag [sp [SP con] [sn [NP áfrica]]] [conj [CC y]] [SP]
>> [sn [NP áfrica]]]] ||| und [X][X] Afrika [X] ||| 0.0874939 0.69856
>> 0.174988 0.364444 0.606531 ||| 3-0 4-1 5-2 ||| 4 2 2 ||| |||
>> [S [AQ asumidos] [cag [sp [SP con] [sn [NP áfrica]]] [conj [CC y]] [SP]
>> [sn [NP]]]] ||| und [X][X] [X][X] [X] ||| 0.00185172 0.838272 0.174988
>> 0.865553 0.606531 ||| 3-0 4-1 5-2 ||| 189 2 2 ||| |||
>> [S [AQ asumidos] [cag [sp [SP con] [sn [NP áfrica]]] [conj [CC y]] [SP]
>> [sn]]] ||| und [X][X] [X][X] [X] ||| 0.00185172 0.838272 0.174988
>> 0.865553 0.606531 ||| 3-0 4-1 5-2 ||| 189 2 2 ||| |||
>>
>>
>> extracted from this parse:
>>
>> 4    asumidos    asumido    a    AQ
>>    gen=m|num=p|postype=qualificative|eagles=AQ0MPP    3    S _    _
>> 5    con    con    s    SP    postype=preposition|eagles=SPS00 8
>>    sp    _    _
>> 6    áfrica    áfrica    n    NP postype=proper||eagles=NP00000  5
>>    sn    _    _
>> 7    y    y    c    CC    postype=coordinating|eagles=CC    8 conj
>>    _    _
>> 8    por    por    s    SP    postype=preposition|eagles=SPS00 4
>>    cag    _    _
>> 9    áfrica    áfrica    n    NP postype=proper||eagles=NP00000  8
>>    sn    _    _
>>
>> converted to xml with conll2mosesxml.py:
>>
>>             <tree label="S">
>>               <tree label="AQ">asumidos</tree>
>>               <tree label="cag">
>>                 <tree label="sp">
>>                   <tree label="SP">con</tree>
>>                   <tree label="sn">
>>                     <tree label="NP">&#xE1;frica</tree>
>>                   </tree>
>>                 </tree>
>>                 <tree label="conj">
>>                   <tree label="CC">y</tree>
>>                 </tree>
>>                 <tree label="SP">por</tree>
>>                 <tree label="sn">
>>                   <tree label="NP">&#xE1;frica</tree>
>>                 </tree>
>>               </tree>
>>
>>
>> Is there something wrong in my parse trees that causes this?
>>
>> Best regards
>>
>> Annette
>>
>> _______________________________________________
>> Moses-support mailing list
>> Moses-support@mit.edu
>> http://mailman.mit.edu/mailman/listinfo/moses-support
>
