Re: Parser Training HeadRules German

Andreas Niekler Fri, 21 Mar 2014 01:00:18 -0700

Hello,

Do you think the examples are incorrect in general, or just the brackets?
I noticed in the code that brackets should get translated to
"LRB" automatically, so I found no obvious mistake in my training data.
But I'm not sure whether the -NONE- tag is correct or whether I can omit
unknown POS tags.
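
To rule out the literal brackets as a cause, one option is to escape them in
the training file before running the trainer. The following is only a minimal
sketch: it assumes the trainer accepts Penn Treebank style -LRB- / -RRB-
tokens, the function name escape_bracket_leaves is made up for illustration,
and it is not confirmed to resolve the NullPointerException. It only handles
leaves whose whole token is a single bracket, such as (-NONE- () above.

```python
import re

# Penn Treebank style escapes for bracket tokens. That the trainer expects
# exactly these strings is an assumption, not verified against OpenNLP.
ESCAPES = {"(": "-LRB-", ")": "-RRB-"}

# A leaf whose token is one literal bracket, e.g. "(-NONE- ()" or "(-NONE- ))".
# The tag part may not contain whitespace or brackets.
BRACKET_LEAF = re.compile(r"\(([^\s()]+) ([()])\)")


def escape_bracket_leaves(line: str) -> str:
    """Return the tree line with single-bracket leaf tokens escaped."""
    return BRACKET_LEAF.sub(
        lambda m: f"({m.group(1)} {ESCAPES[m.group(2)]})", line
    )


if __name__ == "__main__":
    example = (
        "(TOP (NE JOHANNESBURG) (, ,) (NP (ADJA 5.) (NN Juli) )(-NONE- () "
        "(CNP (NE AP) (NE jod) )(-NONE- /) (-NONE- )) (. .) )"
    )
    print(escape_bracket_leaves(example))
```

Tokens that merely contain a bracket somewhere inside a longer string are left
untouched by this sketch and would need separate handling.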


All the best

Andreas

On 20.03.2014 17:48, Rodrigo Agerri wrote:
> Have you tried correcting them? 
> 
> Cheers, 
> 
> Rodrigo
> 
> On 2014/03/20 at 16:34, Andreas Niekler wrote:
>> Hi,
>>
>> I converted the Tiger corpus XML to the training format:
>>
>> (TOP (S (NN Zugeständnisse) (VP (ADJD unzureichend) (VVPP genannt)
>> ))(-NONE- /) )
>> (TOP (-NONE- ``) (VP (NN Land) (PP (APPR auf) (NN Konfrontationskurs)
>> )(VVPP gesteuert) )(-NONE- '') (-NONE- /) )
>> (TOP (ADJA Harte) (NN Töne) (NP (ART der) (NN Regierung) )(PP (APPR
>> gegen) (NN Nationalkongreß) ))
>> (TOP (NE JOHANNESBURG) (, ,) (NP (ADJA 5.) (NN Juli) )(-NONE- () (CNP
>> (NE AP) (NE jod) )(-NONE- /) (-NONE- )) (. .) )
>>
>> I copied some HeadRules from the
>> corenlp/edu/stanford/nlp/trees/international/negra class.
>>
>> When I now run the trainer for the parser, I get this error regarding
>> the punctuation:
>>
>> Building dictionary
>> Exception in thread "main" java.lang.NullPointerException
>>         at
>> opennlp.tools.parser.AbstractBottomUpParser.lastChild(AbstractBottomUpParser.java:502)
>>         at
>> opennlp.tools.parser.AbstractBottomUpParser.buildDictionary(AbstractBottomUpParser.java:552)
>>         at opennlp.tools.parser.chunking.Parser.train(Parser.java:287)
>>         at
>> opennlp.tools.cmdline.parser.ParserTrainerTool.run(ParserTrainerTool.java:132)
>>         at opennlp.tools.cmdline.CLI.main(CLI.java:222)
>>
>> Does this have something to do with the training instances that have no end
>> marker? I also see this when there is a ( in the text: (-NONE- ()
>>
>> Could that be the error, and do I have to replace those instances?
>>
>> Thank you
>>
>> Andreas
>>
>>
>>
>> On 20.03.2014 11:52, Andreas Niekler wrote:
>>> Hi,
>>>
>>> As I understand it, my examples are binarized during the training
>>> process, so I have to provide rules for binarized trees?
>>>
>>> All the best
>>>
>>> Andreas
>>>
>>> On 19.03.2014 15:31, Rodrigo Agerri wrote:
>>>> Hi Andreas, 
>>>>
>>>> This issue has already been discussed here, so I will summarize: 
>>>>
>>>> The English head rules come from Michael Collins' thesis; check Annex A:
>>>>
>>>> http://www.dfki.de/~neumann/dop-seminar/References/collins-thesis.pdf
>>>>
>>>> I have recently posted about the head rules in Spanish (Ancora corpus)
>>>>
>>>> https://issues.apache.org/jira/browse/OPENNLP-665
>>>>
>>>> Also check the 7th of March thread about language-specific head rules when
>>>> training the parser.
>>>>
>>>> Finally, Stanford Parser provides head rules for the Negra corpus, which
>>>> could be useful for you:
>>>>
>>>> corenlp/edu/stanford/nlp/trees/international/negra 
>>>>
>>>> Cheers, 
>>>>
>>>> Rodrigo
>>>>
>>>> On 2014/03/19 at 15:02, Andreas Niekler wrote:
>>>>> Hi all,
>>>>>
>>>>> I want to train a German parser model with the Tiger corpus. For this
>>>>> I need different HeadRules for the training process. At the
>>>>> moment I'm a bit stuck understanding what these rules are for exactly and
>>>>> whether it would be OK to just provide empty rules.
>>>>>
>>>>> Can somebody comment on this, or give me a short intuition for how those
>>>>> rules work and how I should interpret them?
>>>>>
>>>>> Thank you
>>>>>
>>>>> Andreas
>>>>> -- 
>>>>> Andreas Niekler, Dipl. Ing. (FH)
>>>>> NLP Group | Department of Computer Science
>>>>> University of Leipzig
>>>>> Johannisgasse 26 | 04103 Leipzig
>>>>>
>>>>> mail: [email protected]
>>>>
>>>
>>
>> -- 
>> Andreas Niekler, Dipl. Ing. (FH)
>> NLP Group | Department of Computer Science
>> University of Leipzig
>> Johannisgasse 26 | 04103 Leipzig
>>
>> mail: [email protected]
> 

-- 
Andreas Niekler, Dipl. Ing. (FH)
NLP Group | Department of Computer Science
University of Leipzig
Johannisgasse 26 | 04103 Leipzig

mail: [email protected]
