Re: Parser Training HeadRules German

Andreas Niekler Thu, 20 Mar 2014 08:35:46 -0700

Hi,

I converted the TIGER corpus (XML) to the training format:


(TOP (S (NN Zugeständnisse) (VP (ADJD unzureichend) (VVPP genannt) ))(-NONE- /) )
(TOP (-NONE- ``) (VP (NN Land) (PP (APPR auf) (NN Konfrontationskurs) )(VVPP gesteuert) )(-NONE- '') (-NONE- /) )
(TOP (ADJA Harte) (NN Töne) (NP (ART der) (NN Regierung) )(PP (APPR gegen) (NN Nationalkongreß) ))
(TOP (NE JOHANNESBURG) (, ,) (NP (ADJA 5.) (NN Juli) )(-NONE- () (CNP (NE AP) (NE jod) )(-NONE- /) (-NONE- )) (. .) )

I copied some head rules from the Stanford sources under
corenlp/edu/stanford/nlp/trees/international/negra.
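
For orientation, a head rule in the Collins style gives, per constituent
label, a search direction and a priority list of child tags; the first child
matching the highest-priority tag becomes the head. The following is a minimal
sketch of that idea only. The class HeadRuleSketch, the Rule type, and the two
example rules are hypothetical illustrations, not the OpenNLP HeadRules
interface and not the actual Negra rules.

```java
import java.util.Arrays;
import java.util.List;

/** Toy head finder: one rule = search direction + tag priority list. */
class HeadRuleSketch {

    /** Hypothetical rule type; not OpenNLP's or Stanford's class. */
    static final class Rule {
        final boolean leftToRight;
        final List<String> priority;

        Rule(boolean leftToRight, String... priority) {
            this.leftToRight = leftToRight;
            this.priority = Arrays.asList(priority);
        }
    }

    /** Returns the index of the head child; assumes at least one child. */
    static int headIndex(Rule rule, List<String> childLabels) {
        int n = childLabels.size();
        // Try each tag in priority order, scanning children in the rule's direction.
        for (String tag : rule.priority) {
            for (int k = 0; k < n; k++) {
                int i = rule.leftToRight ? k : n - 1 - k;
                if (childLabels.get(i).equals(tag)) {
                    return i;
                }
            }
        }
        // No tag matched: fall back to the first child in the search direction.
        return rule.leftToRight ? 0 : n - 1;
    }

    public static void main(String[] args) {
        // Illustrative rules only: NP prefers a noun from the right,
        // PP prefers the preposition from the left.
        Rule np = new Rule(false, "NN", "NE");
        Rule pp = new Rule(true, "APPR");
        System.out.println(headIndex(np, Arrays.asList("ART", "NN")));  // (NP (ART der) (NN Regierung))
        System.out.println(headIndex(pp, Arrays.asList("APPR", "NN"))); // (PP (APPR auf) (NN Konfrontationskurs))
    }
}
```

With an empty priority list, the fallback always picks the leftmost or
rightmost child, which is roughly what "empty rules" would amount to in this
sketch.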

When I now run the trainer for the parser, I get this error, which seems
to be related to punctuation:

Building dictionary
Exception in thread "main" java.lang.NullPointerException
        at opennlp.tools.parser.AbstractBottomUpParser.lastChild(AbstractBottomUpParser.java:502)
        at opennlp.tools.parser.AbstractBottomUpParser.buildDictionary(AbstractBottomUpParser.java:552)
        at opennlp.tools.parser.chunking.Parser.train(Parser.java:287)
        at opennlp.tools.cmdline.parser.ParserTrainerTool.run(ParserTrainerTool.java:132)
        at opennlp.tools.cmdline.CLI.main(CLI.java:222)

Does this have something to do with the training instances that have no end
marker? I also noticed a possible problem where the text contains a literal
"(", as in: (-NONE- ()

Could that be the cause of the error, and do I have to replace those instances?
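
The bracket question can at least be ruled out with a small preprocessing
step. Below is a minimal sketch, assuming the Penn Treebank convention of
writing literal parentheses as -LRB- and -RRB-. Whether this alone removes
the NullPointerException is not confirmed here. The class name BracketEscaper
is hypothetical.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Rewrites leaves whose token is a bare "(" or ")" so the tree brackets stay unambiguous. */
class BracketEscaper {

    // A leaf such as "(-NONE- ()" or "(-NONE- ))": tag, one space, one bracket token, closing ")".
    private static final Pattern BRACKET_LEAF =
            Pattern.compile("\\(([^\\s()]+) ([()])\\)");

    static String escape(String treeLine) {
        Matcher m = BRACKET_LEAF.matcher(treeLine);
        StringBuffer out = new StringBuffer();
        while (m.find()) {
            String token = m.group(2).equals("(") ? "-LRB-" : "-RRB-";
            // quoteReplacement guards against tags containing '$' or '\'.
            m.appendReplacement(out,
                    Matcher.quoteReplacement("(" + m.group(1) + " " + token + ")"));
        }
        m.appendTail(out);
        return out.toString();
    }

    public static void main(String[] args) {
        String line = "(TOP (NE JOHANNESBURG) (, ,) (NP (ADJA 5.) (NN Juli) )(-NONE- () "
                + "(CNP (NE AP) (NE jod) )(-NONE- /) (-NONE- )) (. .) )";
        System.out.println(escape(line));
    }
}
```

Running each training line through escape before writing the file leaves all
other leaves and the tree structure untouched.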

Thank you

Andreas



On 20.03.2014 at 11:52, Andreas Niekler wrote:
> Hi,
> 
> As I understand it, my examples are binarized during the training
> process, so I have to provide rules for binarized trees?
> 
> All the best
> 
> Andreas
> 
> On 19.03.2014 at 15:31, Rodrigo Agerri wrote:
>> Hi Andreas, 
>>
>> This issue has already been discussed here, so I will summarize: 
>>
>> The English head rules come from Michael Collins' thesis; see Appendix A:
>>
>> http://www.dfki.de/~neumann/dop-seminar/References/collins-thesis.pdf
>>
>> I have recently posted about the head rules for Spanish (AnCora corpus):
>>
>> https://issues.apache.org/jira/browse/OPENNLP-665
>>
>> Also check the thread from 7 March about language-specific head rules when
>> training the parser.
>>
>> Finally, the Stanford Parser provides head rules for the Negra corpus, which
>> could be useful for you:
>>
>> corenlp/edu/stanford/nlp/trees/international/negra 
>>
>> Cheers, 
>>
>> Rodrigo
>>
>> On 2014/03/19 at 15:02, Andreas Niekler wrote:
>>> Hi all,
>>>
>>> I want to train a German parser model with the TIGER corpus. For this
>>> I need different HeadRules for the training process. At the
>>> moment I'm a bit stuck understanding what exactly these rules are for and
>>> whether it would be OK to just provide empty rules.
>>>
>>> Can somebody comment on this, or give me a short intuition for how those
>>> rules work and how I should interpret them?
>>>
>>> Thank you
>>>
>>> Andreas
>>> -- 
>>> Andreas Niekler, Dipl. Ing. (FH)
>>> NLP Group | Department of Computer Science
>>> University of Leipzig
>>> Johannisgasse 26 | 04103 Leipzig
>>>
>>> mail: [email protected]
>>
> 

-- 
Andreas Niekler, Dipl. Ing. (FH)
NLP Group | Department of Computer Science
University of Leipzig
Johannisgasse 26 | 04103 Leipzig

mail: [email protected]
