Have you tried correcting them?

Cheers,

Rodrigo

On 2014/03/20 at 16:34, Andreas Niekler wrote:
> Hi,
>
> I converted the XML Tiger Corpus to the training format:
>
> (TOP (S (NN Zugeständnisse) (VP (ADJD unzureichend) (VVPP genannt) ))(-NONE- /) )
> (TOP (-NONE- ``) (VP (NN Land) (PP (APPR auf) (NN Konfrontationskurs) )(VVPP gesteuert) )(-NONE- '') (-NONE- /) )
> (TOP (ADJA Harte) (NN Töne) (NP (ART der) (NN Regierung) )(PP (APPR gegen) (NN Nationalkongreß) ))
> (TOP (NE JOHANNESBURG) (, ,) (NP (ADJA 5.) (NN Juli) )(-NONE- () (CNP (NE AP) (NE jod) )(-NONE- /) (-NONE- )) (. .) )
>
> I copied some HeadRules from the
> corenlp/edu/stanford/nlp/trees/international/negra class.
>
> When I now run the trainer for the parser, I get this error regarding
> the punctuation:
>
> Building dictionary
> Exception in thread "main" java.lang.NullPointerException
>     at opennlp.tools.parser.AbstractBottomUpParser.lastChild(AbstractBottomUpParser.java:502)
>     at opennlp.tools.parser.AbstractBottomUpParser.buildDictionary(AbstractBottomUpParser.java:552)
>     at opennlp.tools.parser.chunking.Parser.train(Parser.java:287)
>     at opennlp.tools.cmdline.parser.ParserTrainerTool.run(ParserTrainerTool.java:132)
>     at opennlp.tools.cmdline.CLI.main(CLI.java:222)
>
> Does this have something to do with the training instances that have no
> end marker? I also notice this when there is a ( in the text: (-NONE- ()
>
> Could that be the cause of the error, and do I have to replace those
> instances?
>
> Thank you
>
> Andreas
>
> On 20.03.2014 at 11:52, Andreas Niekler wrote:
> > Hi,
> >
> > As I understand it, my examples are binarized during the training
> > process. Does that mean I have to provide rules for binarized trees?
> >
> > All the best
> >
> > Andreas
> >
> > On 19.03.2014 at 15:31, Rodrigo Agerri wrote:
> >> Hi Andreas,
> >>
> >> This issue has already been discussed here, so I will summarize:
> >>
> >> The English head rules come from Michael Collins' thesis; see Appendix A:
> >>
> >> http://www.dfki.de/~neumann/dop-seminar/References/collins-thesis.pdf
> >>
> >> I have recently posted about the head rules for Spanish (Ancora corpus):
> >>
> >> https://issues.apache.org/jira/browse/OPENNLP-665
> >>
> >> Also check the 7th of March thread about language-specific head rules
> >> when training the parser.
> >>
> >> Finally, Stanford Parser provides head rules for the Negra corpus, which
> >> could be useful for you:
> >>
> >> corenlp/edu/stanford/nlp/trees/international/negra
> >>
> >> Cheers,
> >>
> >> Rodrigo
> >>
> >> On 2014/03/19 at 15:02, Andreas Niekler wrote:
> >>> Hi all,
> >>>
> >>> I want to train a German parser model with the Tiger corpus. For this
> >>> I need different HeadRules for the training process. At the moment I'm
> >>> a bit stuck understanding what these rules are for exactly, and
> >>> whether it would be OK to just provide empty rules.
> >>>
> >>> Can somebody comment on this, or give me a short intuition for how
> >>> those rules work and how I should interpret them?
> >>>
> >>> Thank you
> >>>
> >>> Andreas
> >>> --
> >>> Andreas Niekler, Dipl. Ing. (FH)
> >>> NLP Group | Department of Computer Science
> >>> University of Leipzig
> >>> Johannisgasse 26 | 04103 Leipzig
> >>>
> >>> mail: [email protected]
> >>
> >
>
> --
> Andreas Niekler, Dipl. Ing. (FH)
> NLP Group | Department of Computer Science
> University of Leipzig
> Johannisgasse 26 | 04103 Leipzig
>
> mail: [email protected]
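P.S. A minimal sketch of what correcting the bracket tokens could look like, in case it helps. It assumes one tree per line in the format quoted above, and it rewrites leaves such as (-NONE- () and (-NONE- )) using the Penn Treebank escapes -LRB- and -RRB-. The function names are made up for illustration, and whether this alone removes the NullPointerException is an assumption that still has to be checked against the trainer.

```python
import re

# A preterminal whose token is a literal bracket, e.g. "(-NONE- ()" or "(-NONE- ))".
# The tag itself may not contain whitespace or brackets.
_OPEN_LEAF = re.compile(r"\(([^\s()]+) \(\)")
_CLOSE_LEAF = re.compile(r"\(([^\s()]+) \)\)")


def escape_bracket_tokens(line: str) -> str:
    """Replace literal "(" and ")" leaf tokens with -LRB- and -RRB-."""
    line = _OPEN_LEAF.sub(r"(\1 -LRB-)", line)
    line = _CLOSE_LEAF.sub(r"(\1 -RRB-)", line)
    return line


def is_balanced(line: str) -> bool:
    """Return True if every "(" has a matching ")" and none closes too early."""
    depth = 0
    for ch in line:
        if ch == "(":
            depth += 1
        elif ch == ")":
            depth -= 1
            if depth < 0:
                return False
    return depth == 0


if __name__ == "__main__":
    sample = "(TOP (-NONE- () (NN Land) )"
    fixed = escape_bracket_tokens(sample)
    print(is_balanced(sample), is_balanced(fixed))
    print(fixed)
```

Note that a sentence containing both a "(" and a ")" token can pass a plain balance check even though its structure is wrong, so it seems safer to escape first and run the balance check afterwards.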

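On the earlier question about what head rules are for: as a rough intuition, a Collins-style rule gives, for each parent label, a scan direction and a priority list of child labels, and the first match becomes the head of the constituent. The sketch below illustrates only that idea. The rule table is hypothetical (the STTS tags are picked for illustration and are not taken from the Negra or OpenNLP rule files), and find_head is not an OpenNLP API.

```python
from typing import Dict, List, Tuple

# Hypothetical rules, for illustration only:
# parent label -> (scan direction, child labels in priority order).
HEAD_RULES: Dict[str, Tuple[str, List[str]]] = {
    "NP": ("right", ["NN", "NE", "PPER"]),
    "PP": ("left", ["APPR", "APPRART"]),
    "VP": ("right", ["VVPP", "VVINF", "VVFIN"]),
}


def find_head(parent: str, children: List[str]) -> int:
    """Return the index of the head child of `parent`."""
    if not children:
        raise ValueError("a constituent needs at least one child")
    # Unknown parents fall back to an empty rule scanned from the left.
    direction, priorities = HEAD_RULES.get(parent, ("left", []))
    indices = list(range(len(children)))
    if direction == "right":
        indices.reverse()
    # Try each label in priority order, scanning in the rule's direction.
    for tag in priorities:
        for i in indices:
            if children[i] == tag:
                return i
    # No label matched: take the first child in scan direction.
    return indices[0]


if __name__ == "__main__":
    print(find_head("NP", ["ART", "NN"]))   # head is the noun
    print(find_head("PP", ["APPR", "NN"]))  # head is the preposition
```

In this sketch an empty rule simply means "always take the first child in scan direction", which hints at why empty rules might run but give poor heads; how OpenNLP itself behaves with empty rules is not something the sketch shows.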