Jörn,

Last summer I started looking at coref to see what could be done to update it. I ran into a problem that I summarized in an email that read:

In the new (and neat) Tool mechanism for 1.5, is there still a way to send parsed (tree) input to the NER module? Basically I'm trying to put together the pipeline to the Coref Tool, but I'm not sure how to hook it up to both parsed and NER-marked output. Does the question make sense, and if so, does someone on the list know the answer?

Thanks,
jds

On Thu, Dec 20, 2012 at 6:40 PM, Jörn Kottmann <[email protected]> wrote:

> On 12/20/2012 06:26 PM, Rodrigo Agerri wrote:
>
>> Hi,
>>
>> Thanks! After checking the stack trace I added the jwnl to my
>> classpath plus the path to WordNet to the maven arguments. It now runs
>> and it outputs the parse tree with numbered mentions. Like this (note
>> the NP#3)
>>
>> (SBAR (S (NP#3 (RB However) (NNS detectives)) (VP (VBD said) (SBAR (S
>> (NP#3 (PRP they)) (VP (VBD had) (RB not) (VP (VBN found) (NP (DT any)
>> (NN proof)) (SBAR (IN that) (S (NP (NP (DT the) (NN 35-year-old,))
>> (SBAR (WHNP (WP who)) (S (VP (VBD went) (S (VP (VBG missing) (PP (IN
>> on) (NP (CD 18) (NNP March,))))))))) (VP (VBD was) (VP (VBN dead.)
>>
>> Is it possible to get something easier to the eye on the CLI?
>> Is it possible to insert NEs to the parse tree on the CLI? (I guess not
>> :) )
>>
>
> For visualizations you might want to check out brat. They have a
> javascript visualizer which could display that nicely.
>
> Maybe we should make a tool which can format a penn treebank style parse
> into a formatted string with multiple lines.
>
> No, not really, sorry. The whole coref component needs some work so it's
> as easy to use as the other components in OpenNLP. You are very welcome
> to help us with that.
>
> Do you have a data set you would like to train it on? I tried to train it
> on the muc data but still had some issues to reach the performance of the
> old models (the full training code was never published by the original
> author and I just tried to write my own).
>
> Jörn
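
On the idea in the quoted thread of a tool that formats a Penn Treebank style parse into a string with multiple lines: below is a minimal standalone sketch in Python of what such a formatter might do. It is not part of OpenNLP and does not use any OpenNLP API; the function names (tokenize, parse_brackets, format_tree, pretty_print) are made up for illustration. It assumes the input is a single, fully balanced bracketed parse, so a truncated paste like the one quoted above would be rejected rather than formatted.

```python
import re


def tokenize(text):
    """Split a bracketed parse into '(', ')' and atoms such as 'NP#3' or 'they'."""
    return re.findall(r"\(|\)|[^\s()]+", text)


def parse_brackets(tokens):
    """Turn the token list into nested Python lists, one list per bracket pair.

    Hypothetical helper for this sketch; raises ValueError on unbalanced input.
    """
    if not tokens:
        raise ValueError("empty input")
    pos = 0

    def node():
        nonlocal pos
        if tokens[pos] != "(":
            raise ValueError("expected '(' at token %d" % pos)
        pos += 1
        items = []
        while pos < len(tokens) and tokens[pos] != ")":
            if tokens[pos] == "(":
                items.append(node())
            else:
                items.append(tokens[pos])
                pos += 1
        if pos >= len(tokens):
            raise ValueError("unbalanced parse: missing ')'")
        pos += 1  # consume ')'
        return items

    tree = node()
    if pos != len(tokens):
        raise ValueError("unexpected tokens after the end of the parse")
    return tree


def format_tree(tree, indent=0, step=2):
    """Render nested lists with one constituent per line, indented by depth."""
    pad = " " * indent
    # A node with only atoms, e.g. ['PRP', 'they'], stays on a single line.
    if all(isinstance(item, str) for item in tree):
        return pad + "(" + " ".join(tree) + ")"
    has_label = isinstance(tree[0], str)
    label = tree[0] if has_label else ""
    children = tree[1:] if has_label else tree
    lines = [pad + "(" + label]
    for child in children:
        if isinstance(child, str):
            lines.append(" " * (indent + step) + child)
        else:
            lines.append(format_tree(child, indent + step, step))
    lines[-1] += ")"  # close this node on the last line of its last child
    return "\n".join(lines)


def pretty_print(text, step=2):
    """Format one balanced bracketed parse as an indented multi-line string."""
    return format_tree(parse_brackets(tokenize(text)), 0, step)


if __name__ == "__main__":
    print(pretty_print("(S (NP#3 (PRP they)) (VP (VBD had) (RB not)))"))
```

Mention-numbered labels such as NP#3 pass through unchanged because any run of characters other than whitespace and parentheses is treated as one atom.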
