Hey guys,

You may have already seen this, but I posted a really simple "getting started" Eclipse project on my GitHub profile for the coreference API: https://github.com/amb-enthusiast/CoreferenceTest
It is similar to what you have already explored, but may provide some validation. I recently experimented with using NER results from Stanford CoreNLP in sentence parse objects. In cases where better named-entity info is added to the sentence parse, coref performance improves.

Hope this helps in some way.

Ant

----- Reply message -----
From: "Jim - FooBar();" <[email protected]>
To: <[email protected]>
Subject: coref results seem weird
Date: Sat, Mar 2, 2013 9:10 am

see this old message by Jorn... check the next message in the thread as well. at some point he posts detailed code...it might be easier to put together a dummy project in order to use the API which is more flexible...

Jim

http://mail-archives.apache.org/mod_mbox/opennlp-users/201112.mbox/%[email protected]%3E

On 02/03/13 16:05, James Kosin wrote:
> I'm trying to use the CLI. I have other code in another project that
> loads the dictionary properly using the property file to specify the
> information. Coreference does it differently...
>
> On 3/2/2013 6:32 AM, Jim - FooBar(); wrote:
>> I should be able to help you...are you going through the cli or the
>> API? I bet there is something wrong with the Wordnet directory you're
>> passing...I had similar issues...
>> If you're using the API let me know and I'll send you a code snippet
>> that may help...
>>
>> Jim
>>
>> On 02/03/13 05:17, James Kosin wrote:
>>> Jim,
>>>
>>> I can't seem to get past the NULL pointer exceptions when
>>> Coreferencer is trying to load the dictionaries. So, this will be
>>> much later now. I'm going to sleep and play tooth fairy.
>>>
>>> James
>>>
>>>
>>> On 3/1/2013 8:18 AM, Jim - FooBar(); wrote:
>>>> Like you, I'm using the latest WordNet and JWNL (1.4 RC_3 is on
>>>> maven, you don't need to build it from source)....
>>>> Now that you've set up your end could you please perform a run on
>>>> the standard example sentence? In addition could you try to add the
>>>> named-entities to the parse-tree?
>>>> If yes, please post your results here for comparison with mine.
>>>>
>>>> thanks a lot,
>>>>
>>>> Jim
>>>>
>>>>
>>>> On 01/03/13 02:21, James Kosin wrote:
>>>>> Hi Jim,
>>>>>
>>>>> What version of the JWNL and WordNet dictionaries are you using?
>>>>> I never got much more than researching what it is used for, and
>>>>> its importance to handling the task.
>>>>>
>>>>> I've just updated my end for the 3.1 WordNet dictionaries. But,
>>>>> I'm also using 1.4_rc3 from sources to build JWNL. The extJWNL
>>>>> seems to be more apt to handling more types of dictionaries
>>>>> (supporting UTF-8 and others), and actually creating and modifying
>>>>> them as well; which isn't needed when we are really only wanting
>>>>> read usage.
>>>>>
>>>>> James
>>>>>
>>>>> On 2/28/2013 4:49 AM, Jim foo.bar wrote:
>>>>>> Hi James,
>>>>>>
>>>>>> thanks for your reply and your comments but that is not quite
>>>>>> what I asked...I've looked at all the web resources related to
>>>>>> the opennlp coref component, otherwise I would never have gotten
>>>>>> it to work!
>>>>>>
>>>>>> My problem is about the results it brings back, in particular I'd
>>>>>> like to compare my produced discourse entities with someone
>>>>>> else's on the same piece of text. Since I'm working in a
>>>>>> language other than Java, that would confirm that my code is at
>>>>>> least correct. On a secondary note, I'd like to see how to insert
>>>>>> the named-entities into the parse tree before deploying the
>>>>>> TreebankLinker. I followed the instructions posted by Jorn
>>>>>> sometime last year but I'm not sure what the output should look
>>>>>> like. That is why I posted what I'm getting...Can you see any
>>>>>> 'person' named-entities in my DiscourseEntities?
>>>>>>
>>>>>> More importantly, if you run the coref component on the standard
>>>>>> example sentence (Pierre Vinken, ...) what do you get? Could you
>>>>>> post the exact output?
>>>>>>
>>>>>> Whoever posted this:
>>>>>> http://blog.dpdearing.com/2012/11/making-coreference-resolution-with-opennlp-1-5-0-your-bitch/
>>>>>> did not try to insert any NEs into the parse tree. In addition,
>>>>>> his output is slightly different from mine...I don't know if that
>>>>>> is because of a newer version of JWNL.jar that I'm using or
>>>>>> something else...
>>>>>>
>>>>>> Jim
>>>>>>
>>>>>>
>>>>>> On 28/02/13 02:51, James Kosin wrote:
>>>>>>> Jim,
>>>>>>>
>>>>>>> Here is a place to start, with maybe some more examples:
>>>>>>> http://stackoverflow.com/questions/8629737/coreference-resolution-using-opennlp
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> James
>>>>>>>
>>>>>>> On 2/27/2013 1:26 PM, Jim - FooBar(); wrote:
>>>>>>>> Hmmm.... interesting! When I run it on these 2 simple sentences:
>>>>>>>>
>>>>>>>> /"Mary likes pizza but she also likes kebabs. Knowing her, I'd
>>>>>>>> give it 2 weeks before she turns massive!"/
>>>>>>>>
>>>>>>>> I get perfect results!
>>>>>>>>
>>>>>>>> #<DiscourseEntity [ Mary, she, her, she ]>
>>>>>>>>
>>>>>>>> this demonstrates 3 things:
>>>>>>>> - my understanding of coref is indeed correct
>>>>>>>> - the coref component can link entities from separate sentences
>>>>>>>> - possibly that my code is fine
>>>>>>>>
>>>>>>>> any thoughts?
>>>>>>>>
>>>>>>>> Jim
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>> On 27/02/13 18:14, Jim - FooBar(); wrote:
>>>>>>>>> Hi all,
>>>>>>>>>
>>>>>>>>> I finally managed to get coref working (phew! my god that was
>>>>>>>>> tricky) but I'm slightly confused by the results so I'd like
>>>>>>>>> to see if anyone else has tried that out...Using the standard
>>>>>>>>> paragraph used in the other examples:
>>>>>>>>>
>>>>>>>>> /"Pierre Vinken, 61 years old, will join the board as a
>>>>>>>>> nonexecutive director Nov. 29. Mr. Vinken is chairman of
>>>>>>>>> Elsevier N.V., the Dutch publishing group.
>>>>>>>>> Rudolph Agnew, 55
>>>>>>>>> years old and former chairman of Consolidated Gold Fields PLC,
>>>>>>>>> was named a director of this British industrial conglomerate."/
>>>>>>>>>
>>>>>>>>> deploying the coref component gives me the following:
>>>>>>>>> I must note that I'm trying to pass the named entities as well
>>>>>>>>> (person). I've confirmed that the spans are correctly
>>>>>>>>> identified (3 spans for this particular example) and added
>>>>>>>>> to the parse tree via
>>>>>>>>> opennlp.tools.parser.Parse.addNames("person", span,
>>>>>>>>> parse.getTagNodes());
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> [#<DiscourseEntity [ this British industrial conglomerate ]>,
>>>>>>>>> #<DiscourseEntity [ a director of this British industrial
>>>>>>>>> conglomerate ]>,
>>>>>>>>> #<DiscourseEntity [ Consolidated Gold Fields PLC ]>,
>>>>>>>>> #<DiscourseEntity [ chairman of Elsevier N . V . , the Dutch
>>>>>>>>> publishing group, former chairman of Consolidated Gold Fields
>>>>>>>>> PLC ]>,
>>>>>>>>> #<DiscourseEntity [ 55 years ]>,
>>>>>>>>> #<DiscourseEntity [ Rudolph Agnew , 55 years old and former
>>>>>>>>> chairman of Consolidated Gold Fields PLC , was named a
>>>>>>>>> director of this British industrial conglomerate . ]>,
>>>>>>>>> #<DiscourseEntity [ Elsevier N . V . , the Dutch publishing
>>>>>>>>> group, the Dutch publishing group ]>,
>>>>>>>>> #<DiscourseEntity [ Mr . Vinken ]>,
>>>>>>>>> #<DiscourseEntity [ a nonexecutive director Nov . 29 ]>,
>>>>>>>>> #<DiscourseEntity [ the board ]>,
>>>>>>>>> #<DiscourseEntity [ 61 years ]>,
>>>>>>>>> #<DiscourseEntity [ Pierre Vinken , 61 years old ]>
>>>>>>>>> ]
>>>>>>>>>
>>>>>>>>> *filtering for more than 1 mention (per Jorn's suggestion)
>>>>>>>>> gives back:*
>>>>>>>>>
>>>>>>>>> [#<DiscourseEntity [ chairman of Elsevier N . V . , the Dutch
>>>>>>>>> publishing group, former chairman of Consolidated Gold Fields
>>>>>>>>> PLC ]>
>>>>>>>>> #<DiscourseEntity [ Elsevier N . V .
>>>>>>>>> , the Dutch publishing
>>>>>>>>> group, the Dutch publishing group ]>
>>>>>>>>> ]
>>>>>>>>>
>>>>>>>>> Assuming that this is what it's supposed to output, can
>>>>>>>>> someone explain this? First of all where are the
>>>>>>>>> named-entities? Secondly, out of the 2 filtered
>>>>>>>>> DiscourseEntities, both seem plain wrong! Moreover, where is
>>>>>>>>> #<DiscourseEntity [Rudolph Agnew, former chairman of
>>>>>>>>> Consolidated Gold Fields PLC, the Dutch publishing group,
>>>>>>>>> director of this British industrial conglomerate ]> ???
>>>>>>>>>
>>>>>>>>> Either I'm not understanding coreference, or I've coded the
>>>>>>>>> thing wrong or the model is not very good! Which one is it?
>>>>>>>>> Has anyone else attempted this? Can we compare results on this
>>>>>>>>> particular sentence?
>>>>>>>>>
>>>>>>>>> thanks in advance :)
>>>>>>>>>
>>>>>>>>> Jim
>>>>>>>>>
>>>>>>>>> ps: my code is in Clojure but it is based on a code snippet
>>>>>>>>> provided by Jorn to someone on the mailing list last year. I
>>>>>>>>> can easily provide it but I don't think it will be of much
>>>>>>>>> help...
>>>>>>>>>
>>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>
>>>>>>
>>>>>>
>>>>>
>>>>
>>>>
>>>
>>
>>
>
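
PS, for anyone comparing results: the "filtering for more than 1 mention" step quoted above is simple enough to sketch in plain Java. This is only a rough illustration, not the code anyone in this thread actually ran. The Entity class and the withMultipleMentions helper are hypothetical stand-ins I made up so the snippet runs without OpenNLP, WordNet or the coref models; with the real API you would be looking at DiscourseEntity objects instead (I believe the mention count is available via getNumMentions(), but treat that as an assumption and check the Javadoc). The sample mentions are copied from Jim's output.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class MentionFilterSketch {

    /** Hypothetical stand-in for a discourse entity: only its mention strings. */
    static final class Entity {
        final List<String> mentions;

        Entity(String... mentions) {
            this.mentions = Arrays.asList(mentions);
        }

        @Override
        public String toString() {
            return "#<Entity " + mentions + ">";
        }
    }

    /** Keeps only the entities that were mentioned more than once. */
    static List<Entity> withMultipleMentions(List<Entity> entities) {
        List<Entity> kept = new ArrayList<>();
        for (Entity e : entities) {
            if (e.mentions.size() > 1) {
                kept.add(e);
            }
        }
        return kept;
    }

    public static void main(String[] args) {
        // Mentions copied from the output posted in this thread.
        List<Entity> entities = Arrays.asList(
            new Entity("Mary", "she", "her", "she"),
            new Entity("the board"),
            new Entity("61 years"));

        // Singletons are dropped; only the multi-mention chain is printed.
        for (Entity e : withMultipleMentions(entities)) {
            System.out.println(e);
        }
    }
}
```

Singletons such as "the board" and "61 years" drop out, which is why the filtered list in Jim's message is so much shorter than the full one.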
