right, finally the moment of truth! here is what I get using your loop:
Mention set:: [ this British industrial conglomerate ]
Mention set:: [ a director ]
Mention set:: [ Consolidated Gold Fields PLC ]
Mention set:: [ chairman :: former chairman ]
Mention set:: [ 55 years ]
Mention set:: [ Rudolph Agnew ]
Mention set:: [ Elsevier N . V . :: the Dutch publishing group ]
Mention set:: [ Mr . Vinken ]
Mention set:: [ a nonexecutive director Nov . 29 ]
Mention set:: [ the board ]
Mention set:: [ 61 years ]
Mention set:: [ Pierre Vinken ]
As you can see, I am missing these 2, which do seem correct (they appear in your output):
Mention set:: [ Pierre Vinken :: Mr. Vinken ]
Mention set:: [ a nonexecutive director :: chairman :: former chairman :: a director ]
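For what it's worth, the two outputs can be diffed mechanically once the ordering of the sets (and of the mentions inside each set) is normalised. Below is a minimal, self-contained sketch using plain Java collections rather than OpenNLP types; the mention strings are hard-coded and abridged from the outputs in this thread (note that real comparisons would also have to reconcile tokenisation differences such as "Mr . Vinken" vs "Mr. Vinken", which this sketch glosses over):

```java
import java.util.*;

public class MentionSetDiff {

    // The order of mention sets (and of mentions inside a set) varies
    // between runs, so normalise both levels into sets before comparing.
    static Set<Set<String>> normalise(List<List<String>> raw) {
        Set<Set<String>> out = new HashSet<>();
        for (List<String> mentions : raw) {
            out.add(new TreeSet<>(mentions));
        }
        return out;
    }

    // Mention sets present in 'reference' but absent from 'mine'.
    static Set<Set<String>> missingFrom(List<List<String>> mine,
                                        List<List<String>> reference) {
        Set<Set<String>> missing = normalise(reference);
        missing.removeAll(normalise(mine));
        return missing;
    }

    public static void main(String[] args) {
        // Abridged: one set both runs agree on, plus the two disputed ones.
        List<List<String>> ants = Arrays.asList(
            Arrays.asList("the board"),
            Arrays.asList("Pierre Vinken", "Mr. Vinken"),
            Arrays.asList("a nonexecutive director", "chairman",
                          "former chairman", "a director"));
        List<List<String>> jims = Arrays.asList(
            Arrays.asList("the board"),
            Arrays.asList("Pierre Vinken"),
            Arrays.asList("Mr. Vinken"));

        // Prints the two merged chains that the second run failed to produce.
        System.out.println("Missing from my output: " + missingFrom(jims, ants));
    }
}
```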
Jim

ps: I can confirm that the NEs can be retrieved correctly from the parse tree, just like yours...
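The other half of the question in this thread (how to get the NEs out of the Mention sets rather than the Parse object) can be sketched without any OpenNLP types: treat each mention set as a list of surface strings and keep the ones the NER pass marked as "person" spans. This is only a simulation of the logic with hard-coded names from this thread; the real code would walk the mentions of each DiscourseEntity and compare parse-tree spans, not raw strings:

```java
import java.util.*;

public class PersonEntitiesFromMentionSets {

    // Keep only the mentions whose surface form was tagged as a person
    // span. A plain-string simulation: real code would compare spans.
    static List<String> personMentions(List<List<String>> mentionSets,
                                       Set<String> personSpans) {
        List<String> hits = new ArrayList<>();
        for (List<String> mentions : mentionSets) {
            for (String mention : mentions) {
                if (personSpans.contains(mention)) {
                    hits.add(mention);
                }
            }
        }
        return hits;
    }

    public static void main(String[] args) {
        List<List<String>> mentionSets = Arrays.asList(
            Arrays.asList("Pierre Vinken", "Mr. Vinken"),
            Arrays.asList("Rudolph Agnew"),
            Arrays.asList("the board"));
        // The two "person" spans added to the parses in this thread.
        Set<String> persons = new HashSet<>(
            Arrays.asList("Pierre Vinken", "Rudolph Agnew"));
        System.out.println(personMentions(mentionSets, persons));
        // -> [Pierre Vinken, Rudolph Agnew]
    }
}
```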
On 06/03/13 12:19, Jim - FooBar(); wrote:
ok found your code and it answers all my questions!.... I'll do the same now and see what happens... :)

Jim

On 06/03/13 12:01, Jim - FooBar(); wrote:

I'm sorry, I forgot another thing... how do you ask for the NEs from the Mention sets? I can only ask for the NEs from the Parse object...

Jim

On 06/03/13 11:58, Jim - FooBar(); wrote:

Hi there,

I apologise for the late reply, but I've been a bit ill the past few days... So I've got some good news and some bad news... let me explain. Here are my parses for each sentence (notice how they are identical to yours -> GOOD news!):

-------------------------------------
(TOP (S (NP (person (NP (NNP Pierre) (NNP Vinken)) (, ,)) (ADJP (NP (CD 61) (NNS years)) (JJ old))) (, ,) (VP (MD will) (VP (VB join) (NP (DT the) (NN board)) (PP (IN as) (NP (DT a) (JJ nonexecutive) (NN director) (NNP Nov) (NNP .) (CD 29))))) (. .)))

(TOP (S (NP (NNP Mr) (. .) (NNP Vinken)) (VP (VBZ is) (NP (NP (NN chairman)) (PP (IN of) (NP (NP (NNP Elsevier) (NNP N) (NNP .) (NNP V) (NNP .)) (, ,) (NP (DT the) (JJ Dutch) (NN publishing) (NN group)))))) (. .)))

(TOP (NP (person (NP (NNP Rudolph) (NNP Agnew)) (, ,)) (UCP (ADJP (NP (CD 55) (NNS years)) (JJ old)) (CC and) (S (NP (NP (JJ former) (NN chairman)) (PP (IN of) (NP (NNP Consolidated) (NNP Gold) (NNP Fields) (NNP PLC)))) (, ,) (VP (VBD was) (VP (VBN named) (S (NP (NP (DT a) (NN director)) (PP (IN of) (NP (DT this) (JJ British) (JJ industrial) (NN conglomerate))))))))) (. .)))
--------------------------------

Doing this right is more than half the story for the coref-linker... There is a slight problem, though. What exactly is a Mention set in your output? How come you're not getting an array of DiscourseEntities back?
In addition, are you filtering the resulting array for entities with size more than 1? I have to say, your output does seem correct from a coreference resolution perspective... the problem is I can't understand why we're getting different results... If you could explain what a MentionSet is, that would be great...

thanks again,

Jim

On 04/03/13 05:29, Ant B wrote:

Hi Jim,

No problem - a good excuse to tidy the code. I added a few println() calls to display the input text, sentence parse objects, coreference mention sets and named entities in those sets. Note that I only added "person" NERs to the sentence parse.

<start of code output>

Input sentences::
Pierre Vinken, 61 years old, will join the board as a nonexecutive director Nov. 29. Mr. Vinken is chairman of Elsevier N.V., the Dutch publishing group. Rudolph Agnew, 55 years old and former chairman of Consolidated Gold Fields PLC, was named a director of this British industrial conglomerate.

Sentence#1 parse after POS & NER tag:
(TOP (S (NP (person (NP (NNP Pierre) (NNP Vinken))(, ,)) (ADJP (NP (CD 61) (NNS years)) (JJ old)))(, ,) (VP (MD will) (VP (VB join) (NP (DT the) (NN board)) (PP (IN as) (NP (NP (DT a) (JJ nonexecutive) (NN director)) (NP (NNP Nov.) (CD 29))))))(. .)))

Sentence#2 parse after POS & NER tag:
(TOP (S (NP (NNP Mr.) (NNP Vinken)) (VP (VBZ is) (NP (NP (NN chairman)) (PP (IN of) (NP (NP (NNP Elsevier) (NNP N.V.))(, ,) (NP (DT the) (JJ Dutch) (NN publishing) (NN group))))))(. .)))

Sentence#3 parse after POS & NER tag:
(TOP (NP (person (NP (NNP Rudolph) (NNP Agnew))(, ,)) (UCP (ADJP (NP (CD 55) (NNS years)) (JJ old)) (CC and) (S (NP (NP (JJ former) (NN chairman)) (PP (IN of) (NP (NNP Consolidated) (NNP Gold) (NNP Fields) (NNP PLC))))(, ,) (VP (VBD was) (VP (VBN named) (S (NP (NP (DT a) (NN director)) (PP (IN of) (NP (DT this) (JJ British) (JJ industrial) (NN conglomerate)))))))))(. .)))

Now displaying all discourse entities::
Mention set:: [ this British industrial conglomerate ]
Mention set:: [ a nonexecutive director :: chairman :: former chairman :: a director ]
Mention set:: [ Consolidated Gold Fields PLC ]
Mention set:: [ 55 years ]
Mention set:: [ Rudolph Agnew ]
Mention set:: [ Elsevier N.V. :: the Dutch publishing group ]
Mention set:: [ Pierre Vinken :: Mr. Vinken ]
Mention set:: [ Nov. 29 ]
Mention set:: [ the board ]
Mention set:: [ 61 years ]

Now printing out the named entities from mention sets::
[Rudolph Agnew ]
[Pierre Vinken ]

<end of code output>

I do not know for certain that my code is correct, so thanks for the chance to compare data. I think this matches up with your results. Let me know if you want to hack around any further - I'd really like a correct, reviewed & validated coreference example. I'm sure others would benefit from such an example too.

Ant

On Mar 3, 2013, at 12:37 PM, Jim - FooBar(); <[email protected]> wrote:

do you still happen to have that project locally on your hard drive? Is there any chance you could run this and post the results? Alternatively, I'd have to clone the repo and give it a spin... I imagine it would be a lot easier for you, as you are the author... let me know if you can't do that... I noticed in your code that you already use the sentence I am interested in, which is good.... :)

Jim

On 03/03/13 03:29, [email protected] wrote:

Hey guys,

You may have already seen this, but I posted a really simple "getting started" Eclipse project on my GitHub profile for the coreference API: https://github.com/amb-enthusiast/CoreferenceTest

It is similar to what you have already explored, but may provide some validation. I recently experimented with using NER results from Stanford CoreNLP in the sentence parse objects. In cases where better named-entity info is added to the sentence parse, coref performance improves. Hope this helps in some way.
Ant

----- Reply message -----
From: "Jim - FooBar();" <[email protected]>
To: <[email protected]>
Subject: coref results seem weird
Date: Sat, Mar 2, 2013 9:10 am

see this old message by Jorn... check the next message in the thread as well. At some point he posts detailed code... it might be easier to put together a dummy project in order to use the API, which is more flexible...

Jim

http://mail-archives.apache.org/mod_mbox/opennlp-users/201112.mbox/%[email protected]%3E

On 02/03/13 16:05, James Kosin wrote:

I'm trying to use the CLI. I have other code in another project that loads the dictionary properly, using the property file to specify the information. Coreference does it differently...

On 3/2/2013 6:32 AM, Jim - FooBar(); wrote:

I should be able to help you... are you going through the CLI or the API? I bet there is something wrong with the Wordnet directory you're passing... I had similar issues... If you're using the API, let me know and I'll send you a code snippet that may help...

Jim

On 02/03/13 05:17, James Kosin wrote:

Jim,

I can't seem to get past the NULL pointer exceptions when Coreferencer is trying to load the dictionaries. So, this will be much later now. I'm going to sleep and play tooth fairy.

James

On 3/1/2013 8:18 AM, Jim - FooBar(); wrote:

Like you, I'm using the latest Wordnet and JWNL (1.4 RC_3 is on Maven, you don't need to build it from source).... Now that you've set up your end, could you please perform a run on the standard example sentence? In addition, could you try to add the named entities to the parse tree? If yes, please post your results here for comparison with mine?

thanks a lot,

Jim

On 01/03/13 02:21, James Kosin wrote:

Hi Jim,

What version of the JWNL and WordNet dictionaries are you using? I never got much further than researching what it is used for, and its importance to handling the task. I've just updated my end for the 3.1 WordNet dictionaries. But, I'm also using 1.4_rc3 from sources to build JWNL.
The extJWNL seems to be more apt at handling more types of dictionaries (supporting UTF-8 and others), and at actually creating and modifying them as well; which isn't needed when we really only want read access.

James

On 2/28/2013 4:49 AM, Jim foo.bar wrote:

Hi James,

thanks for your reply and your comments, but that is not quite what I asked... I've looked at all the web resources related to the opennlp coref component, otherwise I would never have gotten it to work! My problem is about the results it brings back; in particular, I'd like to compare my produced discourse entities with someone else's on the same piece of text. Since I'm working in a language other than Java, that would confirm that my code is at least correct. On a secondary note, I'd like to see how to insert the named entities into the parse tree before deploying the TreebankLinker. I followed the instructions posted by Jorn sometime last year, but I'm not sure what the output should look like. That is why I posted what I'm getting... Can you see any 'person' named entities in my DiscourseEntities? More importantly, if you run the coref component on the standard example sentence (Pierre Vinken, ...) what do you get? Could you post the exact output? Whoever posted this: http://blog.dpdearing.com/2012/11/making-coreference-resolution-with-opennlp-1-5-0-your-bitch/ did not try to insert any NEs into the parse tree. In addition, his output is slightly different than mine... I don't know if that is because of a newer version of JWNL.jar that I'm using, or something else...

Jim

On 28/02/13 02:51, James Kosin wrote:

Jim,

Here is a place to start, with maybe some more examples: http://stackoverflow.com/questions/8629737/coreference-resolution-using-opennlp

James

On 2/27/2013 1:26 PM, Jim - FooBar(); wrote:

Hmmm.... interesting! When I run it on these 2 simple sentences: "Mary likes pizza but she also likes kebabs. Knowing her, I'd give it 2 weeks before she turns massive!" I get perfect results!
#<DiscourseEntity [ Mary, she, her, she ]>

this demonstrates 3 things:
- my understanding of coref is indeed correct
- the coref component can link entities from separate sentences
- possibly that my code is fine

any thoughts?

Jim

On 27/02/13 18:14, Jim - FooBar(); wrote:

Hi all,

I finally managed to get coref working (phew! - my god that was tricky) but I'm slightly confused with the results, so I'd like to see if anyone else has tried it out... Using the standard paragraph used in the other examples:

"Pierre Vinken, 61 years old, will join the board as a nonexecutive director Nov. 29. Mr. Vinken is chairman of Elsevier N.V., the Dutch publishing group. Rudolph Agnew, 55 years old and former chairman of Consolidated Gold Fields PLC, was named a director of this British industrial conglomerate."

deploying the coref component gives me the following. I must note that I'm trying to pass the named entities as well (person). I've confirmed that the spans are correctly identified (3 spans for this particular example) and added to the parse tree via opennlp.tools.parser.Parse.addNames("person", span, parse.getTagNodes());

[#<DiscourseEntity [ this British industrial conglomerate ]>,
#<DiscourseEntity [ a director of this British industrial conglomerate ]>,
#<DiscourseEntity [ Consolidated Gold Fields PLC ]>,
#<DiscourseEntity [ chairman of Elsevier N . V . , the Dutch publishing group, former chairman of Consolidated Gold Fields PLC ]>,
#<DiscourseEntity [ 55 years ]>,
#<DiscourseEntity [ Rudolph Agnew , 55 years old and former chairman of Consolidated Gold Fields PLC , was named a director of this British industrial conglomerate . ]>,
#<DiscourseEntity [ Elsevier N . V . , the Dutch publishing group, the Dutch publishing group ]>,
#<DiscourseEntity [ Mr . Vinken ]>,
#<DiscourseEntity [ a nonexecutive director Nov . 29 ]>,
#<DiscourseEntity [ the board ]>,
#<DiscourseEntity [ 61 years ]>,
#<DiscourseEntity [ Pierre Vinken , 61 years old ]> ]

*filtering for more than 1 mention (per Jorn's suggestion) gives back:*

[#<DiscourseEntity [ chairman of Elsevier N . V . , the Dutch publishing group, former chairman of Consolidated Gold Fields PLC ]>
#<DiscourseEntity [ Elsevier N . V . , the Dutch publishing group, the Dutch publishing group ]> ]

Assuming that this is what it's supposed to output, can someone explain it? First of all, where are the named entities? Secondly, both of the 2 filtered DiscourseEntities seem plain wrong! Moreover, where is #<DiscourseEntity [ Rudolph Agnew, former chairman of Consolidated Gold Fields PLC, the Dutch publishing group, director of this British industrial conglomerate ]> ??? Either I'm not understanding coreference, or I've coded the thing wrong, or the model is not very good! Which one is it? Has anyone else attempted this? Can we compare results on this particular sentence?

thanks in advance :)

Jim

ps: my code is in Clojure but it is based on a code snippet provided by Jorn to someone on the mailing list last year. I can easily provide it but I don't think it will be of much help...
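A note for anyone landing on this thread later: the "filtering for more than 1 mention" step Jorn suggested just means dropping the singleton DiscourseEntities, since an entity with a single mention was never linked to anything. Below is a minimal sketch of that filter with plain Java collections standing in for the OpenNLP types (the real code would query each DiscourseEntity's mention count via the opennlp-coref API - check the javadoc for the exact accessor name):

```java
import java.util.*;

public class CorefChainFilter {

    // An entity with a single mention was never linked to anything, so
    // only sets with two or more mentions are real coreference chains.
    static List<List<String>> chainsOnly(List<List<String>> entities) {
        List<List<String>> chains = new ArrayList<>();
        for (List<String> mentions : entities) {
            if (mentions.size() > 1) {
                chains.add(mentions);
            }
        }
        return chains;
    }

    public static void main(String[] args) {
        // Hard-coded mention sets, abridged from the thread's output.
        List<List<String>> entities = Arrays.asList(
            Arrays.asList("the board"),
            Arrays.asList("61 years"),
            Arrays.asList("Pierre Vinken", "Mr. Vinken"),
            Arrays.asList("Elsevier N.V.", "the Dutch publishing group"));
        // Only the two multi-mention chains survive the filter.
        System.out.println(chainsOnly(entities));
    }
}
```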
