Re: Event Extraction Engine

Cristian Petroaca Tue, 17 Feb 2015 07:41:57 -0800

Hi Edi,

Thanks for the info. Stanford Relation Extractor sounds very interesting.
I'll give it a try.


2015-02-17 17:00 GMT+02:00 Edi Bice <[email protected]>:

> Hi Cristian,
> Here are a few more resources on Semantic Role/Relationship Labeling:
> 1. FrameNet, VerbNet and WordNet on the data side
> 2. Shalmaneser, SEMAFOR and Stanford Relation Extractor on the code side
> The last one links to a great paper which I believe holds great potential
> for Stanbol:
> A Linear Programming Formulation for Global Inference in Natural Language
> Tasks
>
>
> Edi
>       From: Cristian Petroaca <[email protected]>
>  To: [email protected]
>  Sent: Sunday, February 15, 2015 6:34 AM
>  Subject: Event Extraction Engine
>
> Hi All,
>
> Quite a while ago I started a discussion on this list about Event
> Extraction from text. See
> https://issues.apache.org/jira/browse/STANBOL-1121
> .
>
> I'd like to get started on the actual work. I have been thinking about how
> best to approach it, and there are some things I would do differently from
> what the JIRA describes. I'd like to get your feedback on it.
>
> Basically the main approach would be:
>
> 1. Detect all named entities and their co-references.
>
> 2. Apply semantic role labeling on the sentences where those named
> entities occur.
> I found some interesting Semantic Role labeling libraries such as
> https://code.google.com/p/mate-tools/ or
> http://cogcomp.cs.illinois.edu/page/software_view/SRL.
> With this I'll be able to detect the Agent, the Verb (action) and the
> Patient and Instruments.
>
> This could be a minimal implementation of the engine. After that I can
> simply create the event data model as described in the JIRA and annotate
> the text.
> But this does not actually detect what kind of event it is or what
> event-specific roles the entities have in the relation.
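A minimal sketch of steps 1 and 2 feeding an event record, in Python. The
Event class, the role names "agent", "patient" and "instrument", and the
event_from_frame helper are all hypothetical; real SRL libraries use their own
label sets and output formats, and the data model in the JIRA issue may differ.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Set, Tuple


@dataclass
class Event:
    # Hypothetical minimal record, not the data model from STANBOL-1121.
    verb: str
    agent: Optional[str] = None
    patient: Optional[str] = None
    instruments: List[str] = field(default_factory=list)


def event_from_frame(verb: str,
                     roles: List[Tuple[str, str]],
                     entities: Set[str]) -> Optional[Event]:
    """Build an Event from one SRL frame.

    Returns None when neither the agent nor the patient is a detected
    named entity.
    """
    event = Event(verb=verb)
    for role, text in roles:
        if role == "agent":
            event.agent = text
        elif role == "patient":
            event.patient = text
        elif role == "instrument":
            event.instruments.append(text)
    # Keep only frames anchored on at least one entity found in step 1.
    if event.agent in entities or event.patient in entities:
        return event
    return None


# Assumed SRL output for "Google buys Yahoo for $100 million".
frame_roles = [("agent", "Google"), ("patient", "Yahoo")]
event = event_from_frame("buys", frame_roles, {"Google", "Yahoo"})
print(event)
```

The record only says who did what to whom; it carries no event type, which is
the gap step 3 is meant to close.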
>
> For example we can have the sentence "Google buys Yahoo for $100 million".
> There is a lot more to be said about this sentence than simply that
> "Google" is the agent and "Yahoo" is the Patient. This is actually an
> acquisition event and "Google" is the buyer and "Yahoo" the bought entity.
> We would also need to map synonymous phrases such as "buy" or "acquire" to
> a common ontology so that we know that both refer to the same Acquisition
> event.
>
> Having said that, we would add a new step:
> 3. Try to detect event type and event details.
>
> This can be done by either:
>
> 3.1 Rule-based: hand-written rules which would map a certain sentence
> structure, such as the verb and the entity types of the agent and patient,
> to a certain event type.
> This has the benefit of being easy to build, but it is quite inflexible.
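The rule-based option could look roughly like the Python sketch below. Both
lookup tables, the entity type label "ORGANIZATION" and the role names "buyer"
and "bought" are assumptions made up for illustration; they are not taken from
any existing ontology or from Stanbol.

```python
from typing import Dict, Optional, Tuple

# Hypothetical table: verb lemma -> canonical event type.
# Synonyms such as "buy" and "acquire" share one entry value.
VERB_TO_EVENT: Dict[str, str] = {
    "buy": "Acquisition",
    "acquire": "Acquisition",
    "purchase": "Acquisition",
}

# Hypothetical table:
# (event type, agent entity type, patient entity type)
#   -> (event-specific role of the agent, event-specific role of the patient)
ROLE_RULES: Dict[Tuple[str, str, str], Tuple[str, str]] = {
    ("Acquisition", "ORGANIZATION", "ORGANIZATION"): ("buyer", "bought"),
}


def classify(verb_lemma: str,
             agent: str, agent_type: str,
             patient: str, patient_type: str) -> Optional[Dict[str, str]]:
    """Return a typed event, or None when no hand-written rule matches."""
    event_type = VERB_TO_EVENT.get(verb_lemma)
    if event_type is None:
        return None
    roles = ROLE_RULES.get((event_type, agent_type, patient_type))
    if roles is None:
        return None
    agent_role, patient_role = roles
    return {"type": event_type, agent_role: agent, patient_role: patient}


result = classify("buy", "Google", "ORGANIZATION", "Yahoo", "ORGANIZATION")
print(result)
```

Every new verb or entity type combination needs a new table entry, which is
where the inflexibility shows up.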
>
> 3.2 Statistical: train a model which would be able to classify an
> event type based on the features of the sentence such as verb type, entity
> type, role type, etc. This is the approach described here:
> http://web.stanford.edu/~jurafsky/mintz.pdf.
> This would be quite hard to build but quite flexible.
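To make the statistical option concrete, here is a toy Python sketch of
feature-based event classification. It uses a small hand-rolled Naive Bayes
classifier with add-one smoothing as a stand-in; it does not reproduce the
method of the linked paper. The feature names, labels and the four training
examples are invented for illustration.

```python
import math
from collections import Counter, defaultdict
from typing import Dict, List, Tuple

Features = Dict[str, str]


def featurize(verb: str, agent_type: str, patient_type: str) -> Features:
    """Turn one SRL frame into categorical features (assumed feature set)."""
    return {"verb": verb, "agent_type": agent_type, "patient_type": patient_type}


class NaiveBayes:
    """Categorical Naive Bayes with add-one smoothing."""

    def __init__(self) -> None:
        self.label_counts: Counter = Counter()
        self.feature_counts = defaultdict(Counter)  # label -> (name, value) counts
        self.vocab = set()                          # all (name, value) pairs seen

    def fit(self, data: List[Tuple[Features, str]]) -> None:
        for features, label in data:
            self.label_counts[label] += 1
            for item in features.items():
                self.feature_counts[label][item] += 1
                self.vocab.add(item)

    def predict(self, features: Features) -> str:
        total = sum(self.label_counts.values())
        best_label, best_score = "", -math.inf
        for label, count in self.label_counts.items():
            score = math.log(count / total)  # log prior
            denom = sum(self.feature_counts[label].values()) + len(self.vocab)
            for item in features.items():
                score += math.log((self.feature_counts[label][item] + 1) / denom)
            if score > best_score:
                best_label, best_score = label, score
        return best_label


# Invented toy training set.
train = [
    (featurize("buy", "ORG", "ORG"), "Acquisition"),
    (featurize("acquire", "ORG", "ORG"), "Acquisition"),
    (featurize("hire", "ORG", "PERSON"), "Hiring"),
    (featurize("appoint", "ORG", "PERSON"), "Hiring"),
]
model = NaiveBayes()
model.fit(train)

# "purchase" never appears in training; the entity types still carry signal.
print(model.predict(featurize("purchase", "ORG", "ORG")))
```

Unlike the rule tables, the model can still make a guess for an unseen verb,
but it needs labeled examples for every event type it should recognize.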
>
> I think this third step, detecting event types and details, would be most
> effective for domain-specific events. We would have configs with models for
> several domains available, and the user could either use one of the
> existing models or create a new one.
>
> I don't have any practical experience with training models or text
> classification based on features (but I've been doing a lot of reading on
> it), so I'm not sure exactly how feasible step 3 actually is.
>
> Regards,
> Cristian
>
>
>
>
