Hi Dileepa,

Unfortunately I have not had time to work on this at all, so there is no
code base yet. I'd be happy to start contributing to this engine, though,
and it would be very helpful if you could contribute as well.
I did get a chance to test the Stanford relation extractor. It works fine,
but it is limited to a handful of relation types (live_in, located_in,
org_based_in, work_for), so we would need to train other models if we want
to cover more relation types.
I also think that the Event Extraction Engine should work in conjunction
with any coreference and co-mention engines we have, to increase the number
of relations detected.

Regards,
Cristian

On Tue, Sep 8, 2015 at 11:19 AM, Dileepa Jayakody <dileepajayak...@gmail.com
> wrote:

> Hi Cristian and all,
>
> Can I please know the status of this event extraction engine? Event
> extraction is a really useful feature for semantic enhancements and I am
> interested in collaborating with this work.
> Is there any code base you are currently working on for this engine work?
>
> Thanks,
> Dileepa
>
> On Tue, Feb 17, 2015 at 9:10 PM, Cristian Petroaca <
> cristian.petro...@gmail.com> wrote:
>
> > Hi Edi,
> >
> > Thanks for the info. Stanford Relation Extractor sounds very interesting.
> > I'll give it a try.
> >
> > 2015-02-17 17:00 GMT+02:00 Edi Bice <edi_b...@yahoo.com.invalid>:
> >
> > > Hi Cristian,
> > > Here are a few more resources on Semantic Role/Relationship Labeling:
> > > 1. FrameNet, VerbNet and WordNet on the data side
> > > 2. Shalmaneser, SEMAFOR and Stanford Relation Extractor on the code side
> > > The last one links to a great paper which I believe holds great potential
> > > for Stanbol: "A Linear Programming Formulation for Global Inference in
> > > Natural Language Tasks" (viewable on www.cnts.ua.ac.be).
> > >
> > >
> > >
> > > Edi
> > >       From: Cristian Petroaca <cristian.petro...@gmail.com>
> > >  To: dev@stanbol.apache.org
> > >  Sent: Sunday, February 15, 2015 6:34 AM
> > >  Subject: Event Extraction Engine
> > >
> > > Hi All,
> > >
> > > Quite a while ago I started a discussion on this list about Event
> > > Extraction from text. See
> > > https://issues.apache.org/jira/browse/STANBOL-1121
> > > .
> > >
> > > I'd like to get started on the actual work. I have been thinking about
> > > how best to approach this, and there are some things I would do
> > > differently from what the JIRA describes. I'd like to get your feedback.
> > >
> > > Basically the main approach would be:
> > >
> > > 1. Detect all NERs and their co-references.
> > >
> > > 2. Apply semantic role labeling on the sentences where the
> > > above-mentioned NERs reside.
> > > I found some interesting Semantic Role labeling libraries such as
> > > https://code.google.com/p/mate-tools/ or
> > > http://cogcomp.cs.illinois.edu/page/software_view/SRL.
> > > With this I'll be able to detect the Agent, the Verb (action) and the
> > > Patient and Instruments.
> > >
> > > This could be a minimal implementation of the engine. After that I can
> > > simply create the event data model as described in the JIRA and annotate
> > > the text.
> > > But this does not actually detect what kind of event it is, or what
> > > event-specific roles the entities have in the relation.
> > >
> > > For example, take the sentence "Google buys Yahoo for $100 million".
> > > There is a lot more to be said about this sentence than simply that
> > > "Google" is the Agent and "Yahoo" is the Patient. This is actually an
> > > acquisition event in which "Google" is the buyer and "Yahoo" the bought
> > > entity.
> > > We would also need to align synonymous phrases such as "buy" and
> > > "acquire" to a common ontology, so that we know both refer to the same
> > > Acquisition event.
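
To make the "Google buys Yahoo" example concrete, here is a minimal sketch of
how a generic Agent/Verb/Patient frame could be turned into a typed event. It
assumes the SRL step has already produced the verb lemma and the two role
fillers. The names Frame, VERB_TO_EVENT, EVENT_ROLES and to_event are all
hypothetical, and the small dictionaries only stand in for a real ontology
alignment; none of this is taken from the JIRA or from an existing library.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Frame:
    """Hypothetical SRL output for one sentence."""
    verb: str      # lemma of the predicate, e.g. "buy"
    agent: str     # filler of the generic Agent role
    patient: str   # filler of the generic Patient role


# Assumed alignment table: synonymous verbs map to one event type.
VERB_TO_EVENT = {
    "buy": "Acquisition",
    "acquire": "Acquisition",
    "purchase": "Acquisition",
}

# Assumed event-specific names for the generic roles of each event type.
EVENT_ROLES = {
    "Acquisition": {"agent": "buyer", "patient": "bought_entity"},
}


def to_event(frame):
    """Return a typed event as a dict, or None when no rule matches the verb."""
    event_type = VERB_TO_EVENT.get(frame.verb.lower())
    if event_type is None:
        return None
    roles = EVENT_ROLES[event_type]
    return {
        "type": event_type,
        roles["agent"]: frame.agent,
        roles["patient"]: frame.patient,
    }


# Both verbs resolve to the same Acquisition event type.
event = to_event(Frame(verb="buy", agent="Google", patient="Yahoo"))
same_type = to_event(Frame(verb="acquire", agent="Google", patient="Yahoo"))
```

A table lookup like this is roughly what a hand-written rule set would amount
to: simple to write, but every new verb or event type needs a new entry.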
> > >
> > > Having said that, we would add a new step:
> > > 3. Try to detect event type and event details.
> > >
> > > This can be done by either:
> > >
> > > 3.1 Rule-based: hand-written rules which would map a certain sentence
> > > structure, such as the verb and the types of the entities acting as
> > > agent and patient, to a certain event type.
> > > This has the benefit of being easy to build, but it is quite inflexible.
> > >
> > > 3.2 Statistical: train a model which would be able to classify an
> > > event type based on features of the sentence such as verb type, entity
> > > type, role type, etc. This is the approach described here:
> > > http://web.stanford.edu/~jurafsky/mintz.pdf.
> > > This would be quite hard to build but quite flexible.
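
As a rough illustration of the statistical option, the sketch below classifies
an event type from verb, agent-type and patient-type features. It uses a tiny
multinomial Naive Bayes with add-one smoothing, chosen only because it fits in
a few lines of plain Python; it is not the method of the paper linked above.
The function features, the class NaiveBayes, the labels and the four training
examples are all made up for illustration.

```python
import math
from collections import Counter, defaultdict


def features(verb, agent_type, patient_type):
    """Turn one SRL frame into a list of string features."""
    return [f"verb={verb}", f"agent={agent_type}", f"patient={patient_type}"]


class NaiveBayes:
    """Multinomial Naive Bayes with add-one (Laplace) smoothing."""

    def __init__(self):
        self.label_counts = Counter()               # examples per label
        self.feature_counts = defaultdict(Counter)  # label -> feature -> count
        self.vocab = set()                          # all features seen in training

    def fit(self, examples):
        for feats, label in examples:
            self.label_counts[label] += 1
            for feat in feats:
                self.feature_counts[label][feat] += 1
                self.vocab.add(feat)
        return self

    def predict(self, feats):
        total = sum(self.label_counts.values())
        best_label, best_score = None, -math.inf
        for label, count in self.label_counts.items():
            # Log prior plus smoothed log likelihood of each feature.
            score = math.log(count / total)
            denom = sum(self.feature_counts[label].values()) + len(self.vocab)
            for feat in feats:
                score += math.log((self.feature_counts[label][feat] + 1) / denom)
            if score > best_score:
                best_label, best_score = label, score
        return best_label


# Toy, hand-made training data (an assumption, not real annotations).
TRAIN = [
    (features("buy", "ORG", "ORG"), "Acquisition"),
    (features("acquire", "ORG", "ORG"), "Acquisition"),
    (features("hire", "ORG", "PERSON"), "Employment"),
    (features("employ", "ORG", "PERSON"), "Employment"),
]

model = NaiveBayes().fit(TRAIN)
seen_verb = model.predict(features("buy", "ORG", "ORG"))
unseen_verb = model.predict(features("purchase", "ORG", "PERSON"))
```

The second call shows where the flexibility would come from: the verb
"purchase" never appears in the toy training data, so the prediction falls
back on the entity-type features alone.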
> > >
> > > I think this third step of detecting event types and details would
> > > work best for domain-specific events. We would have configs with
> > > several models for several domains available, and the user could either
> > > use one of the pre-existing models or create a new one.
> > >
> > > I don't have any practical experience with training models or with
> > > feature-based text classification (but I've been doing a lot of reading
> > > on it), so I'm not sure exactly how feasible what I said in point 3
> > > actually is.
> > >
> > > Regards,
> > > Cristian
> > >
> > >
> > >
> > >
> >
>
