Hi Cristian,

Interesting ideas. Let me do some background reading on this, so I can also
participate in the discussion better.
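
As a starting point for that discussion, the rule-based option (3.1) from
the original proposal could be sketched roughly as below. This is only a
minimal sketch in Python: the Frame class, the VERB_SYNONYMS and RULES
tables, the entity type names and the role names are all hypothetical and
not part of Stanbol or of any SRL library. It assumes an earlier SRL/NER
step has already produced the verb lemma and the entity types.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class Frame:
    """Hypothetical, simplified SRL output for one sentence."""
    verb: str          # lemma of the predicate, e.g. "buy"
    agent_type: str    # NER type of the agent, e.g. "ORGANIZATION"
    patient_type: str  # NER type of the patient


# Hypothetical synonym table: maps verb lemmas to one canonical verb so that
# "buy" and "acquire" end up on the same rule.
VERB_SYNONYMS = {"buy": "acquire", "purchase": "acquire", "acquire": "acquire"}

# Hand-written rules (assumed for illustration):
# (canonical verb, agent type, patient type) -> (event type, agent role, patient role)
RULES = {
    ("acquire", "ORGANIZATION", "ORGANIZATION"): ("Acquisition", "buyer", "bought"),
}


def classify(frame: Frame) -> Optional[Tuple[str, str, str]]:
    """Return (event type, agent role, patient role), or None if no rule matches."""
    canonical = VERB_SYNONYMS.get(frame.verb, frame.verb)
    return RULES.get((canonical, frame.agent_type, frame.patient_type))


# "Google buys Yahoo for $100 million", after SRL and NER
example = classify(Frame("buy", "ORGANIZATION", "ORGANIZATION"))
```

A frame that matches no rule simply yields None, which illustrates the
inflexibility mentioned for this option: every new event type or verb needs
another hand-written entry.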

Thanks,
Dileepa

On Wed, Sep 9, 2015 at 3:17 PM, Cristian Petroaca <
cristian.petro...@gmail.com> wrote:

> Another approach to this would be to use a semantic role labeling tool [1]
> to determine the type of relation between the subject and object.
>
> Or we could use Word Sense Disambiguation to determine the WordNet class of
> the verb (this way we have a standard relation definition) and, based on
> the relation type, search for the subject and object using
> dependency tree parsing in Stanford NLP.
>
> These two options should give us much higher recall, but I'm not
> sure about the precision...
>
> So I think we'll need to first settle on the method of implementing this
> engine before starting anything.
>
> [1] http://cogcomp.cs.illinois.edu/page/demo_view/srl
>
> On Tue, Sep 8, 2015 at 11:45 AM, Cristian Petroaca <
> cristian.petro...@gmail.com> wrote:
>
> > Hi Dileepa,
> >
> > Unfortunately I have not had time to work on this at all, so there is
> > no code base. But I'd be happy to start contributing something to this
> > engine, and it would be very helpful if you could contribute as well.
> > I did get a chance to test the Stanford relation extractor, which works
> > fine but is limited to a handful of relation types (live_in,
> > located_in, org_based_in, work_for). So we would need to train other
> > models if we want to increase the number of relation types.
> > I also think that the Event Extraction Engine should work in conjunction
> > with any coreference and comention engines we have, to increase the
> > relation count.
> >
> > Regards,
> > Cristian
> >
> > On Tue, Sep 8, 2015 at 11:19 AM, Dileepa Jayakody <
> > dileepajayak...@gmail.com> wrote:
> >
> >> Hi Cristian and all,
> >>
> >> Can I please know the status of this event extraction engine? Event
> >> extraction is a really useful feature for semantic enhancements and I am
> >> interested in collaborating on this work.
> >> Is there a code base you are currently working on for this engine?
> >>
> >> Thanks,
> >> Dileepa
> >>
> >> On Tue, Feb 17, 2015 at 9:10 PM, Cristian Petroaca <
> >> cristian.petro...@gmail.com> wrote:
> >>
> >> > Hi Edi,
> >> >
> >> > Thanks for the info. Stanford Relation Extractor sounds very
> >> > interesting. I'll give it a try.
> >> >
> >> > 2015-02-17 17:00 GMT+02:00 Edi Bice <edi_b...@yahoo.com.invalid>:
> >> >
> >> > > Hi Cristian,
> >> > > Here are a few more resources on Semantic Role/Relationship Labeling:
> >> > > 1. FrameNet, VerbNet and WordNet on the data side
> >> > > 2. Shalmaneser, SEMAFOR and Stanford Relation Extractor on the code side
> >> > > The last one links to a great paper which I believe holds great
> >> > > potential for Stanbol:
> >> > > A Linear Programming Formulation for Global Inference in Natural
> >> > > Language Tasks
> >> > >
> >> > > (View on www.cnts.ua.ac.be)
> >> > >
> >> > >
> >> > > Edi
> >> > >       From: Cristian Petroaca <cristian.petro...@gmail.com>
> >> > >  To: dev@stanbol.apache.org
> >> > >  Sent: Sunday, February 15, 2015 6:34 AM
> >> > >  Subject: Event Extraction Engine
> >> > >
> >> > > Hi All,
> >> > >
> >> > > Quite a while ago I started a discussion on this list about Event
> >> > > Extraction from text. See
> >> > > https://issues.apache.org/jira/browse/STANBOL-1121
> >> > >
> >> > > I'd like to get started on the actual work. I have been thinking about
> >> > > how best to approach this, and there are some things that I would do
> >> > > differently from what the JIRA describes. I'd like to get your feedback
> >> > > on it.
> >> > >
> >> > > Basically the main approach would be:
> >> > >
> >> > > 1. Detect all NERs and their co-references.
> >> > >
> >> > > 2. Apply semantic role labeling on the sentences where the
> >> > > above-mentioned NERs reside.
> >> > > I found some interesting semantic role labeling libraries such as
> >> > > https://code.google.com/p/mate-tools/ or
> >> > > http://cogcomp.cs.illinois.edu/page/software_view/SRL.
> >> > > With this I'll be able to detect the Agent, the Verb (action), the
> >> > > Patient and the Instruments.
> >> > >
> >> > > This could be a minimal implementation of the engine. After that I can
> >> > > simply create the event data model as described in the JIRA and
> >> > > annotate the text.
> >> > > But this does not actually detect what kind of event it is or what
> >> > > event-specific roles the entities have in the relation.
> >> > >
> >> > > For example, take the sentence "Google buys Yahoo for $100 million".
> >> > > There is a lot more to be said about this sentence than simply that
> >> > > "Google" is the Agent and "Yahoo" is the Patient. This is actually an
> >> > > acquisition event in which "Google" is the buyer and "Yahoo" the
> >> > > bought entity.
> >> > > We would also need to align synonymous verbs such as "buy" and
> >> > > "acquire" to a common ontology, so that we know both refer to the same
> >> > > Acquisition event.
> >> > >
> >> > > Having said that, we would add a new step:
> >> > > 3. Try to detect event type and event details.
> >> > >
> >> > > This can be done by either:
> >> > >
> >> > > 3.1 Rule based: hand-written rules which would map a certain sentence
> >> > > structure, such as the verb and the types of the entities acting as
> >> > > agent and patient, to a certain event type.
> >> > > This has the benefit of being easy to build, but it is quite inflexible.
> >> > >
> >> > > 3.2 Statistical based: train a model which would be able to classify
> >> > > an event type based on features of the sentence such as verb type,
> >> > > entity type, role type, etc. This is the approach described here:
> >> > > http://web.stanford.edu/~jurafsky/mintz.pdf.
> >> > > This would be quite hard to build but quite flexible.
> >> > >
> >> > > I think this 3rd step of detecting event types and details would be
> >> > > most efficient for domain-specific events. We would have configs with
> >> > > several models for several domains available, and the user could
> >> > > either use one of the pre-existing models or create a new one.
> >> > >
> >> > > I don't have any practical experience with training models or with
> >> > > feature-based text classification (but I've been doing a lot of
> >> > > reading on it), so I'm not sure exactly how feasible what I described
> >> > > in step 3 actually is.
> >> > >
> >> > > Regards,
> >> > > Cristian
> >> > >
> >> > >
> >> > >
> >> > >
> >> >
> >>
> >
> >
>
