Re: Relation extraction feature

Fabian Christ Fri, 19 Jul 2013 07:33:13 -0700

Hi Cristian,

since you are not (yet) a committer for Stanbol, you cannot be assigned to
Jira issues.


But you are very welcome to work on that issue and provide a patch that you
can upload and attach to the issue. A committer will then have a look at
the patch and apply it to the code base if it is okay.

Best,
- Fabian
On 19.07.2013 13:35, "Cristian Petroaca" <[email protected]> wrote:

> It seems I don't have the possibility to change the Jiras I created to
> assign them to me and to put them in progress. Can you help me with that?
> I started with this one:
> https://issues.apache.org/jira/browse/STANBOL-1132
>
> 2013/7/9 Cristian Petroaca <[email protected]>
>
> > Thanks. I'll let you know.
> >
> > Cristian
> >
> >
> > 2013/7/5 Rupert Westenthaler <[email protected]>
> >
> >> Hi Cristian,
> >>
> >> I created the branch at
> >>
> >> http://svn.apache.org/repos/asf/stanbol/branches/nlp-dep-tree-and-co-ref/
> >>
> >> ATM it contains only the "nlp" and "nlp-json" modules. Let me know if
> >> you would like to have more.
> >>
> >> best
> >> Rupert
> >>
> >>
> >>
> >> On Thu, Jul 4, 2013 at 10:14 AM, Cristian Petroaca
> >> <[email protected]> wrote:
> >> > Hi Rupert,
> >> >
> >> > I created jiras: https://issues.apache.org/jira/browse/STANBOL-1132 and
> >> > https://issues.apache.org/jira/browse/STANBOL-1133. The original one is
> >> > dependent upon these.
> >> > Please let me know when I can start using the branch.
> >> >
> >> > Thanks,
> >> > Cristian
> >> >
> >> >
> >> > 2013/6/27 Cristian Petroaca <[email protected]>
> >> >
> >> >>
> >> >>
> >> >>
> >> >> 2013/6/27 Rupert Westenthaler <[email protected]>
> >> >>
> >> >>> On Thu, Jun 27, 2013 at 3:12 PM, Cristian Petroaca
> >> >>> <[email protected]> wrote:
> >> >>> > Sorry, I meant the Stanbol NLP API, not Stanford in my previous e-mail.
> >> >>> > By the way, does OpenNLP have the ability to build dependency trees?
> >> >>> >
> >> >>>
> >> >>> AFAIK OpenNLP does not provide this feature.
> >> >>>
> >> >>
> >> >> Then, since the Stanford NLP lib is also integrated into Stanbol, I'll
> >> >> take a look at how I can extend its integration to include the
> >> >> dependency tree feature.
> >> >>
> >> >>>
> >> >>>
> >> >>  >
> >> >>> > 2013/6/23 Cristian Petroaca <[email protected]>
> >> >>> >
> >> >>> >> Hi Rupert,
> >> >>> >>
> >> >>> >> I created jira https://issues.apache.org/jira/browse/STANBOL-1121.
> >> >>> >> As you suggested I would start with extending the Stanford NLP with
> >> >>> >> co-reference resolution, but I think also with dependency trees
> >> >>> >> because I also need to know the Subject of the sentence and the
> >> >>> >> object that it affects, right?
> >> >>> >>
> >> >>> >> Given that I need to extend the Stanford NLP API in Stanbol for
> >> >>> >> co-reference and dependency trees, how do I proceed with this? Do I
> >> >>> >> create 2 new sub-tasks to the already opened Jira? After that can I
> >> >>> >> start implementing on my local copy of Stanbol and when I'm done
> >> >>> >> I'll send you guys the patch for review?
> >> >>> >>
> >> >>>
> >> >>> I would create two "New Feature" type Issues one for adding support
> >> >>> for "dependency trees" and the other for "co-reference" support. You
> >> >>> should also define "depends on" relations between STANBOL-1121 and
> >> >>> those two new issues.
> >> >>>
> >> >>> Sub-task could also work, but as adding those features would be also
> >> >>> interesting for other things I would rather define them as separate
> >> >>> issues.
> >> >>>
> >> >>>
> >> >> 2 New Features connected with the original jira it is then.
> >> >>
> >> >>
> >> >>> If you would prefer to work in your own branch please tell me. This
> >> >>> could have the advantage that patches would not be affected by
> >> >>> changes in the trunk.
> >> >>>
> >> >> Yes, a separate branch sounds good.
> >> >>
> >> >>> best
> >> >>> Rupert
> >> >>>
> >> >>> >> Regards,
> >> >>> >> Cristian
> >> >>> >>
> >> >>> >>
> >> >>> >> 2013/6/18 Rupert Westenthaler <[email protected]>
> >> >>> >>
> >> >>> >>> On Mon, Jun 17, 2013 at 10:18 PM, Cristian Petroaca
> >> >>> >>> <[email protected]> wrote:
> >> >>> >>> > Hi Rupert,
> >> >>> >>> >
> >> >>> >>> > Agreed on the
> >> >>> >>> > SettingAnnotation/ParticipantAnnotation/OccurrentAnnotation
> >> >>> >>> > data structure.
> >> >>> >>> >
> >> >>> >>> > Should I open up a Jira for all of this in order to encapsulate
> >> >>> >>> > this information and establish the goals and these initial steps
> >> >>> >>> > towards these goals?
> >> >>> >>>
> >> >>> >>> Yes please. A JIRA issue for this work would be great.
> >> >>> >>>
> >> >>> >>> > How should I proceed further? Should I create some design
> >> >>> >>> > documents that need to be reviewed?
> >> >>> >>>
> >> >>> >>> Usually it is best to write design related text directly in JIRA
> >> >>> >>> by using Markdown [1] syntax. This will allow us later to use this
> >> >>> >>> text directly for the documentation on the Stanbol Webpage.
> >> >>> >>>
> >> >>> >>> best
> >> >>> >>> Rupert
> >> >>> >>>
> >> >>> >>>
> >> >>> >>> [1] http://daringfireball.net/projects/markdown/
> >> >>> >>> >
> >> >>> >>> > Regards,
> >> >>> >>> > Cristian
> >> >>> >>> >
> >> >>> >>> >
> >> >>> >>> > 2013/6/17 Rupert Westenthaler <[email protected]>
> >> >>> >>> >
> >> >>> >>> >> On Thu, Jun 13, 2013 at 8:22 PM, Cristian Petroaca
> >> >>> >>> >> <[email protected]> wrote:
> >> >>> >>> >> > HI Rupert,
> >> >>> >>> >> >
> >> >>> >>> >> > First of all thanks for the detailed suggestions.
> >> >>> >>> >> >
> >> >>> >>> >> > 2013/6/12 Rupert Westenthaler <[email protected]>
> >> >>> >>> >> >
> >> >>> >>> >> >> Hi Cristian, all
> >> >>> >>> >> >>
> >> >>> >>> >> >> really interesting use case!
> >> >>> >>> >> >>
> >> >>> >>> >> >> In this mail I will try to give some suggestions on how this
> >> >>> >>> >> >> could work out. These suggestions are mainly based on
> >> >>> >>> >> >> experiences and lessons learned in the LIVE [2] project where
> >> >>> >>> >> >> we built an information system for the Olympic Games in Peking.
> >> >>> >>> >> >> While this project excluded the extraction of Events from
> >> >>> >>> >> >> unstructured text (because the Olympic Information System was
> >> >>> >>> >> >> already providing event data as XML messages), the semantic
> >> >>> >>> >> >> search capabilities of this system were very similar to the
> >> >>> >>> >> >> ones described by your use case.
> >> >>> >>> >> >>
> >> >>> >>> >> >> IMHO you are not only trying to extract relations, but a formal
> >> >>> >>> >> >> representation of the situation described by the text. So let's
> >> >>> >>> >> >> assume that the goal is to annotate a Setting (or Situation)
> >> >>> >>> >> >> described in the text - a fise:SettingAnnotation.
> >> >>> >>> >> >>
> >> >>> >>> >> >> The DOLCE foundational ontology [1] gives some advice on how to
> >> >>> >>> >> >> model those. The important relation for modeling this is
> >> >>> >>> >> >> Participation:
> >> >>> >>> >> >>
> >> >>> >>> >> >>     PC(x, y, t) → (ED(x) ∧ PD(y) ∧ T(t))
> >> >>> >>> >> >>
> >> >>> >>> >> >> where ..
> >> >>> >>> >> >>
> >> >>> >>> >> >>  * ED are Endurants (continuants): Endurants do have an identity so
> >> >>> >>> >> >> we would typically refer to them as Entities referenced by a
> >> >>> >>> >> >> setting. Note that this includes physical, non-physical as well as
> >> >>> >>> >> >> social-objects.
> >> >>> >>> >> >>  * PD are Perdurants (occurrents): Perdurants are entities that
> >> >>> >>> >> >> happen in time. This refers to Events, Activities ...
> >> >>> >>> >> >>  * PC is Participation: It is a time-indexed relation where
> >> >>> >>> >> >> Endurants participate in Perdurants.
> >> >>> >>> >> >>
> >> >>> >>> >> >> Modeling this in RDF requires defining some intermediate resources
> >> >>> >>> >> >> because RDF does not allow for n-ary relations.
> >> >>> >>> >> >>
> >> >>> >>> >> >>  * fise:SettingAnnotation: It is really handy to define one resource
> >> >>> >>> >> >> being the context for all described data. I would call this
> >> >>> >>> >> >> "fise:SettingAnnotation" and define it as a sub-concept of
> >> >>> >>> >> >> fise:Enhancement. All further enhancements about the extracted
> >> >>> >>> >> >> Setting would define a "fise:in-setting" relation to it.
> >> >>> >>> >> >>
> >> >>> >>> >> >>  * fise:ParticipantAnnotation: Is used to annotate that an Endurant is
> >> >>> >>> >> >> participating in a setting (fise:in-setting fise:SettingAnnotation).
> >> >>> >>> >> >> The Endurant itself is described by existing fise:TextAnnotation (the
> >> >>> >>> >> >> mentions) and fise:EntityAnnotation (suggested Entities). Basically
> >> >>> >>> >> >> the fise:ParticipantAnnotation will allow an EnhancementEngine to
> >> >>> >>> >> >> state that several mentions (in possibly different sentences) do
> >> >>> >>> >> >> represent the same Endurant as participating in the Setting. In
> >> >>> >>> >> >> addition it would be possible to use the dc:type property (similar
> >> >>> >>> >> >> as for fise:TextAnnotation) to refer to the role(s) of a participant
> >> >>> >>> >> >> (e.g. the set: Agent (intentionally performs an action), Cause
> >> >>> >>> >> >> (unintentionally, e.g. a mud slide), Patient (a passive role in an
> >> >>> >>> >> >> activity) and Instrument (aids a process)), but I am wondering if
> >> >>> >>> >> >> one could extract that information.
> >> >>> >>> >> >>
> >> >>> >>> >> >>  * fise:OccurrentAnnotation: is used to annotate a Perdurant in the
> >> >>> >>> >> >> context of the Setting. Also fise:OccurrentAnnotation can link to
> >> >>> >>> >> >> fise:TextAnnotation (typically verbs in the text defining the
> >> >>> >>> >> >> perdurant) as well as fise:EntityAnnotation suggesting well known
> >> >>> >>> >> >> Events in a knowledge base (e.g. an Election in a country, or an
> >> >>> >>> >> >> uprising ...). In addition fise:OccurrentAnnotation can define
> >> >>> >>> >> >> dc:has-participant links to fise:ParticipantAnnotation. In this case
> >> >>> >>> >> >> it is explicitly stated that an Endurant (the
> >> >>> >>> >> >> fise:ParticipantAnnotation) is involved in this Perdurant (the
> >> >>> >>> >> >> fise:OccurrentAnnotation). As Occurrences are temporally indexed this
> >> >>> >>> >> >> annotation should also support properties for defining the
> >> >>> >>> >> >> xsd:dateTime for the start/end.
> >> >>> >>> >> >>
> >> >>> >>> >> >>
> >> >>> >>> >> > Indeed, an event based data structure makes a lot of sense, with
> >> >>> >>> >> > the remark that you probably won't be able to always extract the
> >> >>> >>> >> > date for a given setting (situation).
> >> >>> >>> >> > There are 2 things which are unclear though.
> >> >>> >>> >> >
> >> >>> >>> >> > 1. Perdurant: You could have situations in which the object upon
> >> >>> >>> >> > which the Subject (or Endurant) is acting is not a transitory
> >> >>> >>> >> > object (such as an event, activity) but rather another Endurant.
> >> >>> >>> >> > For example we can have the phrase "USA invades Irak" where "USA"
> >> >>> >>> >> > is the Endurant (Subject) which performs the action of "invading"
> >> >>> >>> >> > on another Endurant, namely "Irak".
> >> >>> >>> >> >
> >> >>> >>> >>
> >> >>> >>> >> By using CAOS, USA would be the Agent and Iraq the Patient. Both are
> >> >>> >>> >> Endurants. The activity "invading" would be the Perdurant. So ideally
> >> >>> >>> >> you would have a "fise:SettingAnnotation" with:
> >> >>> >>> >>
> >> >>> >>> >>   * fise:ParticipantAnnotation for USA with the dc:type caos:Agent,
> >> >>> >>> >> linking to a fise:TextAnnotation for "USA" and a
> >> >>> >>> >> fise:EntityAnnotation linking to dbpedia:United_States
> >> >>> >>> >>   * fise:ParticipantAnnotation for Iraq with the dc:type caos:Patient,
> >> >>> >>> >> linking to a fise:TextAnnotation for "Irak" and a
> >> >>> >>> >> fise:EntityAnnotation linking to dbpedia:Iraq
> >> >>> >>> >>   * fise:OccurrentAnnotation for "invades" with the dc:type
> >> >>> >>> >> caos:Activity, linking to a fise:TextAnnotation for "invades"
> >> >>> >>> >>
> >> >>> >>> >> > 2. Where does the verb, which links the Subject and the Object, come
> >> >>> >>> >> > into this? I imagined that the Endurant would have a dc:"property"
> >> >>> >>> >> > where the property = verb which links to the Object in noun form. For
> >> >>> >>> >> > example take again the sentence "USA invades Irak". You would have the
> >> >>> >>> >> > "USA" Entity with dc:invader which points to the Object "Irak". The
> >> >>> >>> >> > Endurant would have as many dc:"property" elements as there are verbs
> >> >>> >>> >> > which link it to an Object.
> >> >>> >>> >>
> >> >>> >>> >> As explained above you would have a fise:OccurrentAnnotation that
> >> >>> >>> >> represents the Perdurant. The information that the activity mentioned
> >> >>> >>> >> in the text is "invades" would be expressed by linking to a
> >> >>> >>> >> fise:TextAnnotation. If you can also provide an Ontology for Tasks
> >> >>> >>> >> that defines "myTasks:invade" the fise:OccurrentAnnotation could also
> >> >>> >>> >> link to a fise:EntityAnnotation for this concept.
> >> >>> >>> >>
> >> >>> >>> >> best
> >> >>> >>> >> Rupert
> >> >>> >>> >>
> >> >>> >>> >> >
> >> >>> >>> >> >> ### Consuming the data:
> >> >>> >>> >> >>
> >> >>> >>> >> >> I think this model should be sufficient for use-cases as described by
> >> >>> >>> >> >> you.
> >> >>> >>> >> >>
> >> >>> >>> >> >> Users would be able to consume data on the setting level. This can be
> >> >>> >>> >> >> done by simply retrieving all fise:ParticipantAnnotation as well as
> >> >>> >>> >> >> fise:OccurrentAnnotation linked with a setting. BTW this was the
> >> >>> >>> >> >> approach used in LIVE [2] for semantic search. It allows queries for
> >> >>> >>> >> >> Settings that involve specific Entities, e.g. you could filter for
> >> >>> >>> >> >> Settings that involve a {Person}, activities:Arrested and a specific
> >> >>> >>> >> >> {Uprising}. However note that with this approach you will also get
> >> >>> >>> >> >> results for Settings where the {Person} participated and another
> >> >>> >>> >> >> person was arrested.
> >> >>> >>> >> >>
> >> >>> >>> >> >> Another possibility would be to process enhancement results on the
> >> >>> >>> >> >> fise:OccurrentAnnotation. This would allow a much higher level of
> >> >>> >>> >> >> granularity (e.g. it would allow to correctly answer the query used
> >> >>> >>> >> >> as an example above). But I am wondering if the quality of the
> >> >>> >>> >> >> Setting extraction will be sufficient for this. I also have doubts
> >> >>> >>> >> >> whether this can still be realized by using semantic indexing to
> >> >>> >>> >> >> Apache Solr or if it would be better/necessary to store results in a
> >> >>> >>> >> >> TripleStore and use SPARQL for retrieval.
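
The difference between the two retrieval levels discussed above can be illustrated with a small Python sketch. It is only an illustration under stated assumptions: the classes `Setting` and `Occurrent`, the two matching functions and the sample names "alice" and "bob" are hypothetical and are not part of Stanbol or LIVE. The participant set on each occurrent stands in for the dc:has-participant links.

```python
from dataclasses import dataclass, field


@dataclass
class Occurrent:
    """Stand-in for a fise:OccurrentAnnotation (hypothetical class)."""
    activity: str
    # Stand-in for the dc:has-participant links of the occurrent.
    participants: set = field(default_factory=set)


@dataclass
class Setting:
    """Stand-in for a fise:SettingAnnotation (hypothetical class)."""
    name: str
    participants: set
    occurrents: list


def match_setting_level(setting, person, activity):
    # Setting level: the person and the activity only have to occur
    # somewhere in the same setting.
    return person in setting.participants and any(
        o.activity == activity for o in setting.occurrents
    )


def match_occurrent_level(setting, person, activity):
    # Occurrent level: the person has to be linked to the very
    # occurrent that carries the activity.
    return any(
        o.activity == activity and person in o.participants
        for o in setting.occurrents
    )


# Made-up setting: "alice" took part in the uprising, "bob" was arrested.
uprising = Setting(
    name="uprising",
    participants={"alice", "bob"},
    occurrents=[
        Occurrent("arrested", {"bob"}),
        Occurrent("protested", {"alice"}),
    ],
)

print(match_setting_level(uprising, "alice", "arrested"))
print(match_occurrent_level(uprising, "alice", "arrested"))
```

The setting-level check matches "alice" together with "arrested" although it was "bob" who was arrested. The occurrent-level check rejects that combination, at the price of depending on correctly extracted participant links.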
> >> >>> >>> >> >>
> >> >>> >>> >> >> The methodology and query language used by YAGO [3] is also very
> >> >>> >>> >> >> relevant for this (especially note chapter 7 SPOTL(X) Representation).
> >> >>> >>> >> >>
> >> >>> >>> >> >> Another related topic is the enrichment of Entities (especially
> >> >>> >>> >> >> Events) in knowledge bases based on Settings extracted from Documents.
> >> >>> >>> >> >> As per definition - in DOLCE - Perdurants are temporally indexed. That
> >> >>> >>> >> >> means that at the time when added to a knowledge base they might still
> >> >>> >>> >> >> be in process. So the creation, enriching and refinement of such
> >> >>> >>> >> >> Entities in the knowledge base seems to be critical for a System
> >> >>> >>> >> >> like the one described in your use-case.
> >> >>> >>> >> >>
> >> >>> >>> >> >> On Tue, Jun 11, 2013 at 9:09 PM, Cristian Petroaca
> >> >>> >>> >> >> <[email protected]> wrote:
> >> >>> >>> >> >> >
> >> >>> >>> >> >> > First of all I have to mention that I am new in the field of semantic
> >> >>> >>> >> >> > technologies, I've started to read about them in the last 4-5 months.
> >> >>> >>> >> >> > Having said that, I have a high level overview of what is a good
> >> >>> >>> >> >> > approach to solve this problem. There are a number of papers on the
> >> >>> >>> >> >> > internet which describe what steps need to be taken such as: named
> >> >>> >>> >> >> > entity recognition, co-reference resolution, pos tagging and others.
> >> >>> >>> >> >>
> >> >>> >>> >> >> The Stanbol NLP processing module currently only supports sentence
> >> >>> >>> >> >> detection, tokenization, POS tagging, Chunking, NER and lemma. Support
> >> >>> >>> >> >> for co-reference resolution and dependency trees is currently missing.
> >> >>> >>> >> >>
> >> >>> >>> >> >> Stanford NLP is already integrated with Stanbol [4]. At the moment it
> >> >>> >>> >> >> only supports English, but I am already working on including the other
> >> >>> >>> >> >> supported languages. Other NLP frameworks that are already integrated
> >> >>> >>> >> >> with Stanbol are Freeling [5] and Talismane [6]. But note that for all
> >> >>> >>> >> >> those the integration excludes support for co-reference and dependency
> >> >>> >>> >> >> trees.
> >> >>> >>> >> >>
> >> >>> >>> >> >> Anyways I am confident that one can implement a first prototype by
> >> >>> >>> >> >> only using Sentences and POS tags and - if available - Chunks (e.g.
> >> >>> >>> >> >> Noun phrases).
> >> >>> >>> >> >>
> >> >>> >>> >> >>
> >> >>> >>> >> > I assume that in the Stanbol context, a feature like Relation
> >> >>> >>> >> > extraction would be implemented as an EnhancementEngine?
> >> >>> >>> >> > What kind of effort would be required for a co-reference resolution
> >> >>> >>> >> > tool integration into Stanbol?
> >> >>> >>> >> >
> >> >>> >>> >>
> >> >>> >>> >> Yes in the end it would be an EnhancementEngine. But before we can
> >> >>> >>> >> build such an engine we would need to
> >> >>> >>> >>
> >> >>> >>> >> * extend the Stanbol NLP processing API with Annotations for
> >> >>> >>> >> co-reference
> >> >>> >>> >> * add support for JSON Serialisation/Parsing for those annotations so
> >> >>> >>> >> that the RESTful NLP Analysis Service can provide co-reference
> >> >>> >>> >> information
> >> >>> >>> >>
> >> >>> >>> >> > At this moment I'll be focusing on 2 aspects:
> >> >>> >>> >> >
> >> >>> >>> >> > 1. Determine the best data structure to encapsulate the extracted
> >> >>> >>> >> > information. I'll take a closer look at Dolce.
> >> >>> >>> >>
> >> >>> >>> >> Don't make it too complex. Defining a proper structure to represent
> >> >>> >>> >> Events will only pay off if we can also successfully extract such
> >> >>> >>> >> information from processed texts.
> >> >>> >>> >>
> >> >>> >>> >> I would start with
> >> >>> >>> >>
> >> >>> >>> >>  * fise:SettingAnnotation
> >> >>> >>> >>     * {fise:Enhancement} metadata
> >> >>> >>> >>
> >> >>> >>> >>  * fise:ParticipantAnnotation
> >> >>> >>> >>     * {fise:Enhancement} metadata
> >> >>> >>> >>     * fise:inSetting {settingAnnotation}
> >> >>> >>> >>     * fise:hasMention {textAnnotation}
> >> >>> >>> >>     * fise:suggestion {entityAnnotation} (multiple if there are more
> >> >>> >>> >> suggestions)
> >> >>> >>> >>     * dc:type one of fise:Agent, fise:Patient, fise:Instrument,
> >> >>> >>> >> fise:Cause
> >> >>> >>> >>
> >> >>> >>> >>  * fise:OccurrentAnnotation
> >> >>> >>> >>     * {fise:Enhancement} metadata
> >> >>> >>> >>     * fise:inSetting {settingAnnotation}
> >> >>> >>> >>     * fise:hasMention {textAnnotation}
> >> >>> >>> >>     * dc:type set to fise:Activity
> >> >>> >>> >>
> >> >>> >>> >> If it turns out that we can extract more, we can add more structure to
> >> >>> >>> >> those annotations. We might also think about using a separate namespace
> >> >>> >>> >> for those extensions to the annotation structure.
> >> >>> >>> >>
> >> >>> >>> >> > 2. Determine how should all of this be integrated into Stanbol.
> >> >>> >>> >>
> >> >>> >>> >> Just create an EventExtractionEngine and configure an enhancement chain
> >> >>> >>> >> that does NLP processing and EntityLinking.
> >> >>> >>> >>
> >> >>> >>> >> You should have a look at
> >> >>> >>> >>
> >> >>> >>> >> * SentimentSummarizationEngine [1] as it does a lot of things with NLP
> >> >>> >>> >> processing results (e.g. connecting adjectives (via verbs) to
> >> >>> >>> >> nouns/pronouns). So as long as we cannot use explicit dependency trees
> >> >>> >>> >> your code will need to do similar things with Nouns, Pronouns and
> >> >>> >>> >> Verbs.
> >> >>> >>> >>
> >> >>> >>> >> * Disambiguation-MLT engine, as it creates a Java representation of
> >> >>> >>> >> present fise:TextAnnotation and fise:EntityAnnotation [2]. Something
> >> >>> >>> >> similar will also be required by the EventExtractionEngine for fast
> >> >>> >>> >> access to such annotations while iterating over the Sentences of the
> >> >>> >>> >> text.
> >> >>> >>> >>
> >> >>> >>> >>
> >> >>> >>> >> best
> >> >>> >>> >> Rupert
> >> >>> >>> >>
> >> >>> >>> >> [1] https://svn.apache.org/repos/asf/stanbol/trunk/enhancement-engines/sentiment-summarization/src/main/java/org/apache/stanbol/enhancer/engines/sentiment/summarize/SentimentSummarizationEngine.java
> >> >>> >>> >> [2] https://svn.apache.org/repos/asf/stanbol/trunk/enhancement-engines/disambiguation-mlt/src/main/java/org/apache/stanbol/enhancer/engine/disambiguation/mlt/DisambiguationData.java
> >> >>> >>> >>
> >> >>> >>> >> >
> >> >>> >>> >> > Thanks
> >> >>> >>> >> >
> >> >>> >>> >> >> Hope this helps to bootstrap this discussion
> >> >>> >>> >> >> best
> >> >>> >>> >> >> Rupert
> >> >>> >>> >> >>
> >> >>> >>> >> >> --
> >> >>> >>> >> >> | Rupert Westenthaler             [email protected]
> >> >>> >>> >> >> | Bodenlehenstraße 11                             ++43-699-11108907
> >> >>> >>> >> >> | A-5500 Bischofshofen
> >> >>> >>> >> >>
> >> >>> >>> >>
> >> >>> >>> >>
> >> >>> >>> >>
> >> >>> >>> >>
> >> >>> >>>
> >> >>> >>>
> >> >>> >>>
> >> >>> >>>
> >> >>> >>
> >> >>> >>
> >> >>>
> >> >>>
> >> >>>
> >> >>>
> >> >>
> >> >>
> >>
> >>
> >>
> >>
> >
> >
>
