Hi Cristian,

In fact I missed it. Sorry for that.
I think the revised proposal looks like a good start. Usually one needs to
make some adaptations when writing the actual code. If you have a first
version, attach it to an issue and I will commit it to the branch.

best
Rupert

On Thu, Sep 12, 2013 at 9:04 AM, Cristian Petroaca
<cristian.petro...@gmail.com> wrote:
> Hi Rupert,
>
> This is a reminder in case you missed this e-mail.
>
> Cristian
>
>
> 2013/9/3 Cristian Petroaca <cristian.petro...@gmail.com>
>
>> Ok, then to sum it up we would have :
>>
>> 1. Coref
>>
>> "stanbol.enhancer.nlp.coref" : {
>>   "isRepresentative" : true/false, // whether this token or chunk is the
>>                                    // representative mention in the chain
>>   "mentions" : [ {
>>       "type" : "Token", // type of element which refers to this token/chunk
>>       "start" : 123,    // start index of the mentioning element
>>       "end" : 130       // end index of the mentioning element
>>     }, ...
>>   ],
>>   "class" : "org.apache.stanbol.enhancer.nlp.coref.CorefTag"
>> }
>>
>>
>> 2. Dependency tree
>>
>> "stanbol.enhancer.nlp.dependency" : {
>>   "relations" : [ {
>>       "tag" : "nsubj",    // type of relation - Stanford NLP notation
>>       "dep" : 12,         // type of relation - Stanbol NLP mapped value -
>>                           // ordinal number in the Dependency enum
>>       "role" : "gov/dep", // whether this token is the depender or the dependee
>>       "type" : "Token",   // type of element with which this token is in relation
>>       "start" : 123,      // start index of the related token
>>       "end" : 130         // end index of the related token
>>     },
>>     ...
>>   ],
>>   "class" : "org.apache.stanbol.enhancer.nlp.dependency.DependencyTag"
>> }
>>
>>
>> 2013/9/2 Rupert Westenthaler <rupert.westentha...@gmail.com>
>>
>>> Hi Cristian,
>>>
>>> let me provide some feedback on your proposals:
>>>
>>> ### Referring to other Spans
>>>
>>> Both suggested annotations require linking to other spans (Sentence,
>>> Chunk or Token). For that we should introduce a JSON element used for
>>> referring to those elements and use it for all such references.
>>>
>>> In the Java model this would allow you to have a reference to the
>>> other Span (Sentence, Chunk, Token). In the serialized form you would
>>> have JSON elements with the "type", "start" and "end" attributes, as
>>> those three uniquely identify any span.
>>>
>>> Here is an example based on the "mentions" attribute as defined by the
>>> proposed "org.apache.stanbol.enhancer.nlp.coref.CorefTag":
>>>
>>> ...
>>> "mentions" : [ {
>>>     "type" : "Token",
>>>     "start" : 123,
>>>     "end" : 130
>>>   }, {
>>>     "type" : "Token",
>>>     "start" : 157,
>>>     "end" : 165
>>>   } ],
>>> ...
>>>
>>> Similar token links in
>>> "org.apache.stanbol.enhancer.nlp.dependency.DependencyTag" should also
>>> use this model.
>>>
>>> ### Usage of Controlled Vocabularies
>>>
>>> In addition, the DependencyTag also seems to use a controlled
>>> vocabulary (e.g. 'nsubj', 'conj_and' ...). In such cases the Stanbol
>>> NLP module tries to define those in some kind of Ontology. For POS
>>> tags we use the OLIA ontology [1]. This is important as most NLP
>>> frameworks will use different strings, and we need to unify those to
>>> common IDs so that components that consume this data do not depend on
>>> a specific NLP tool.
>>>
>>> Because the usage of Ontologies within Java is not well supported, the
>>> Stanbol NLP module defines Java Enumerations for those Ontologies, such
>>> as the POS type enumeration [2].
>>>
>>> Both the Java model and the JSON serialization support both
>>> (1) the lexical tag as used by the NLP tool and (2) the mapped
>>> concept - in the Java API via two different methods and in the JSON
>>> serialization via two separate keys.
>>>
>>> To make this clearer, here is an example of a POS annotation for a
>>> proper noun.
>>>
>>> "stanbol.enhancer.nlp.pos" : {
>>>   "tag" : "PN",
>>>   "pos" : 53,
>>>   "class" : "org.apache.stanbol.enhancer.nlp.pos.PosTag",
>>>   "prob" : 0.95
>>> }
>>>
>>> where
>>>
>>>   "tag" : "PN"
>>>
>>> is the lexical form as used by the NLP tool and
>>>
>>>   "pos" : 53
>>>
>>> refers to the ordinal number of the entry "ProperNoun" in the POS
>>> enumeration.
>>>
>>> IMO the "type" property of DependencyTag should use a similar design.
>>>
>>> best
>>> Rupert
>>>
>>> [1] http://olia.nlp2rdf.org/
>>> [2] http://svn.apache.org/repos/asf/stanbol/trunk/enhancer/generic/nlp/src/main/java/org/apache/stanbol/enhancer/nlp/pos/Pos.java
>>>
>>> On Sun, Sep 1, 2013 at 8:09 PM, Cristian Petroaca
>>> <cristian.petro...@gmail.com> wrote:
>>> > Sorry, pressed send too soon :).
>>> >
>>> > Continued :
>>> >
>>> > nsubj(met-4, Mary-1), conj_and(Mary-1, Tom-3), nsubj(met-4, Tom-3),
>>> > root(ROOT-0, met-4), nn(today-6, Danny-5), tmod(met-4, today-6)]
>>> >
>>> > Given this, we can have for each "Token" an additional dependency
>>> > annotation :
>>> >
>>> > "stanbol.enhancer.nlp.dependency" : {
>>> >   "tag" : // is it necessary?
>>> >   "relations" : [ {
>>> >       "type" : "nsubj",   // type of relation
>>> >       "role" : "gov/dep", // whether it is the depender or the dependee
>>> >       "dependencyValue" : "met", // the word with which the token has a relation
>>> >       "dependencyIndexInSentence" : "2" // the index of the dependency in the
>>> >                                         // current sentence
>>> >     },
>>> >     ...
>>> > ] >>> > "class" : >>> > "org.apache.stanbol.enhancer.nlp.dependency.DependencyTag" >>> > } >>> > >>> > 2013/9/1 Cristian Petroaca <cristian.petro...@gmail.com> >>> > >>> >> Related to the Stanford Dependency Tree Feature, this is the way the >>> >> output from the tool looks like for this sentence : "Mary and Tom met >>> Danny >>> >> today" : >>> >> >>> >> >>> >> 2013/8/30 Cristian Petroaca <cristian.petro...@gmail.com> >>> >> >>> >>> Hi Rupert, >>> >>> >>> >>> Ok, so after looking at the JSON output from the Stanford NLP Server >>> and >>> >>> the coref module I'm thinking I can represent the coreference >>> information >>> >>> this way: >>> >>> Each "Token" or "Chunk" will contain an additional coref annotation >>> with >>> >>> the following structure : >>> >>> >>> >>> "stanbol.enhancer.nlp.coref" { >>> >>> "tag" : //does this need to exist? >>> >>> "isRepresentative" : true/false, // whether this token or chunk is >>> >>> the representative mention in the chain >>> >>> "mentions" : [ { "sentenceNo" : 1 //the sentence in which the >>> mention >>> >>> is found >>> >>> "startWord" : 2 //the first word making up >>> the >>> >>> mention >>> >>> "endWord" : 3 //the last word making up the >>> >>> mention >>> >>> }, ... >>> >>> ], >>> >>> "class" : ""class" : >>> "org.apache.stanbol.enhancer.nlp.coref.CorefTag" >>> >>> } >>> >>> >>> >>> The CorefTag should resemble this model. >>> >>> >>> >>> What do you think? >>> >>> >>> >>> Cristian >>> >>> >>> >>> >>> >>> 2013/8/24 Rupert Westenthaler <rupert.westentha...@gmail.com> >>> >>> >>> >>>> Hi Cristian, >>> >>>> >>> >>>> you can not directly call StanfordNLP components from Stanbol, but >>> you >>> >>>> have to extend the RESTful service to include the information you >>> >>>> need. The main reason for that is that the license of StanfordNLP is >>> >>>> not compatible with the Apache Software License. So Stanbol can not >>> >>>> directly link to the StanfordNLP API. 
>>> >>>>
>>> >>>> You will need to
>>> >>>>
>>> >>>> 1. define an additional class {yourTag} extends Tag<{yourType}>
>>> >>>>    in the o.a.s.enhancer.nlp module
>>> >>>> 2. add JSON parsing and serialization support for this tag to the
>>> >>>>    o.a.s.enhancer.nlp.json module (see e.g. PosTagSupport as an example)
>>> >>>>
>>> >>>> As (1) would be necessary anyway, the only additional thing you need
>>> >>>> to develop is (2). After that you can add {yourTag} instances to the
>>> >>>> AnalyzedText in the StanfordNLP integration. The
>>> >>>> RestfulNlpAnalysisEngine will parse them from the response. All
>>> >>>> engines executed after the RestfulNlpAnalysisEngine will have access
>>> >>>> to your annotations.
>>> >>>>
>>> >>>> If you have a design for {yourTag} - the model you would like to use
>>> >>>> to represent your data - I can help with (1) and (2).
>>> >>>>
>>> >>>> best
>>> >>>> Rupert
>>> >>>>
>>> >>>>
>>> >>>> On Fri, Aug 23, 2013 at 5:11 PM, Cristian Petroaca
>>> >>>> <cristian.petro...@gmail.com> wrote:
>>> >>>> > Hi Rupert,
>>> >>>> >
>>> >>>> > Thanks for the info. Looking at the stanbol-stanfordnlp project I
>>> >>>> > see that the Stanford NLP integration is not implemented as an
>>> >>>> > EnhancementEngine but rather is used directly in a Jetty Server
>>> >>>> > instance. How does that fit into the Stanbol stack? For example,
>>> >>>> > how can I call the StanfordNlpAnalyzer's routine from my
>>> >>>> > TripleExtractionEnhancementEngine which lives in the Stanbol stack?
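[As a rough sketch of what step (1) above could look like in code: the Tag base class below is a simplified stand-in for the real tag base class in the o.a.s.enhancer.nlp module, and the CorefTag/SpanRef names and fields simply follow the JSON proposal from this thread. None of this is the actual Stanbol API.]

```java
import java.util.ArrayList;
import java.util.List;

/** Simplified stand-in for the Stanbol Tag base class. */
abstract class Tag<T extends Tag<T>> {
    private final String tag;
    protected Tag(String tag) { this.tag = tag; }
    public String getTag() { return tag; }
}

/** Reference to another span; "type" + "start"/"end" uniquely identify it. */
class SpanRef {
    final String type; final int start; final int end;
    SpanRef(String type, int start, int end) {
        this.type = type; this.start = start; this.end = end;
    }
}

/** Hypothetical coreference tag following the proposed model. */
class CorefTag extends Tag<CorefTag> {
    private final boolean representative;
    private final List<SpanRef> mentions = new ArrayList<>();
    CorefTag(boolean representative) {
        super("coref"); // lexical tag; possibly unused for coref
        this.representative = representative;
    }
    boolean isRepresentative() { return representative; }
    void addMention(SpanRef mention) { mentions.add(mention); }
    List<SpanRef> getMentions() { return mentions; }
}

public class CorefTagSketch {
    public static void main(String[] args) {
        CorefTag tag = new CorefTag(true);
        tag.addMention(new SpanRef("Token", 123, 130));
        System.out.println(tag.isRepresentative() + " " + tag.getMentions().size());
        // prints: true 1
    }
}
```

[The serialization support of step (2) would then only need to walk the mentions list and emit the "type"/"start"/"end" span references.]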
>>> >>>> >
>>> >>>> > Thanks,
>>> >>>> > Cristian
>>> >>>> >
>>> >>>> >
>>> >>>> > 2013/8/12 Rupert Westenthaler <rupert.westentha...@gmail.com>
>>> >>>> >
>>> >>>> >> Hi Cristian,
>>> >>>> >>
>>> >>>> >> Sorry for the late response, but I was offline for the last two weeks.
>>> >>>> >>
>>> >>>> >> On Fri, Aug 2, 2013 at 9:19 PM, Cristian Petroaca
>>> >>>> >> <cristian.petro...@gmail.com> wrote:
>>> >>>> >> > Hi Rupert,
>>> >>>> >> >
>>> >>>> >> > After doing some tests it seems that the Stanford NLP coreference
>>> >>>> >> > module is much more accurate than the Open NLP one. So I decided to
>>> >>>> >> > extend Stanford NLP to add coreference there.
>>> >>>> >>
>>> >>>> >> The Stanford NLP integration is not part of the Stanbol codebase
>>> >>>> >> because the licenses are not compatible.
>>> >>>> >>
>>> >>>> >> You can find the Stanford NLP integration at
>>> >>>> >>
>>> >>>> >> https://github.com/westei/stanbol-stanfordnlp
>>> >>>> >>
>>> >>>> >> Just create a fork and send pull requests.
>>> >>>> >>
>>> >>>> >> > Could you add the necessary projects to the branch? And also remove
>>> >>>> >> > the Open NLP ones?
>>> >>>> >>
>>> >>>> >> Currently the branch
>>> >>>> >>
>>> >>>> >> http://svn.apache.org/repos/asf/stanbol/branches/nlp-dep-tree-and-co-ref/
>>> >>>> >>
>>> >>>> >> only contains the "nlp" and the "nlp-json" modules. IMO those should
>>> >>>> >> be enough for adding coreference support.
>>> >>>> >>
>>> >>>> >> IMO you will need to
>>> >>>> >>
>>> >>>> >> * add a model for representing coreference to the nlp module
>>> >>>> >> * add parsing and serializing support to the nlp-json module
>>> >>>> >> * add the implementation to your fork of the stanbol-stanfordnlp project
>>> >>>> >>
>>> >>>> >> best
>>> >>>> >> Rupert
>>> >>>> >>
>>> >>>> >>
>>> >>>> >> > Thanks,
>>> >>>> >> > Cristian
>>> >>>> >> >
>>> >>>> >> >
>>> >>>> >> > 2013/7/5 Rupert Westenthaler <rupert.westentha...@gmail.com>
>>> >>>> >> >
>>> >>>> >> >> Hi Cristian,
>>> >>>> >> >>
>>> >>>> >> >> I created the branch at
>>> >>>> >> >>
>>> >>>> >> >> http://svn.apache.org/repos/asf/stanbol/branches/nlp-dep-tree-and-co-ref/
>>> >>>> >> >>
>>> >>>> >> >> ATM it contains only the "nlp" and "nlp-json" modules. Let me know
>>> >>>> >> >> if you would like to have more.
>>> >>>> >> >>
>>> >>>> >> >> best
>>> >>>> >> >> Rupert
>>> >>>> >> >>
>>> >>>> >> >>
>>> >>>> >> >> On Thu, Jul 4, 2013 at 10:14 AM, Cristian Petroaca
>>> >>>> >> >> <cristian.petro...@gmail.com> wrote:
>>> >>>> >> >> > Hi Rupert,
>>> >>>> >> >> >
>>> >>>> >> >> > I created JIRAs https://issues.apache.org/jira/browse/STANBOL-1132
>>> >>>> >> >> > and https://issues.apache.org/jira/browse/STANBOL-1133. The
>>> >>>> >> >> > original one is dependent upon these.
>>> >>>> >> >> > Please let me know when I can start using the branch.
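[To illustrate the "parsing and serializing support" item above, here is a standalone sketch of serializing the proposed "mentions" span references. The real nlp-json module uses a Jackson-based tag support class; this version uses only the JDK to show the intended wire format, and the class and method names are illustrative only.]

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

public class MentionJson {
    /** A span reference: "type" plus "start"/"end" uniquely identify a span. */
    static final class SpanRef {
        final String type; final int start; final int end;
        SpanRef(String type, int start, int end) {
            this.type = type; this.start = start; this.end = end;
        }
    }

    /** Serialize span references as the proposed "mentions" JSON array. */
    static String toMentionsJson(List<SpanRef> mentions) {
        return mentions.stream()
            .map(m -> String.format(
                "{\"type\":\"%s\",\"start\":%d,\"end\":%d}", m.type, m.start, m.end))
            .collect(Collectors.joining(",", "\"mentions\":[", "]"));
    }

    public static void main(String[] args) {
        List<SpanRef> mentions = Arrays.asList(
            new SpanRef("Token", 123, 130),
            new SpanRef("Token", 157, 165));
        System.out.println(toMentionsJson(mentions));
        // prints: "mentions":[{"type":"Token","start":123,"end":130},{"type":"Token","start":157,"end":165}]
    }
}
```

[Parsing would be the mirror image: read "type", "start" and "end" and resolve them against the spans of the AnalyzedText.]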
>>> >>>> >> >> >
>>> >>>> >> >> > Thanks,
>>> >>>> >> >> > Cristian
>>> >>>> >> >> >
>>> >>>> >> >> >
>>> >>>> >> >> > 2013/6/27 Cristian Petroaca <cristian.petro...@gmail.com>
>>> >>>> >> >> >
>>> >>>> >> >> >> 2013/6/27 Rupert Westenthaler <rupert.westentha...@gmail.com>
>>> >>>> >> >> >>
>>> >>>> >> >> >>> On Thu, Jun 27, 2013 at 3:12 PM, Cristian Petroaca
>>> >>>> >> >> >>> <cristian.petro...@gmail.com> wrote:
>>> >>>> >> >> >>> > Sorry, I meant the Stanbol NLP API, not Stanford, in my previous
>>> >>>> >> >> >>> > e-mail. By the way, does Open NLP have the ability to build
>>> >>>> >> >> >>> > dependency trees?
>>> >>>> >> >> >>>
>>> >>>> >> >> >>> AFAIK OpenNLP does not provide this feature.
>>> >>>> >> >> >>
>>> >>>> >> >> >> Then, since the Stanford NLP lib is also integrated into Stanbol,
>>> >>>> >> >> >> I'll take a look at how I can extend its integration to include
>>> >>>> >> >> >> the dependency tree feature.
>>> >>>> >> >> >>
>>> >>>> >> >> >>> > 2013/6/23 Cristian Petroaca <cristian.petro...@gmail.com>
>>> >>>> >> >> >>> >
>>> >>>> >> >> >>> >> Hi Rupert,
>>> >>>> >> >> >>> >>
>>> >>>> >> >> >>> >> I created jira https://issues.apache.org/jira/browse/STANBOL-1121.
>>> >>>> >> >> >>> >> As you suggested, I would start with extending the Stanford NLP
>>> >>>> >> >> >>> >> with co-reference resolution, but I think also with dependency
>>> >>>> >> >> >>> >> trees, because I also need to know the Subject of the sentence
>>> >>>> >> >> >>> >> and the object that it affects, right?
>>> >>>> >> >> >>> >>
>>> >>>> >> >> >>> >> Given that I need to extend the Stanford NLP API in Stanbol for
>>> >>>> >> >> >>> >> co-reference and dependency trees, how do I proceed with this?
>>> >>>> >> >> >>> >> Do I create 2 new sub-tasks on the already opened Jira? After
>>> >>>> >> >> >>> >> that, can I start implementing on my local copy of Stanbol, and
>>> >>>> >> >> >>> >> when I'm done I'll send you guys the patch for review?
>>> >>>> >> >> >>>
>>> >>>> >> >> >>> I would create two "New Feature" type issues, one for adding support
>>> >>>> >> >> >>> for "dependency trees" and the other for "co-reference" support. You
>>> >>>> >> >> >>> should also define "depends on" relations between STANBOL-1121 and
>>> >>>> >> >> >>> those two new issues.
>>> >>>> >> >> >>>
>>> >>>> >> >> >>> Sub-tasks could also work, but as adding those features would also
>>> >>>> >> >> >>> be interesting for other things I would rather define them as
>>> >>>> >> >> >>> separate issues.
>>> >>>> >> >> >>
>>> >>>> >> >> >> 2 New Features connected with the original jira it is then.
>>> >>>> >> >> >>
>>> >>>> >> >> >>> If you would prefer to work in an own branch please tell me. This
>>> >>>> >> >> >>> could have the advantage that patches would not be affected by
>>> >>>> >> >> >>> changes in the trunk.
>>> >>>> >> >> >>
>>> >>>> >> >> >> Yes, a separate branch sounds good.
>>> >>>> >> >> >>
>>> >>>> >> >> >>> best
>>> >>>> >> >> >>> Rupert
>>> >>>> >> >> >>>
>>> >>>> >> >> >>> >> Regards,
>>> >>>> >> >> >>> >> Cristian
>>> >>>> >> >> >>> >>
>>> >>>> >> >> >>> >>
>>> >>>> >> >> >>> >> 2013/6/18 Rupert Westenthaler <rupert.westentha...@gmail.com>
>>> >>>> >> >> >>> >>
>>> >>>> >> >> >>> >>> On Mon, Jun 17, 2013 at 10:18 PM, Cristian Petroaca
>>> >>>> >> >> >>> >>> <cristian.petro...@gmail.com> wrote:
>>> >>>> >> >> >>> >>> > Hi Rupert,
>>> >>>> >> >> >>> >>> >
>>> >>>> >> >> >>> >>> > Agreed on the
>>> >>>> >> >> >>> >>> > SettingAnnotation/ParticipantAnnotation/OccurrentAnnotation
>>> >>>> >> >> >>> >>> > data structure.
>>> >>>> >> >> >>> >>> >
>>> >>>> >> >> >>> >>> > Should I open up a Jira for all of this in order to encapsulate
>>> >>>> >> >> >>> >>> > this information and establish the goals and the initial steps
>>> >>>> >> >> >>> >>> > towards these goals?
>>> >>>> >> >> >>> >>>
>>> >>>> >> >> >>> >>> Yes please. A JIRA issue for this work would be great.
>>> >>>> >> >> >>> >>>
>>> >>>> >> >> >>> >>> > How should I proceed further? Should I create some design
>>> >>>> >> >> >>> >>> > documents that need to be reviewed?
>>> >>>> >> >> >>> >>>
>>> >>>> >> >> >>> >>> Usually it is best to write design-related text directly in JIRA
>>> >>>> >> >> >>> >>> using Markdown [1] syntax. This will allow us later to use this
>>> >>>> >> >> >>> >>> text directly for the documentation on the Stanbol webpage.
>>> >>>> >> >> >>> >>>
>>> >>>> >> >> >>> >>> best
>>> >>>> >> >> >>> >>> Rupert
>>> >>>> >> >> >>> >>>
>>> >>>> >> >> >>> >>> [1] http://daringfireball.net/projects/markdown/
>>> >>>> >> >> >>> >>>
>>> >>>> >> >> >>> >>> > Regards,
>>> >>>> >> >> >>> >>> > Cristian
>>> >>>> >> >> >>> >>> >
>>> >>>> >> >> >>> >>> >
>>> >>>> >> >> >>> >>> > 2013/6/17 Rupert Westenthaler <rupert.westentha...@gmail.com>
>>> >>>> >> >> >>> >>> >
>>> >>>> >> >> >>> >>> >> On Thu, Jun 13, 2013 at 8:22 PM, Cristian Petroaca
>>> >>>> >> >> >>> >>> >> <cristian.petro...@gmail.com> wrote:
>>> >>>> >> >> >>> >>> >> > Hi Rupert,
>>> >>>> >> >> >>> >>> >> >
>>> >>>> >> >> >>> >>> >> > First of all, thanks for the detailed suggestions.
>>> >>>> >> >> >>> >>> >> >
>>> >>>> >> >> >>> >>> >> > 2013/6/12 Rupert Westenthaler <rupert.westentha...@gmail.com>
>>> >>>> >> >> >>> >>> >> >
>>> >>>> >> >> >>> >>> >> >> Hi Cristian, all
>>> >>>> >> >> >>> >>> >> >>
>>> >>>> >> >> >>> >>> >> >> really interesting use case!
>>> >>>> >> >> >>> >>> >> >>
>>> >>>> >> >> >>> >>> >> >> In this mail I will try to give some suggestions on how this
>>> >>>> >> >> >>> >>> >> >> could work out. These suggestions are mainly based on
>>> >>>> >> >> >>> >>> >> >> experiences and lessons learned in the LIVE [2] project, where
>>> >>>> >> >> >>> >>> >> >> we built an information system for the Olympic Games in Peking.
>>> >>>> >> >> >>> >>> >> >> While this project excluded the extraction of Events from
>>> >>>> >> >> >>> >>> >> >> unstructured text (because the Olympic Information System was
>>> >>>> >> >> >>> >>> >> >> already providing event data as XML messages), the semantic
>>> >>>> >> >> >>> >>> >> >> search capabilities of this system were very similar to the
>>> >>>> >> >> >>> >>> >> >> ones described by your use case.
>>> >>>> >> >> >>> >>> >> >>
>>> >>>> >> >> >>> >>> >> >> IMHO you are not only trying to extract relations, but a formal
>>> >>>> >> >> >>> >>> >> >> representation of the situation described by the text. So let's
>>> >>>> >> >> >>> >>> >> >> assume that the goal is to annotate a Setting (or Situation)
>>> >>>> >> >> >>> >>> >> >> described in the text - a fise:SettingAnnotation.
>>> >>>> >> >> >>> >>> >> >>
>>> >>>> >> >> >>> >>> >> >> The DOLCE foundational ontology [1] gives some advice on how to
>>> >>>> >> >> >>> >>> >> >> model those. The important relation for modeling this is
>>> >>>> >> >> >>> >>> >> >> Participation:
>>> >>>> >> >> >>> >>> >> >>
>>> >>>> >> >> >>> >>> >> >> PC(x, y, t) → (ED(x) ∧ PD(y) ∧ T(t))
>>> >>>> >> >> >>> >>> >> >>
>>> >>>> >> >> >>> >>> >> >> where
>>> >>>> >> >> >>> >>> >> >>
>>> >>>> >> >> >>> >>> >> >> * ED are Endurants (continuants): Endurants do have an
>>> >>>> >> >> >>> >>> >> >> identity, so we would typically refer to them as Entities
>>> >>>> >> >> >>> >>> >> >> referenced by a setting. Note that this includes physical and
>>> >>>> >> >> >>> >>> >> >> non-physical as well as social objects.
>>> >>>> >> >> >>> >>> >> >> * PD are Perdurants (occurrents): Perdurants are entities that
>>> >>>> >> >> >>> >>> >> >> happen in time. This refers to Events, Activities ...
>>> >>>> >> >> >>> >>> >> >> * PC is Participation: a time-indexed relation where Endurants
>>> >>>> >> >> >>> >>> >> >> participate in Perdurants.
>>> >>>> >> >> >>> >>> >> >>
>>> >>>> >> >> >>> >>> >> >> Modeling this in RDF requires defining some intermediate
>>> >>>> >> >> >>> >>> >> >> resources, because RDF does not allow for n-ary relations.
>>> >>>> >> >> >>> >>> >> >>
>>> >>>> >> >> >>> >>> >> >> * fise:SettingAnnotation: It is really handy to define one
>>> >>>> >> >> >>> >>> >> >> resource being the context for all described data. I would call
>>> >>>> >> >> >>> >>> >> >> this "fise:SettingAnnotation" and define it as a sub-concept of
>>> >>>> >> >> >>> >>> >> >> fise:Enhancement. All further enhancements about the extracted
>>> >>>> >> >> >>> >>> >> >> Setting would define a "fise:in-setting" relation to it.
>>> >>>> >> >> >>> >>> >> >>
>>> >>>> >> >> >>> >>> >> >> * fise:ParticipantAnnotation: Is used to annotate that an
>>> >>>> >> >> >>> >>> >> >> Endurant is participating in a setting (fise:in-setting
>>> >>>> >> >> >>> >>> >> >> fise:SettingAnnotation). The Endurant itself is described by
>>> >>>> >> >> >>> >>> >> >> existing fise:TextAnnotation (the mentions) and
>>> >>>> >> >> >>> >>> >> >> fise:EntityAnnotation (suggested Entities).
>>> >>>> >> >> >>> >>> >> >> Basically the fise:ParticipantAnnotation will allow an
>>> >>>> >> >> >>> >>> >> >> EnhancementEngine to state that several mentions (in possibly
>>> >>>> >> >> >>> >>> >> >> different sentences) represent the same Endurant as
>>> >>>> >> >> >>> >>> >> >> participating in the Setting. In addition it would be possible
>>> >>>> >> >> >>> >>> >> >> to use the dc:type property (similar as for fise:TextAnnotation)
>>> >>>> >> >> >>> >>> >> >> to refer to the role(s) of a participant (e.g. the set: Agent
>>> >>>> >> >> >>> >>> >> >> (intentionally performs an action), Cause (unintentionally,
>>> >>>> >> >> >>> >>> >> >> e.g. a mud slide), Patient (a passive role in an activity) and
>>> >>>> >> >> >>> >>> >> >> Instrument (aids a process)), but I am wondering if one could
>>> >>>> >> >> >>> >>> >> >> extract that information.
>>> >>>> >> >> >>> >>> >> >>
>>> >>>> >> >> >>> >>> >> >> * fise:OccurrentAnnotation: is used to annotate a Perdurant in
>>> >>>> >> >> >>> >>> >> >> the context of the Setting. Also, fise:OccurrentAnnotation can
>>> >>>> >> >> >>> >>> >> >> link to fise:TextAnnotation (typically verbs in the text
>>> >>>> >> >> >>> >>> >> >> defining the perdurant) as well as fise:EntityAnnotation
>>> >>>> >> >> >>> >>> >> >> suggesting well-known Events in a knowledge base (e.g. an
>>> >>>> >> >> >>> >>> >> >> Election in a country, or an uprising ...).
>>> >>>> >> >> >>> >>> >> >> In addition, fise:OccurrentAnnotation can define
>>> >>>> >> >> >>> >>> >> >> dc:has-participant links to fise:ParticipantAnnotation. In this
>>> >>>> >> >> >>> >>> >> >> case it is explicitly stated that an Endurant (the
>>> >>>> >> >> >>> >>> >> >> fise:ParticipantAnnotation) is involved in this Perdurant (the
>>> >>>> >> >> >>> >>> >> >> fise:OccurrentAnnotation). As Occurrences are temporally
>>> >>>> >> >> >>> >>> >> >> indexed, this annotation should also support properties for
>>> >>>> >> >> >>> >>> >> >> defining the xsd:dateTime for the start/end.
>>> >>>> >> >> >>> >>> >> >
>>> >>>> >> >> >>> >>> >> > Indeed, an event-based data structure makes a lot of sense,
>>> >>>> >> >> >>> >>> >> > with the remark that you probably won't be able to always
>>> >>>> >> >> >>> >>> >> > extract the date for a given setting (situation).
>>> >>>> >> >> >>> >>> >> > There are 2 things which are unclear though.
>>> >>>> >> >> >>> >>> >> >
>>> >>>> >> >> >>> >>> >> > 1. Perdurant : You could have situations in which the object
>>> >>>> >> >> >>> >>> >> > upon which the Subject (or Endurant) is acting is not a
>>> >>>> >> >> >>> >>> >> > transitory object (such as an event or activity) but rather
>>> >>>> >> >> >>> >>> >> > another Endurant.
>>> >>>> >> >> >>> >>> >> > For example we can have the phrase "USA invades Irak", where
>>> >>>> >> >> >>> >>> >> > "USA" is the Endurant ( Subject ) which performs the action of
>>> >>>> >> >> >>> >>> >> > "invading" on another Endurant, namely "Irak".
>>> >>>> >> >> >>> >>> >>
>>> >>>> >> >> >>> >>> >> By using CAOS, USA would be the Agent and Iraq the Patient. Both
>>> >>>> >> >> >>> >>> >> are Endurants. The activity "invading" would be the Perdurant.
>>> >>>> >> >> >>> >>> >> So ideally you would have a "fise:SettingAnnotation" with:
>>> >>>> >> >> >>> >>> >>
>>> >>>> >> >> >>> >>> >> * fise:ParticipantAnnotation for USA with the dc:type caos:Agent,
>>> >>>> >> >> >>> >>> >> linking to a fise:TextAnnotation for "USA" and a
>>> >>>> >> >> >>> >>> >> fise:EntityAnnotation linking to dbpedia:United_States
>>> >>>> >> >> >>> >>> >> * fise:ParticipantAnnotation for Iraq with the dc:type
>>> >>>> >> >> >>> >>> >> caos:Patient, linking to a fise:TextAnnotation for "Irak" and a
>>> >>>> >> >> >>> >>> >> fise:EntityAnnotation linking to dbpedia:Iraq
>>> >>>> >> >> >>> >>> >> * fise:OccurrentAnnotation for "invades" with the dc:type
>>> >>>> >> >> >>> >>> >> caos:Activity, linking to a fise:TextAnnotation for "invades"
>>> >>>> >> >> >>> >>> >>
>>> >>>> >> >> >>> >>> >> > 2. Where does the verb, which links the Subject and the Object,
>>> >>>> >> >> >>> >>> >> > come into this?
>>> >>>> >> >> >>> >>> >> > I imagined that the Endurant would have a dc:"property" where
>>> >>>> >> >> >>> >>> >> > the property = verb which links to the Object in noun form. For
>>> >>>> >> >> >>> >>> >> > example, take again the sentence "USA invades Irak". You would
>>> >>>> >> >> >>> >>> >> > have the "USA" Entity with dc:invader which points to the
>>> >>>> >> >> >>> >>> >> > Object "Irak". The Endurant would have as many dc:"property"
>>> >>>> >> >> >>> >>> >> > elements as there are verbs which link it to an Object.
>>> >>>> >> >> >>> >>> >>
>>> >>>> >> >> >>> >>> >> As explained above, you would have a fise:OccurrentAnnotation
>>> >>>> >> >> >>> >>> >> that represents the Perdurant. The information that the activity
>>> >>>> >> >> >>> >>> >> mentioned in the text is "invades" would be expressed by linking
>>> >>>> >> >> >>> >>> >> to a fise:TextAnnotation. If you can also provide an Ontology for
>>> >>>> >> >> >>> >>> >> Tasks that defines "myTasks:invade", the fise:OccurrentAnnotation
>>> >>>> >> >> >>> >>> >> could also link to a fise:EntityAnnotation for this concept.
>>> >>>> >> >> >>> >>> >>
>>> >>>> >> >> >>> >>> >> best
>>> >>>> >> >> >>> >>> >> Rupert
>>> >>>> >> >> >>> >>> >>
>>> >>>> >> >> >>> >>> >> >> ### Consuming the data:
>>> >>>> >> >> >>> >>> >> >>
>>> >>>> >> >> >>> >>> >> >> I think this model should be sufficient for use cases as
>>> >>>> >> >> >>> >>> >> >> described by you.
>>> >>>> >> >> >>> >>> >> >>
>>> >>>> >> >> >>> >>> >> >> Users would be able to consume data on the setting level.
>>> >>>> >> >> >>> >>> >> >> This can be done by simply retrieving all
>>> >>>> >> >> >>> >>> >> >> fise:ParticipantAnnotation as well as fise:OccurrentAnnotation
>>> >>>> >> >> >>> >>> >> >> linked with a setting. BTW this was the approach used in LIVE
>>> >>>> >> >> >>> >>> >> >> [2] for semantic search. It allows queries for Settings that
>>> >>>> >> >> >>> >>> >> >> involve specific Entities, e.g. you could filter for Settings
>>> >>>> >> >> >>> >>> >> >> that involve a {Person}, activities:Arrested and a specific
>>> >>>> >> >> >>> >>> >> >> {Uprising}. However, note that with this approach you will get
>>> >>>> >> >> >>> >>> >> >> results for Settings where the {Person} participated and
>>> >>>> >> >> >>> >>> >> >> another person was arrested.
>>> >>>> >> >> >>> >>> >> >>
>>> >>>> >> >> >>> >>> >> >> Another possibility would be to process enhancement results on
>>> >>>> >> >> >>> >>> >> >> the fise:OccurrentAnnotation level. This would allow a much
>>> >>>> >> >> >>> >>> >> >> higher level of granularity (e.g. it would allow to correctly
>>> >>>> >> >> >>> >>> >> >> answer the query used as an example above). But I am wondering
>>> >>>> >> >> >>> >>> >> >> if the quality of the Setting extraction will be sufficient
>>> >>>> >> >> >>> >>> >> >> for this.
>>> >>>> >> >> >>> >>> >> >> I also have doubts whether this can still be realized by using
>>> >>>> >> >> >>> >>> >> >> semantic indexing to Apache Solr or if it would be
>>> >>>> >> >> >>> >>> >> >> better/necessary to store results in a TripleStore and use
>>> >>>> >> >> >>> >>> >> >> SPARQL for retrieval.
>>> >>>> >> >> >>> >>> >> >>
>>> >>>> >> >> >>> >>> >> >> The methodology and query language used by YAGO [3] is also
>>> >>>> >> >> >>> >>> >> >> very relevant for this (especially note chapter 7, SPOTL(X)
>>> >>>> >> >> >>> >>> >> >> Representation).
>>> >>>> >> >> >>> >>> >> >>
>>> >>>> >> >> >>> >>> >> >> Another related topic is the enrichment of Entities (especially
>>> >>>> >> >> >>> >>> >> >> Events) in knowledge bases based on Settings extracted from
>>> >>>> >> >> >>> >>> >> >> Documents. Per definition - in DOLCE - Perdurants are
>>> >>>> >> >> >>> >>> >> >> temporally indexed. That means that at the time when added to a
>>> >>>> >> >> >>> >>> >> >> knowledge base they might still be in process. So the creation,
>>> >>>> >> >> >>> >>> >> >> enriching and refinement of such Entities in the knowledge base
>>> >>>> >> >> >>> >>> >> >> seems to be critical for a System like the one described in
>>> >>>> >> >> >>> >>> >> >> your use case.
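[As a rough illustration of the annotation structure discussed above, the following sketch assembles the "USA invades Irak" example as plain subject/predicate/object triples. The URNs are made up for the example, and apart from the fise:/caos:/dbpedia: names already used in this thread none of the vocabulary should be read as an existing API; a real implementation would use an RDF library rather than strings.]

```java
import java.util.ArrayList;
import java.util.List;

public class SettingSketch {
    // A triple as three plain strings; a real engine would use an RDF graph API.
    static final class Triple {
        final String s, p, o;
        Triple(String s, String p, String o) { this.s = s; this.p = p; this.o = o; }
    }

    static List<Triple> invasionExample() {
        List<Triple> g = new ArrayList<>();
        String setting = "urn:setting-1";
        // One resource acts as the context for all annotations of the situation
        g.add(new Triple(setting, "rdf:type", "fise:SettingAnnotation"));
        // USA participates as the Agent
        g.add(new Triple("urn:participant-usa", "fise:in-setting", setting));
        g.add(new Triple("urn:participant-usa", "dc:type", "caos:Agent"));
        g.add(new Triple("urn:participant-usa", "fise:entity-annotation", "dbpedia:United_States"));
        // Iraq participates as the Patient
        g.add(new Triple("urn:participant-iraq", "fise:in-setting", setting));
        g.add(new Triple("urn:participant-iraq", "dc:type", "caos:Patient"));
        g.add(new Triple("urn:participant-iraq", "fise:entity-annotation", "dbpedia:Iraq"));
        // "invades" is the Perdurant linking both participants
        g.add(new Triple("urn:occurrent-invades", "fise:in-setting", setting));
        g.add(new Triple("urn:occurrent-invades", "dc:type", "caos:Activity"));
        g.add(new Triple("urn:occurrent-invades", "dc:has-participant", "urn:participant-usa"));
        g.add(new Triple("urn:occurrent-invades", "dc:has-participant", "urn:participant-iraq"));
        return g;
    }

    public static void main(String[] args) {
        for (Triple t : invasionExample()) {
            System.out.println(t.s + " " + t.p + " " + t.o);
        }
    }
}
```

[Retrieving everything with fise:in-setting pointing at one setting resource gives exactly the setting-level consumption described above.]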
On Tue, Jun 11, 2013 at 9:09 PM, Cristian Petroaca
<cristian.petro...@gmail.com> wrote:
>
> First of all I have to mention that I am new to the field of semantic
> technologies; I've started to read about them in the last 4-5 months.
> Having said that, I have a high-level overview of what is a good
> approach to solve this problem. There are a number of papers on the
> internet which describe what steps need to be taken, such as: named
> entity recognition, co-reference resolution, POS tagging and others.

The Stanbol NLP processing module currently only supports sentence
detection, tokenization, POS tagging, Chunking, NER and lemmatization.
Support for co-reference resolution and dependency trees is currently
missing.

Stanford NLP is already integrated with Stanbol [4]. At the moment it
only supports English, but I am already working to include the other
supported languages.
Other NLP frameworks that are already integrated with Stanbol are
Freeling [5] and Talismane [6]. But note that for all of those the
integration excludes support for co-reference and dependency trees.

Anyway, I am confident that one can implement a first prototype by only
using Sentences and POS tags and - if available - Chunks (e.g. Noun
phrases).

> I assume that in the Stanbol context, a feature like Relation
> extraction would be implemented as an EnhancementEngine?
> What kind of effort would be required to integrate a co-reference
> resolution tool into Stanbol?

Yes, in the end it would be an EnhancementEngine.
But before we can build such an engine we would need to:

* extend the Stanbol NLP processing API with Annotations for
  co-reference
* add support for JSON serialisation/parsing of those annotations so
  that the RESTful NLP Analysis Service can provide co-reference
  information

> At this moment I'll be focusing on 2 aspects:
>
> 1. Determine the best data structure to encapsulate the extracted
> information. I'll take a closer look at DOLCE.

Don't make it too complex. Defining a proper structure to represent
Events will only pay off if we can also successfully extract such
information from processed texts.
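As a starting point for the second bullet, here is a minimal sketch of what serializing the co-reference annotation discussed earlier in this thread ("stanbol.enhancer.nlp.coref") could look like. All class and method names are hypothetical; Stanbol's actual NLP API and JSON serializers will differ in detail:

```java
import java.util.List;

// Sketch (hypothetical names): a minimal model of the proposed
// co-reference annotation and its JSON serialization, following the
// "stanbol.enhancer.nlp.coref" structure proposed in this thread.
public class CorefJsonSketch {

    /** A mention of the annotated span by another Token/Chunk. */
    public record Mention(String type, int start, int end) {
        String toJson() {
            return String.format(
                "{\"type\":\"%s\",\"start\":%d,\"end\":%d}", type, start, end);
        }
    }

    /** Serializes the coref information of one Token/Chunk. */
    public static String toJson(boolean isRepresentative, List<Mention> mentions) {
        StringBuilder sb = new StringBuilder("{\"isRepresentative\":")
            .append(isRepresentative).append(",\"mentions\":[");
        for (int i = 0; i < mentions.size(); i++) {
            if (i > 0) sb.append(',');
            sb.append(mentions.get(i).toJson());
        }
        return sb.append("],\"class\":\"org.apache.stanbol.enhancer.nlp.coref.CorefTag\"}")
                 .toString();
    }

    public static void main(String[] args) {
        System.out.println(toJson(true, List.of(new Mention("Token", 123, 130))));
    }
}
```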
I would start with:

* fise:SettingAnnotation
    * {fise:Enhancement} metadata

* fise:ParticipantAnnotation
    * {fise:Enhancement} metadata
    * fise:inSetting {settingAnnotation}
    * fise:hasMention {textAnnotation}
    * fise:suggestion {entityAnnotation} (multiple if there are more
      suggestions)
    * dc:type one of fise:Agent, fise:Patient, fise:Instrument,
      fise:Cause

* fise:OccurrentAnnotation
    * {fise:Enhancement} metadata
    * fise:inSetting {settingAnnotation}
    * fise:hasMention {textAnnotation}
    * dc:type set to fise:Activity

If it turns out that we can extract more, we can add more structure to
those annotations. We might also think about using our own namespace
for those extensions to the annotation structure.

> 2. Determine how all of this should be integrated into Stanbol.

Just create an EventExtractionEngine and configure an enhancement chain
that does NLP processing and EntityLinking.
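To make the proposed annotation structure concrete, a fise:ParticipantAnnotation could be expressed as a handful of (subject, predicate, object) triples. The URIs below are illustrative and the fise:* terms are the ones proposed in this thread, not an existing vocabulary:

```java
import java.util.List;

// Sketch: the proposed fise:ParticipantAnnotation expressed as plain
// triples, so the enhancement-structure extension is easy to picture.
public class ParticipantAnnotationSketch {

    public record Triple(String s, String p, String o) {}

    /** Builds the triples for one ParticipantAnnotation. The role is
     *  one of fise:Agent, fise:Patient, fise:Instrument, fise:Cause. */
    public static List<Triple> participantAnnotation(String uri, String setting,
            String textAnnotation, String entityAnnotation, String role) {
        return List.of(
            new Triple(uri, "rdf:type",        "fise:ParticipantAnnotation"),
            new Triple(uri, "fise:inSetting",  setting),
            new Triple(uri, "fise:hasMention", textAnnotation),
            new Triple(uri, "fise:suggestion", entityAnnotation), // repeatable
            new Triple(uri, "dc:type",         role));
    }

    public static void main(String[] args) {
        participantAnnotation("urn:p1", "urn:setting1", "urn:ta1",
                "urn:ea1", "fise:Agent")
            .forEach(t -> System.out.println(t.s() + " " + t.p() + " " + t.o()));
    }
}
```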
You should have a look at:

* the SentimentSummarizationEngine [1], as it does a lot of things with
  NLP processing results (e.g. connecting adjectives - via verbs - to
  nouns/pronouns). As long as we cannot use explicit dependency trees,
  your code will need to do similar things with Nouns, Pronouns and
  Verbs.

* the Disambiguation-MLT engine, as it creates a Java representation of
  the present fise:TextAnnotation and fise:EntityAnnotation [2].
  Something similar will also be required by the EventExtractionEngine
  for fast access to such annotations while iterating over the
  Sentences of the text.
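The kind of POS-only heuristic described above can be sketched as follows. This is a toy stand-in, not the SentimentSummarizationEngine's actual algorithm: without a dependency tree, each adjective is simply attached to the nearest noun to its right in the sentence (Penn Treebank tags assumed):

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Sketch: linking adjectives to nouns using only POS tags, as one has
// to do while explicit dependency trees are unavailable in Stanbol.
public class PosLinkSketch {

    public record Token(String text, String pos) {}

    /** Maps each adjective (JJ*) to the nearest following noun (NN*). */
    public static Map<String, String> linkAdjectives(List<Token> sentence) {
        Map<String, String> links = new LinkedHashMap<>();
        for (int i = 0; i < sentence.size(); i++) {
            if (!sentence.get(i).pos().startsWith("JJ")) continue;
            for (int j = i + 1; j < sentence.size(); j++) {
                if (sentence.get(j).pos().startsWith("NN")) {
                    links.put(sentence.get(i).text(), sentence.get(j).text());
                    break;
                }
            }
        }
        return links;
    }

    public static void main(String[] args) {
        List<Token> s = List.of(new Token("the", "DT"), new Token("angry", "JJ"),
            new Token("crowd", "NN"), new Token("was", "VBD"),
            new Token("arrested", "VBN"));
        System.out.println(linkAdjectives(s)); // {angry=crowd}
    }
}
```

A real engine would additionally have to respect sentence boundaries, verbs between the adjective and the noun, and negations, which is exactly the complexity the SentimentSummarizationEngine deals with.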
best
Rupert

[1] https://svn.apache.org/repos/asf/stanbol/trunk/enhancement-engines/sentiment-summarization/src/main/java/org/apache/stanbol/enhancer/engines/sentiment/summarize/SentimentSummarizationEngine.java
[2] https://svn.apache.org/repos/asf/stanbol/trunk/enhancement-engines/disambiguation-mlt/src/main/java/org/apache/stanbol/enhancer/engine/disambiguation/mlt/DisambiguationData.java

> Thanks

>> Hope this helps to bootstrap this discussion
>> best
>> Rupert
>>
>> --
>> | Rupert Westenthaler             rupert.westentha...@gmail.com
>> | Bodenlehenstraße 11             ++43-699-11108907
>> | A-5500 Bischofshofen
--
| Rupert Westenthaler             rupert.westentha...@gmail.com
| Bodenlehenstraße 11             ++43-699-11108907
| A-5500 Bischofshofen