What would a sentence like this yield: "Paris is not the city in the United States"?
On Thu, Aug 23, 2012 at 4:23 PM, kritarth anand <[email protected]> wrote:
> Dear members of the Stanbol community,
>
> I would hereby like to discuss the next few iterations of the
> Disambiguation Engine. A few versions of the engine have been prepared,
> and I briefly describe them below. I hope to become a permanent committer
> for Stanbol if my contribution is considered after this GSoC period. I
> will be committing the code versions soon and will apply the patch in
> JIRA soon.
>
> 1. How the disambiguation problem was approached:
> For a given text annotation there might be many entity annotations
> mapped, and it is required to rank them in order of their likelihood.
> Consider:
>
> Paris is a small city in the United States.
>
> a. Take "Paris" in this sentence without disambiguation (using DBpedia as
> the vocabulary). There are three entity annotations mapped: 1. Paris,
> France; 2. Paris, Texas; 3. Paris, *Something*. (The entity mapped with
> the highest fise:confidence is Paris, France.)
> b. Now, how would a human disambiguate? On reading the line, an
> individual thinks of the context the text is referring to. Doing so, he
> realizes that since the text talks about Paris and also about the United
> States, the Paris mentioned here is more likely Paris, Texas (which is in
> the United States), and the mention must therefore refer to it.
> c. The approach followed in the implementation takes inspiration from
> this example and roughly follows the pseudocode below:
>
> for (TextAnnotation k : textAnnotations) {
>     List entityAnnotations = getEntityAnnotationsRelated(k);
>     Context context = getContextInformation(k);
>     List results = queryMLTVocabularies(k, context);
>     updateConfidences(results, entityAnnotations);
> }
>
> d. My current approach to disambiguation involved many variations;
> however, for simplicity I will only discuss the differences in obtaining
> the "context".
>
> 2. Context procurement:
> a. All Entity Context: The context is decided by all the text
> annotations of the text. This shows good results for shorter texts, but
> introduces a lot of redundant annotations in longer ones, making the
> context less useful.
> b. All Link Context: The context is decided on the basis of the site or
> reference link associated with the text annotations; since those links
> may themselves require disambiguation, this approach does not behave very
> well.
> c. Selection Context: The selection context contains the text one
> sentence before and one sentence after the current one. Another version
> also worked with the text annotations in this region of text.
> d. Vicinity Entity Context: The vicinity annotation detection measures
> distance in the neighborhood of the text annotation.
>
> 3. Future:
> a. With a running POC of this engine, it can be used to create a more
> advanced version, such as the Spotlight approach or the Markov Logic
> Networks approach discussed earlier.

--
David Riccitelli

********************************************************************************
InsideOut10 s.r.l.
P.IVA: IT-11381771002
Fax: +39 0110708239
---
LinkedIn: http://it.linkedin.com/in/riccitelli
Twitter: ziodave
---
Layar Partner Network <http://www.layar.com/publishing/developers/list/?page=1&country=&city=&keyword=insideout10&lpn=1>
********************************************************************************
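As an aside for readers of the archive: the re-ranking loop quoted above can be sketched as a small, self-contained toy. All names here (ContextOverlapDisambiguator, rerank, score, the candidate descriptions) are hypothetical illustrations, not Stanbol's actual API, and a simple term-overlap count stands in for the MLT ("more like this") vocabulary query.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.HashSet;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;

public class ContextOverlapDisambiguator {

    // Candidate entity label -> short description. These are stand-ins for
    // what an entity lookup against a vocabulary like DBpedia would return.
    static final Map<String, String> CANDIDATES = new LinkedHashMap<>();
    static {
        CANDIDATES.put("Paris, France", "capital and largest city of France");
        CANDIDATES.put("Paris, Texas", "city in Lamar County, Texas, United States");
    }

    /** Score = number of context terms found in the candidate description. */
    static int score(String description, Set<String> context) {
        int s = 0;
        for (String term : context) {
            if (description.toLowerCase().contains(term.toLowerCase())) s++;
        }
        return s;
    }

    /** Return candidate labels ordered by descending context overlap. */
    static List<String> rerank(Map<String, String> candidates, Set<String> context) {
        List<String> ranked = new ArrayList<>(candidates.keySet());
        ranked.sort(Comparator.comparingInt(
                (String label) -> score(candidates.get(label), context)).reversed());
        return ranked;
    }

    public static void main(String[] args) {
        // Context terms taken from "Paris is a small city in the United States."
        Set<String> context = new HashSet<>(Arrays.asList("city", "United States"));
        // "United States" matches only the Texas entity, so it ranks first.
        System.out.println(rerank(CANDIDATES, context));
    }
}
```

The point of the sketch is only step (c) of the mail: the initial fise:confidence ranking (Paris, France first) is overturned once the context is taken into account.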
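The "Selection Context" variant (2c) can likewise be sketched in a few lines: take the sentence containing the annotation plus one sentence before and one after. The class and method names are hypothetical, and the period-based sentence split is a deliberate simplification to keep the example dependency-free; a real engine would use proper NLP sentence detection.

```java
import java.util.ArrayList;
import java.util.List;

public class SelectionContext {

    /** Return the sentences from one before to one after the sentence
     *  containing the given character offset. */
    static String contextAround(String text, int offset) {
        // Naive sentence split on '.' purely for illustration.
        List<String> sentences = new ArrayList<>();
        List<Integer> starts = new ArrayList<>();
        int start = 0;
        for (int i = 0; i < text.length(); i++) {
            if (text.charAt(i) == '.') {
                sentences.add(text.substring(start, i + 1).trim());
                starts.add(start);
                start = i + 1;
            }
        }
        // Find the sentence containing the annotation offset.
        int idx = 0;
        for (int i = 0; i < starts.size(); i++) {
            if (starts.get(i) <= offset) idx = i;
        }
        int from = Math.max(0, idx - 1);
        int to = Math.min(sentences.size() - 1, idx + 1);
        return String.join(" ", sentences.subList(from, to + 1));
    }

    public static void main(String[] args) {
        String text = "I flew to Texas. Paris is a small city in the United States. "
                + "The barbecue there is famous. We left after a week.";
        // Context window around the "Paris" annotation: sentences 1 to 3.
        System.out.println(contextAround(text, text.indexOf("Paris")));
    }
}
```

The returned window (here, the first three sentences) would then feed the context-based re-ranking, while the last sentence falls outside the window.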
