enhancementusage.html

buildbot Wed, 04 Jul 2012 00:28:04 -0700

Author: buildbot
Date: Wed Jul  4 07:27:36 2012
New Revision: 824415

Log:
Staging update by buildbot for stanbol


Modified:
    websites/staging/stanbol/trunk/content/   (props changed)
    
websites/staging/stanbol/trunk/content/stanbol/docs/trunk/enhancementusage.html

Propchange: websites/staging/stanbol/trunk/content/
------------------------------------------------------------------------------
--- cms:source-revision (original)
+++ cms:source-revision Wed Jul  4 07:27:36 2012
@@ -1 +1 @@
-1357069
+1357117

Modified: 
websites/staging/stanbol/trunk/content/stanbol/docs/trunk/enhancementusage.html
==============================================================================
--- 
websites/staging/stanbol/trunk/content/stanbol/docs/trunk/enhancementusage.html 
(original)
+++ 
websites/staging/stanbol/trunk/content/stanbol/docs/trunk/enhancementusage.html 
Wed Jul  4 07:27:36 2012
@@ -102,18 +102,18 @@
 </pre></div>
 
 
-<p>As response you will receive the enhancement results formatted as an RDF 
graph in a serialization format specified by the "Accept" header 
('application/rdf+xml' in the above example request). This RDF graph contains 
the information about the entities extracted from the parsed content. See the 
documentation of the Apache Stanbol <a 
href="/enhancer/enhancementstructure.html">enhancement structure</a> for 
details.</p>
+<p>As response you will receive the enhancement results formatted as an RDF 
graph in a serialization format specified by the "Accept" header 
('application/rdf+xml' in the above example request). This RDF graph contains 
the information about the entities extracted from the parsed content. See the 
documentation of the Apache Stanbol <a 
href="enhancer/enhancementstructure.html">enhancement structure</a> for 
details.</p>
 <p>The following figure shows how extracted entities are described in the 
enhancement results. 
 <img alt="'fise:EntityAnnotation' example" 
src="enhancer/es_entityannotation.png" title="This example shows an 
EntityAnnotation that suggests the entity 'dbpedia:Bob_Marley' for the 
TextAnnotation" /></p>
 <p>In principle there are two resources that are of interest for the entity 
tagging use case:</p>
 <ol>
-<li><a 
href="/enhancer/enhancementstructure.html#fiseentityannotation">EntityAnnotation</a>s:
 Resources with the 'rdf:type' 'fise:EntityAnnotation' represent the entity 
suggestions by the Apache Stanbol Enhancer. These resources provide the label, 
type and, most importantly, the URI of the extracted entity. In addition, the value 
of 'fise:confidence' [0..1] can be used as an indication of how certain the Apache 
Stanbol Enhancer is about this entity. </li>
+<li><a 
href="enhancer/enhancementstructure.html#fiseentityannotation">EntityAnnotation</a>s:
 Resources with the 'rdf:type' 'fise:EntityAnnotation' represent the entity 
suggestions by the Apache Stanbol Enhancer. These resources provide the label, 
type and, most importantly, the URI of the extracted entity. In addition, the value 
of 'fise:confidence' [0..1] can be used as an indication of how certain the Apache 
Stanbol Enhancer is about this entity. </li>
 <li>Entities: This refers to all resources with an incoming 
'fise:entity-reference' relation (such as 'dbpedia:Bob_Marley' in the above 
example). Enhancement engines can be configured to "dereference" suggested 
entities - meaning to use the URI of the entity to retrieve additional 
information. In this case, additional information about suggested entities will 
be available in the enhancement results. If this is not the case, users will 
need to dereference suggested entities themselves.</li>
 </ol>
 <h3 id="process-suggested-entities">Process Suggested Entities</h3>
 <p>The following steps are typically needed to acquire the information needed 
to implement an entity tagging user interface:</p>
 <ol>
-<li>Iterate over all suggested entities: These are all resources such as 
"{entity-annotation} rdf:type <a 
href="/enhancer/enhancementstructure.html#fiseentityannotation">fise:EntityAnnotation</a>"</li>
+<li>Iterate over all suggested entities: These are all resources such as 
"{entity-annotation} rdf:type <a 
href="enhancer/enhancementstructure.html#fiseentityannotation">fise:EntityAnnotation</a>"</li>
 <li>Basic information: Those are available directly via the 
{entity-annotation} to ensure their availability even if the {entity} itself is 
not included - dereferenced - in the enhancement results.<ul>
 <li>URI of the suggested entity: {entity-annotation} fise:entity-reference 
{entity}</li>
 <li>Label: The value of the fise:entity-label is typically the label by which 
the entity was recognized in the analyzed content. Additional labels are 
typically available via the {entity}</li>
@@ -125,28 +125,28 @@
 <li>Dereferencing suggested entities: If the suggested entity is available via 
the Apache Stanbol Entityhub, the {entity-annotation} has the 
'entityhub:site' property. The value of this property is the name of the 
referenced site of the Entityhub. To dereference the entity, send a GET request to 
"{stanbol-root-URL}/entityhub/site/{site-name}/entity?id={entity}". 
The "Accept" header of the request needs to be set to the desired RDF 
serialization (e.g. "application/rdf+json").</li>
 </ol>
 <h3 id="process-content-categorizations">Process Content Categorizations</h3>
-<p>'<a 
href="/enhancer/enhancementstructure.html#fisetopicannotation">fise:TopicAnnotation</a>'
 instances are used to formally represent categories assigned to the parsed 
Content. The main difference between extracted entities and assigned categories 
is that extracted entities do have one or more explicit mentions within the 
text while assigned categories are suggested based on the document as a whole - 
typically they are not explicitly mentioned in the text.</p>
+<p>'<a 
href="enhancer/enhancementstructure.html#fisetopicannotation">fise:TopicAnnotation</a>'
 instances are used to formally represent categories assigned to the parsed 
Content. The main difference between extracted entities and assigned categories 
is that extracted entities do have one or more explicit mentions within the 
text while assigned categories are suggested based on the document as a whole - 
typically they are not explicitly mentioned in the text.</p>
 <p>Typically, an entity tagging UI will want to distinguish between categories 
and entities because:</p>
 <ul>
 <li>categories are used to group content (e.g. blog posts about work and 
private things)</li>
 <li>entities are used to search/suggest blog posts about specific topics (e.g. 
a blog about some feature implemented for "Apache Solr", a nice event in the 
"Sternbräu" in "Salzburg")</li>
 </ul>
-<p>The usage of '<a 
href="/enhancer/enhancementstructure.html#fisetopicannotation">fise:TopicAnnotation</a>'
 is similar to an EntityAnnotation. Both annotation types use the exact same 
properties ('fise:entity-reference', 'fise:entity-label', 'fise:entity-type', 
'fise:confidence', 'entityhub:site'). The only difference is that one needs to 
iterate over '{topic-annotation} rdf:type fise:TopicAnnotation'. So typically 
clients will want to use the exact same code to process {entity-annotation} and 
{topic-annotation} instances.</p>
+<p>The usage of '<a 
href="enhancer/enhancementstructure.html#fisetopicannotation">fise:TopicAnnotation</a>'
 is similar to an EntityAnnotation. Both annotation types use the exact same 
properties ('fise:entity-reference', 'fise:entity-label', 'fise:entity-type', 
'fise:confidence', 'entityhub:site'). The only difference is that one needs to 
iterate over '{topic-annotation} rdf:type fise:TopicAnnotation'. So typically 
clients will want to use the exact same code to process {entity-annotation} and 
{topic-annotation} instances.</p>
 <p>The next section describes an improved version of entity tagging 
that allows users to: (1) accept/decline a spotted entity and then 
(2) select one of several suggested entities.</p>
 <h2 id="entity-tagging-with-disambiguation-support">Entity tagging with 
disambiguation support</h2>
 <p>Entity disambiguation is required if an entity detected in the analyzed 
text can refer to different entities. The following figure shows an example 
where "Bob Marley" is detected as a person in the text; however, there are two 
possible matches within the controlled vocabulary.</p>
 <p><img alt="Entity Disambiguation" src="enhancer/es_entitydisambiguation.png" 
title="&quot;Bob Marley&quot; as spotted in the text may refer to two different 
persons in DBpedia.org" /></p>
-<p>The fact that one entity detected in the text - represented by a '<a 
href="/enhancer/enhancementstructure.html#fisetextannotation">fise:TextAnnotation</a>'
 - may have multiple suggested entities - represented by the two 
'fise:EntityAnnotation's - has a negative impact on <a 
href="#entity-tagging">entity tagging</a> interfaces that suggest tags based on 
'fise:EntityAnnotation's. This is because such an interface would show in the 
above case two suggestions: (1) for <a 
href="http://dbpedia.org/resource/Bob_Marley">'dbpedia:Bob_Marley'</a> and (2) 
for <a 
href="http://dbpedia.org/resource/Bob_Marley_%28comedian%29">dbpedia:Bob_Marley_(comedian)</a>.
 So even if the user wants to tag this content with "Bob Marley", she will need 
to reject at least one of the two suggestions.</p>
+<p>The fact that one entity detected in the text - represented by a '<a 
href="enhancer/enhancementstructure.html#fisetextannotation">fise:TextAnnotation</a>'
 - may have multiple suggested entities - represented by the two 
'fise:EntityAnnotation's - has a negative impact on <a 
href="#entity-tagging">entity tagging</a> interfaces that suggest tags based on 
'fise:EntityAnnotation's. This is because such an interface would show in the 
above case two suggestions: (1) for <a 
href="http://dbpedia.org/resource/Bob_Marley">'dbpedia:Bob_Marley'</a> and (2) 
for <a 
href="http://dbpedia.org/resource/Bob_Marley_%28comedian%29">dbpedia:Bob_Marley_(comedian)</a>.
 So even if the user wants to tag this content with "Bob Marley", she will need 
to reject at least one of the two suggestions.</p>
 <p>Adding explicit support for entity disambiguation to an entity tagging user 
interface can solve this problem by grouping suggested entities by the 
'fise:TextAnnotation's they are suggested for. </p>
 <h3 id="grouping-suggested-entities">Grouping suggested Entities</h3>
 <p>The goal of an entity tagging UI with disambiguation support is to show 
only a single tag suggestion for all entities suggested for the same section in 
the analyzed text. To solve this, we need to follow the link between 
'fise:EntityAnnotation' and 'fise:TextAnnotation'.</p>
 <p>There are several options on how to achieve this. We present a solution 
that iterates over the 'fise:EntityAnnotation's.</p>
 <ol>
-<li>Iterate over all '<a 
href="/enhancer/enhancementstructure.html#fiseentityannotation">fise:EntityAnnotation</a>'
 instances. This refers to all resources such as "{entity-annotation} rdf:type 
fise:EntityAnnotation". <ul>
+<li>Iterate over all '<a 
href="enhancer/enhancementstructure.html#fiseentityannotation">fise:EntityAnnotation</a>'
 instances. This refers to all resources such as "{entity-annotation} rdf:type 
fise:EntityAnnotation". <ul>
 <li>For more information on how to collect information for extracted entities 
see the <a href="#process-suggested-entities">corresponding section</a> in the <a 
href="#entity-tagging">entity tagging</a> interface.</li>
 </ul>
 </li>
-<li>Retrieve the '<a 
href="/enhancer/enhancementstructure.html#fisetextannotation">fise:TextAnnotation</a>'
 referenced by processed 'fise:EntityAnnotation's. For this, we retrieve the 
value(s) of the 'dc:relation' property.</li>
+<li>Retrieve the '<a 
href="enhancer/enhancementstructure.html#fisetextannotation">fise:TextAnnotation</a>'
 referenced by processed 'fise:EntityAnnotation's. For this, we retrieve the 
value(s) of the 'dc:relation' property.</li>
 <li>While iterating over the 'fise:EntityAnnotation's establish a mapping 
'fise:TextAnnotation' -&gt; 'fise:EntityAnnotation', 'fise:EntityAnnotation', 
...<ul>
 <li>the list of 'fise:EntityAnnotation's for each 'fise:TextAnnotation' needs 
to be sorted based on the value of the 'fise:confidence' property of the 
EntityAnnotation. Ensure that the EntityAnnotation with the highest confidence 
is first in the list. 'fise:confidence' values are in the range 0..1 where 
higher numbers represent a higher certainty.</li>
 </ul>
@@ -160,7 +160,7 @@
 <h3 id="showing-the-extraction-context">Showing the extraction context</h3>
 <p>To allow users to more easily disambiguate between the suggested entities 
it is important to provide them with information about the extraction context 
of the suggested entities. This is of special importance if content is not 
completely visible to the user (e.g. because it is too long to fit on the screen 
or the content is of a type that cannot be rendered within the browser).</p>
 <p>Assuming the suggested entities are grouped by 'fise:TextAnnotation' - as 
explained in the above section - one can use the information provided by the 
TextAnnotation to visualize the context and thereby help the user 
perform the disambiguation task.</p>
-<p>The following information of the <a 
href="/enhancer/enhancementstructure.html#fisetextannotation">TextAnnotation</a>
 can be used for this task:</p>
+<p>The following information of the <a 
href="enhancer/enhancementstructure.html#fisetextannotation">TextAnnotation</a> 
can be used for this task:</p>
 <ul>
 <li>'fise:selection-context': This is the text surrounding the extracted 
entity. The exact size of this context depends on the configuration and the 
enhancement engine. Typically it is the current sentence or about 50 characters 
before and after the selection.</li>
 <li>'fise:selected-text': This is the text representing the extracted entity - 
the section of the text the entity was suggested for. The 'fise:selected-text' 
MUST BE contained within the 'fise:selection-context' so user interfaces that 
want to highlight the selected part of the context can use a contains query in 
the selection context for the selected text. In case of multiple matches it is 
typically sufficient to highlight all occurrences.</li>
@@ -171,12 +171,12 @@
 <p><img alt="Occurrence based Annotation UI" 
src="enhancer/hallo-annotate_scrrenshot.png" title="hallo.js with the 
annotate.js plugin used to implement an text occurrence based annotation UI" 
/></p>
 <p>To implement user interfaces like that one needs to (1) show occurrences of 
extracted features within the text and (2) let the user interact with suggested 
entities.</p>
 <h3 id="visualise-occurrences-of-extracted-features">Visualise occurrences of 
extracted features</h3>
-<p>Occurrences of extracted features are represented by instances of the 
concept '<a 
href="/enhancer/enhancementstructure.html#fisetextannotation">fise:TextAnnotation</a>'.
 The next figure shows how TextAnnotations describe the occurrence of a 
recognized feature in the parsed text.</p>
+<p>Occurrences of extracted features are represented by instances of the 
concept '<a 
href="enhancer/enhancementstructure.html#fisetextannotation">fise:TextAnnotation</a>'.
 The next figure shows how TextAnnotations describe the occurrence of a 
recognized feature in the parsed text.</p>
 <p><img alt="'fise:TextAnnotation'" src="enhancer/es_textannotation.png" 
title="This figure shows a TextAnnotation describing the occurrence of 
&quot;Bob Marley&quot; located from character 59 to 69 in the given text" /></p>
 <p>Applications that want to visualize extracted features will need to 
follow/implement the following steps:</p>
 <p>Typically the following steps are required to correctly show extracted 
features within the content.</p>
 <ol>
-<li>Query for/iterate over '<a 
href="/enhancer/enhancementstructure.html#fisetextannotation">fise:TextAnnotation</a>'s
 of the enhancement results.<ul>
+<li>Query for/iterate over '<a 
href="enhancer/enhancementstructure.html#fisetextannotation">fise:TextAnnotation</a>'s
 of the enhancement results.<ul>
 <li>it is important to only use TextAnnotations that define a 
'fise:selected-text' property. TextAnnotations that do not define this property 
usually select whole sections or even the document as a whole. While such 
TextAnnotations are important (e.g. for annotating the language of the Text) 
they are of no interest for this use case and therefore need to be ignored.</li>
 </ul>
 </li>
@@ -212,9 +212,9 @@
 <li>Applications that want to differentiate between different types of 
extracted entities (e.g. applying different stylesheets for persons, 
organizations and places) can use the value of the 'dc:type' for that purpose. 
See the section for <a href="#fisetextannotation">fise:TextAnnotation</a> for 
detailed information. </li>
 </ul>
 <h3 id="interact-with-suggested-entities">Interact with suggested entities</h3>
-<p>This section explains how users might want to interact with 
extracted/suggested entities. Extracted entities are represented by '<a 
href="/enhancer/enhancementstructure.html#fiseentityannotation">fise:EntityAnnotation</a>'s.
 Those EntityAnnotations are linked with the <a 
href="/enhancer/enhancementstructure.html#fisetextannotation">TextAnnotation</a>
 (occurrences) and to the entity of the used knowledge base. The following 
figure shows an example for an EntityAnnotation that suggests the entity <a 
href="http://dbpedia.org/resource/Bob_Marley">'dbpedia:Bob_Marley'</a> for the 
TextAnnotation used in the example of the previous section.</p>
+<p>This section explains how users might want to interact with 
extracted/suggested entities. Extracted entities are represented by '<a 
href="enhancer/enhancementstructure.html#fiseentityannotation">fise:EntityAnnotation</a>'s.
 Those EntityAnnotations are linked with the <a 
href="enhancer/enhancementstructure.html#fisetextannotation">TextAnnotation</a> 
(occurrences) and to the entity of the used knowledge base. The following 
figure shows an example for an EntityAnnotation that suggests the entity <a 
href="http://dbpedia.org/resource/Bob_Marley">'dbpedia:Bob_Marley'</a> for the 
TextAnnotation used in the example of the previous section.</p>
 <p><img alt="'fise:EntityAnnotation' example" 
src="enhancer/es_entityannotation.png" title="This example shows an 
EntityAnnotation that suggests the entity 'dbpedia:Bob_Marley' for the 
TextAnnotation" /></p>
-<p>The main purpose of <a 
href="/enhancer/enhancementstructure.html#fiseentityannotation">EntityAnnotation</a>s
 is to suggest entities (e.g. <a 
href="http://dbpedia.org/resource/Bob_Marley">'dbpedia:Bob_Marley'</a>) for 
mentions within natural language texts. While the above example (to keep it 
simple) shows only a single suggestion, in practice one needs to distinguish 
between three different cases - that also imply different interaction needs for 
users:</p>
+<p>The main purpose of <a 
href="enhancer/enhancementstructure.html#fiseentityannotation">EntityAnnotation</a>s
 is to suggest entities (e.g. <a 
href="http://dbpedia.org/resource/Bob_Marley">'dbpedia:Bob_Marley'</a>) for 
mentions within natural language texts. While the above example (to keep it 
simple) shows only a single suggestion, in practice one needs to distinguish 
between three different cases - that also imply different interaction needs for 
users:</p>
 <ol>
 <li><strong>No suggestion</strong>: This indicates that a named entity was 
recognized during natural language processing, but no matching entity was found 
within the knowledge base. In this case users might want to<ul>
 <li>manually search the knowledge base for an entity. The Apache Stanbol 
Entityhub sites endpoint can be used to implement this feature by sending a 
"GET http://{host}:{port}/entityhub/sites/find?name={name}" (see the WebUI of 
your Stanbol instance for the detailed documentation).</li>
@@ -235,7 +235,7 @@
 </li>
 </ol>
 <p>The required data for the described interaction patterns are available 
within the enhancement results as follows:</p>
-<p>The following assumes {text-annotation} - the URI of the current '<a 
href="/enhancer/enhancementstructure.html#fisetextannotation">fise:TextAnnotation</a>'
 - as context</p>
+<p>The following assumes {text-annotation} - the URI of the current '<a 
href="enhancer/enhancementstructure.html#fisetextannotation">fise:TextAnnotation</a>'
 - as context</p>
 <ol>
 <li>Query for/iterate over all entity suggestions: The suggestions for 
{text-annotation} can be acquired by using "?entityAnnotation dc:relation 
{text-annotation}"<ul>
 <li>only results with the 'rdf:type' 'fise:EntityAnnotation' should be 
processed. However, typically all results will be of that type anyway.</li>

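Editor's note: the grouping steps described in the diff above (iterate over the 'fise:EntityAnnotation's, follow 'dc:relation' to the 'fise:TextAnnotation', sort each group by 'fise:confidence') can be sketched roughly as follows. This is only an illustration and not part of the commit. It assumes the enhancement results have already been parsed into plain (subject, predicate, object) tuples with prefixed names; the function name, the tuple layout and the example URIs are hypothetical.

```python
from collections import defaultdict


def group_by_text_annotation(triples):
    """Map each TextAnnotation to its EntityAnnotations, highest confidence first.

    `triples` is an iterable of (subject, predicate, object) string tuples.
    This layout is an assumption made for the sketch, not a Stanbol API.
    """
    types = defaultdict(set)       # resource -> set of rdf:type values
    relations = defaultdict(list)  # resource -> dc:relation targets
    confidence = {}                # resource -> fise:confidence as float

    for subject, predicate, obj in triples:
        if predicate == "rdf:type":
            types[subject].add(obj)
        elif predicate == "dc:relation":
            relations[subject].append(obj)
        elif predicate == "fise:confidence":
            confidence[subject] = float(obj)

    groups = defaultdict(list)
    for annotation, annotation_types in types.items():
        # Only fise:EntityAnnotation resources are entity suggestions.
        if "fise:EntityAnnotation" not in annotation_types:
            continue
        for text_annotation in relations.get(annotation, []):
            groups[text_annotation].append(annotation)

    # Highest confidence first; a missing confidence is treated as 0.0 (assumption).
    for suggestions in groups.values():
        suggestions.sort(key=lambda a: confidence.get(a, 0.0), reverse=True)
    return dict(groups)


# Hypothetical example data: two suggestions for the same text occurrence.
example_triples = [
    ("urn:ta1", "rdf:type", "fise:TextAnnotation"),
    ("urn:ea1", "rdf:type", "fise:EntityAnnotation"),
    ("urn:ea1", "dc:relation", "urn:ta1"),
    ("urn:ea1", "fise:confidence", "0.4"),
    ("urn:ea2", "rdf:type", "fise:EntityAnnotation"),
    ("urn:ea2", "dc:relation", "urn:ta1"),
    ("urn:ea2", "fise:confidence", "0.9"),
]
grouped = group_by_text_annotation(example_triples)
```

A tagging UI could then show one suggestion per key of `grouped` and offer the remaining list entries as alternatives for disambiguation.
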
svn commit: r824415 - in /websites/staging/stanbol/trunk/content: ./ stanbol/docs/trunk/enhancementusage.html
