Author: buildbot
Date: Wed Jul 4 07:27:36 2012
New Revision: 824415
Log:
Staging update by buildbot for stanbol
Modified:
websites/staging/stanbol/trunk/content/ (props changed)
websites/staging/stanbol/trunk/content/stanbol/docs/trunk/enhancementusage.html
Propchange: websites/staging/stanbol/trunk/content/
------------------------------------------------------------------------------
--- cms:source-revision (original)
+++ cms:source-revision Wed Jul 4 07:27:36 2012
@@ -1 +1 @@
-1357069
+1357117
Modified:
websites/staging/stanbol/trunk/content/stanbol/docs/trunk/enhancementusage.html
==============================================================================
---
websites/staging/stanbol/trunk/content/stanbol/docs/trunk/enhancementusage.html
(original)
+++
websites/staging/stanbol/trunk/content/stanbol/docs/trunk/enhancementusage.html
Wed Jul 4 07:27:36 2012
@@ -102,18 +102,18 @@
</pre></div>
-<p>As response you will receive the enhancement results formatted as an RDF
graph in a serialization format specified by the "Accept" header
('application/rdf+xml' in the above example request). This RDF graph contains
the information about the entities extracted from the parsed content. See the
documentation of the Apache Stanbol <a
href="/enhancer/enhancementstructure.html">enhancement structure</a> for
details.</p>
+<p>As response you will receive the enhancement results formatted as an RDF
graph in a serialization format specified by the "Accept" header
('application/rdf+xml' in the above example request). This RDF graph contains
the information about the entities extracted from the parsed content. See the
documentation of the Apache Stanbol <a
href="enhancer/enhancementstructure.html">enhancement structure</a> for
details.</p>
<p>The following figure shows how extracted entities are described in the
enhancement results.
<img alt="'fise:EntityAnnotation' example"
src="enhancer/es_entityannotation.png" title="This example shows an
EntityAnnotation that suggests the entity 'dbpedia:Bob_Marley' for the
TextAnnotation" /></p>
<p>In principle there are two resources that are of interest for the entity
tagging use case:</p>
<ol>
-<li><a
href="/enhancer/enhancementstructure.html#fiseentityannotation">EntityAnnotation</a>s:
Resources with the 'rdf:type' 'fise:EntityAnnotation' represent the entity
suggestions of the Apache Stanbol Enhancer. These resources provide the label,
type and, most importantly, the URI of the extracted entity. In addition, the value
of 'fise:confidence' [0..1] can be used as an indication of how certain the Apache
Stanbol Enhancer is about this entity. </li>
+<li><a
href="enhancer/enhancementstructure.html#fiseentityannotation">EntityAnnotation</a>s:
Resources with the 'rdf:type' 'fise:EntityAnnotation' represent the entity
suggestions of the Apache Stanbol Enhancer. These resources provide the label,
type and, most importantly, the URI of the extracted entity. In addition, the value
of 'fise:confidence' [0..1] can be used as an indication of how certain the Apache
Stanbol Enhancer is about this entity. </li>
<li>Entities: This refers to all resources with an incoming
'fise:entity-reference' relation (such as 'dbpedia:Bob_Marley' in the above
example). Enhancement engines can be configured to "dereference" suggested
entities - meaning to use the URI of the entity to retrieve additional
information. In this case, additional information about suggested entities will
be available in the enhancement results. If this is not the case, users will
need to dereference suggested entities themselves.</li>
</ol>
<h3 id="process-suggested-entities">Process Suggested Entities</h3>
<p>The following steps are typically needed to acquire the information needed
to implement an entity tagging user interface:</p>
<ol>
-<li>Iterate over all suggested entities: These are all resources such as
"{entity-annotation} rdf:type <a
href="/enhancer/enhancementstructure.html#fiseentityannotation">fise:EntityAnnotation</a>"</li>
+<li>Iterate over all suggested entities: These are all resources such as
"{entity-annotation} rdf:type <a
href="enhancer/enhancementstructure.html#fiseentityannotation">fise:EntityAnnotation</a>"</li>
<li>Basic information: These are available directly via the
{entity-annotation} to ensure their availability even if the {entity} itself is
not included - dereferenced - in the enhancement results.<ul>
<li>URI of the suggested entity: {entity-annotation} fise:entity-reference
{entity}</li>
<li>Label: The value of 'fise:entity-label' is typically the label by which
the entity was recognized in the analyzed content. Additional labels are
typically available via the {entity}</li>
@@ -125,28 +125,28 @@
<li>Dereferencing suggested entities: If the suggested entity is available via
the Apache Stanbol Entityhub, the {entity-annotation} has the
'entityhub:site' property. The value of this property is the name of the
referenced site of the Entityhub. To dereference the entity, a GET request to
"{stanbol-root-URL}/entityhub/site/{site-name}/entity?id={entity}" needs to be
used. The "Accept" header of the request needs to be set to the desired RDF
serialization (e.g. "application/rdf+json").</li>
</ol>
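<p>As a rough illustration of the steps above, the following Python sketch collects the basic information for each suggestion and builds the Entityhub dereference URL. It is not part of Apache Stanbol: the function names, the dictionary keys and the representation of the enhancement results as (subject, predicate, object) tuples with prefixed names are assumptions made for this example, as is the percent-encoding of the 'id' parameter.</p>

```python
from urllib.parse import quote


def collect_suggestions(triples, annotation_type="fise:EntityAnnotation"):
    """Return one dict per annotation of the given rdf:type.

    `triples` is a hypothetical, already parsed form of the enhancement
    results: an iterable of (subject, predicate, object) tuples that use
    prefixed names such as 'fise:confidence'.
    """
    # Index all property values by subject and predicate.
    properties = {}
    for subject, predicate, obj in triples:
        properties.setdefault(subject, {}).setdefault(predicate, []).append(obj)

    def first(values, predicate):
        found = values.get(predicate)
        return found[0] if found else None

    suggestions = []
    for subject, values in properties.items():
        # Step 1: only consider resources of the requested annotation type.
        if annotation_type not in values.get("rdf:type", []):
            continue
        # Step 2: basic information available directly on the annotation.
        suggestions.append({
            "annotation": subject,
            "entity": first(values, "fise:entity-reference"),
            "label": first(values, "fise:entity-label"),
            "confidence": float(first(values, "fise:confidence") or 0.0),
            "site": first(values, "entityhub:site"),
        })
    return suggestions


def dereference_url(stanbol_root, site_name, entity_uri):
    """Build the Entityhub URL for the GET request that dereferences an entity."""
    return "{}/entityhub/site/{}/entity?id={}".format(
        stanbol_root.rstrip("/"),
        quote(site_name, safe=""),
        quote(entity_uri, safe=""),  # assumption: the id is percent-encoded
    )
```

<p>A client would send the GET request to the returned URL only for suggestions whose "site" value is set, with the "Accept" header set to the RDF serialization it can parse.</p>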
<h3 id="process-content-categorizations">Process Content Categorizations</h3>
-<p>'<a
href="/enhancer/enhancementstructure.html#fisetopicannotation">fise:TopicAnnotation</a>'
instances are used to formally represent categories assigned to the parsed
Content. The main difference between extracted entities and assigned categories
is that extracted entities do have one or more explicit mentions within the
text while assigned categories are suggested based on the document as a whole -
typically they are not explicitly mentioned in the text.</p>
+<p>'<a
href="enhancer/enhancementstructure.html#fisetopicannotation">fise:TopicAnnotation</a>'
instances are used to formally represent categories assigned to the parsed
Content. The main difference between extracted entities and assigned categories
is that extracted entities do have one or more explicit mentions within the
text while assigned categories are suggested based on the document as a whole -
typically they are not explicitly mentioned in the text.</p>
<p>Typically, an entity tagging UI will want to distinguish between categories
and entities because:</p>
<ul>
<li>categories are used to group content (e.g. blog posts about work and
private things)</li>
<li>entities are used to search/suggest blog posts about specific topics (e.g.
a blog about some feature implemented for "Apache Solr", a nice event in the
"Sternbräu" in "Salzburg")</li>
</ul>
-<p>The usage of '<a
href="/enhancer/enhancementstructure.html#fisetopicannotation">fise:TopicAnnotation</a>'
is similar to an EntityAnnotation. Both annotation types use the exact same
properties ('fise:entity-reference', 'fise:entity-label', 'fise:entity-type',
'fise:confidence', 'entityhub:site'). The only difference is that one needs to
iterate over '{topic-annotation} rdf:type fise:TopicAnnotation'. So typically
clients will want to use the exact same code to process {entity-annotation} and
{topic-annotation} instances.</p>
+<p>The usage of '<a
href="enhancer/enhancementstructure.html#fisetopicannotation">fise:TopicAnnotation</a>'
is similar to an EntityAnnotation. Both annotation types use the exact same
properties ('fise:entity-reference', 'fise:entity-label', 'fise:entity-type',
'fise:confidence', 'entityhub:site'). The only difference is that one needs to
iterate over '{topic-annotation} rdf:type fise:TopicAnnotation'. So typically
clients will want to use the exact same code to process {entity-annotation} and
{topic-annotation} instances.</p>
<p>The next section describes an improved version of entity tagging
that allows users to (1) accept/decline a spotted entity and then
(2) select one of several suggested entities.</p>
<h2 id="entity-tagging-with-disambiguation-support">Entity tagging with
disambiguation support</h2>
<p>Entity disambiguation is required if an entity detected in the analyzed
text can refer to different entities. The following figure shows an example
where "Bob Marley" is detected as a person in the text however there are two
possible matches within the controlled vocabulary.</p>
<p><img alt="Entity Disambiguation" src="enhancer/es_entitydisambiguation.png"
title="&quot;Bob Marley&quot; as spotted in the text may refer to two different
persons in DBpedia.org" /></p>
-<p>The fact that one entity detected in the text - represented by a '<a
href="/enhancer/enhancementstructure.html#fisetextannotation">fise:TextAnnotation</a>' -
may have multiple suggested entities - represented by the two
'fise:EntityAnnotation's - has a negative impact on <a
href="#entity-tagging">entity tagging</a> interfaces that suggest tags based on
'fise:EntityAnnotation's. This is because such an interface would show two
suggestions in the above case: (1) for <a
href="http://dbpedia.org/resource/Bob_Marley">'dbpedia:Bob_Marley'</a> and (2)
for <a
href="http://dbpedia.org/resource/Bob_Marley_%28comedian%29">dbpedia:Bob_Marley_(comedian)</a>.
So even if the user wants to tag this content with "Bob Marley", she will need
to reject at least one of the two suggestions.</p>
+<p>The fact that one entity detected in the text - represented by a '<a
href="enhancer/enhancementstructure.html#fisetextannotation">fise:TextAnnotation</a>' -
may have multiple suggested entities - represented by the two
'fise:EntityAnnotation's - has a negative impact on <a
href="#entity-tagging">entity tagging</a> interfaces that suggest tags based on
'fise:EntityAnnotation's. This is because such an interface would show two
suggestions in the above case: (1) for <a
href="http://dbpedia.org/resource/Bob_Marley">'dbpedia:Bob_Marley'</a> and (2)
for <a
href="http://dbpedia.org/resource/Bob_Marley_%28comedian%29">dbpedia:Bob_Marley_(comedian)</a>.
So even if the user wants to tag this content with "Bob Marley", she will need
to reject at least one of the two suggestions.</p>
<p>Adding explicit support for entity disambiguation to an entity tagging user
interface can solve this problem by grouping suggested entities by the
'fise:TextAnnotation's they are suggested for. </p>
<h3 id="grouping-suggested-entities">Grouping suggested Entities</h3>
<p>The goal of an entity tagging UI with disambiguation support is to show
only a single tag suggestion for all entities suggested for the same section in
the analyzed text. To solve this, we need to follow the link between
'fise:EntityAnnotation' and 'fise:TextAnnotation'.</p>
<p>There are several options on how to achieve this. We present a solution
that iterates over the 'fise:EntityAnnotation's.</p>
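<p>The grouping and sorting steps listed below could be sketched in Python roughly as follows. This is only an illustration under stated assumptions: each 'fise:EntityAnnotation' is represented by a hypothetical dictionary with the keys "entity", "confidence" (a float) and "relations" (the values of its 'dc:relation' property); none of these names come from Apache Stanbol itself.</p>

```python
from collections import defaultdict


def group_by_text_annotation(entity_annotations):
    """Map each fise:TextAnnotation URI to its suggested EntityAnnotations.

    `entity_annotations` is an iterable of dicts (hypothetical structure)
    with the keys 'entity', 'confidence' and 'relations'.
    """
    groups = defaultdict(list)
    for annotation in entity_annotations:
        # An EntityAnnotation may reference one or more TextAnnotations
        # through its dc:relation values.
        for text_annotation in annotation["relations"]:
            groups[text_annotation].append(annotation)
    # Highest fise:confidence first, so the best suggestion leads each list.
    for suggestions in groups.values():
        suggestions.sort(key=lambda a: a["confidence"], reverse=True)
    return dict(groups)
```

<p>With such a mapping, a user interface can show a single tag per TextAnnotation, preselect the first entry of each list and offer the remaining entries as alternatives.</p>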
<ol>
-<li>Iterate over all '<a
href="/enhancer/enhancementstructure.html#fiseentityannotation">fise:EntityAnnotation</a>'
instances. This refers to all resources such as "{entity-annotation} rdf:type
fise:EntityAnnotation". <ul>
+<li>Iterate over all '<a
href="enhancer/enhancementstructure.html#fiseentityannotation">fise:EntityAnnotation</a>'
instances. This refers to all resources such as "{entity-annotation} rdf:type
fise:EntityAnnotation". <ul>
<li>For more information on how to collect information for extracted entities
see the <a href="#process-suggested-entities">according section</a> in the <a
href="#entity-tagging">entity tagging</a> interface.</li>
</ul>
</li>
-<li>Retrieve the '<a
href="/enhancer/enhancementstructure.html#fisetextannotation">fise:TextAnnotation</a>'
referenced by processed 'fise:EntityAnnotation's. For this, we retrieve the
value(s) of the 'dc:relation' property.</li>
+<li>Retrieve the '<a
href="enhancer/enhancementstructure.html#fisetextannotation">fise:TextAnnotation</a>'
referenced by processed 'fise:EntityAnnotation's. For this, we retrieve the
value(s) of the 'dc:relation' property.</li>
<li>While iterating over the 'fise:EntityAnnotation's establish a mapping
'fise:TextAnnotation' -> 'fise:EntityAnnotation', 'fise:EntityAnnotation',
...<ul>
<li>the list of 'fise:EntityAnnotation's for each 'fise:TextAnnotation' needs
to be sorted based on the value of the 'fise:confidence' property of the
EntityAnnotation. Ensure that the EntityAnnotation with the highest confidence
is first in the list. 'fise:confidence' values are in the range 0..1 where
higher numbers represent a higher certainty.</li>
</ul>
@@ -160,7 +160,7 @@
<h3 id="showing-the-extraction-context">Showing the extraction context</h3>
<p>To allow users to more easily disambiguate between the suggested entities
it is important to provide them with information about the extraction context
of the suggested entities. This is of special importance if content is not
completely visible to the user (e.g. because it is too long to fit on the screen
or the content is of a type that cannot be rendered within the browser).</p>
<p>Assuming the suggested entities are grouped by 'fise:TextAnnotation' - as
explained in the above section - one can use the information provided by the
TextAnnotation to visualize the context and thereby help the user
perform the disambiguation task.</p>
-<p>The following information of the <a
href="/enhancer/enhancementstructure.html#fisetextannotation">TextAnnotation</a>
can be used for this task:</p>
+<p>The following information of the <a
href="enhancer/enhancementstructure.html#fisetextannotation">TextAnnotation</a>
can be used for this task:</p>
<ul>
<li>'fise:selection-context': This is the text surrounding the extracted
entity. The exact size of this context depends on the configuration and the
enhancement engine. Typically it is the current sentence or about 50 characters
before and after the selection.</li>
<li>'fise:selected-text': This is the text representing the extracted entity -
the section of the text the entity was suggested for. The 'fise:selected-text'
MUST BE contained within the 'fise:selection-context', so user interfaces that
want to highlight the selected part of the context can search
the selection context for the selected text. In case of multiple matches it is
typically sufficient to highlight all occurrences.</li>
@@ -171,12 +171,12 @@
<p><img alt="Occurrence based Annotation UI"
src="enhancer/hallo-annotate_scrrenshot.png" title="hallo.js with the
annotate.js plugin used to implement a text occurrence based annotation UI"
/></p>
<p>To implement user interfaces like that one needs to (1) show occurrences of
extracted features within the text and (2) let the user interact with suggested
entities.</p>
<h3 id="visualise-occurrences-of-extracted-features">Visualise occurrences of
extracted features</h3>
-<p>Occurrences of extracted features are represented by instances of the
concept '<a
href="/enhancer/enhancementstructure.html#fisetextannotation">fise:TextAnnotation</a>'.
The next figure shows how TextAnnotations describe the occurrence of a
recognized feature in the parsed text.</p>
+<p>Occurrences of extracted features are represented by instances of the
concept '<a
href="enhancer/enhancementstructure.html#fisetextannotation">fise:TextAnnotation</a>'.
The next figure shows how TextAnnotations describe the occurrence of a
recognized feature in the parsed text.</p>
<p><img alt="'fise:TextAnnotation'" src="enhancer/es_textannotation.png"
title="This figure shows a TextAnnotation describing the occurrence of
&quot;Bob Marley&quot; located from character 59 to 69 in the given text" /></p>
<p>Applications that want to visualize extracted features will need to
follow/implement the following steps:</p>
<p>Typically the following steps are required to correctly show extracted
features within the content.</p>
<ol>
-<li>Query for/iterate over '<a
href="/enhancer/enhancementstructure.html#fisetextannotation">fise:TextAnnotation</a>'s
of the enhancement results.<ul>
+<li>Query for/iterate over '<a
href="enhancer/enhancementstructure.html#fisetextannotation">fise:TextAnnotation</a>'s
of the enhancement results.<ul>
<li>it is important to only use TextAnnotations that define a
'fise:selected-text' property. TextAnnotations that do not define this property
usually select whole sections or even the document as a whole. While such
TextAnnotations are important (e.g. for annotating the language of the text),
they are of no interest for this use case and therefore need to be ignored.</li>
</ul>
</li>
@@ -212,9 +212,9 @@
<li>Applications that want to differentiate between different types of
extracted entities (e.g. applying different stylesheets for persons,
organizations and places) can use the value of the 'dc:type' for that purpose.
See the section for <a href="#fisetextannotation">fise:TextAnnotation</a> for
detailed information. </li>
</ul>
<h3 id="interact-with-suggested-entities">Interact with suggested entities</h3>
-<p>This section explains how users might want to interact with
extracted/suggested entities. Extracted entities are represented by '<a
href="/enhancer/enhancementstructure.html#fiseentityannotation">fise:EntityAnnotation</a>'s.
Those EntityAnnotations are linked with the <a
href="/enhancer/enhancementstructure.html#fisetextannotation">TextAnnotation</a>
(occurrences) and to the entity of the used knowledge base. The following
figure shows an example for an EntityAnnotation that suggests the entity <a
href="http://dbpedia.org/resource/Bob_Marley">'dbpedia:Bob_Marley'</a> for the
TextAnnotation used in the example of the previous section.</p>
+<p>This section explains how users might want to interact with
extracted/suggested entities. Extracted entities are represented by '<a
href="enhancer/enhancementstructure.html#fiseentityannotation">fise:EntityAnnotation</a>'s.
Those EntityAnnotations are linked with the <a
href="enhancer/enhancementstructure.html#fisetextannotation">TextAnnotation</a>
(occurrences) and to the entity of the used knowledge base. The following
figure shows an example for an EntityAnnotation that suggests the entity <a
href="http://dbpedia.org/resource/Bob_Marley">'dbpedia:Bob_Marley'</a> for the
TextAnnotation used in the example of the previous section.</p>
<p><img alt="'fise:EntityAnnotation' example"
src="enhancer/es_entityannotation.png" title="This example shows an
EntityAnnotation that suggests the entity 'dbpedia:Bob_Marley' for the
TextAnnotation" /></p>
-<p>The main purpose of <a
href="/enhancer/enhancementstructure.html#fiseentityannotation">EntityAnnotation</a>s
is to suggest entities (e.g. <a
href="http://dbpedia.org/resource/Bob_Marley">'dbpedia:Bob_Marley'</a>) for
mentions within natural language texts. While the above example (to keep it
simple) shows only a single suggestion, in practice one needs to distinguish
between three different cases - that also imply different interaction needs for
users:</p>
+<p>The main purpose of <a
href="enhancer/enhancementstructure.html#fiseentityannotation">EntityAnnotation</a>s
is to suggest entities (e.g. <a
href="http://dbpedia.org/resource/Bob_Marley">'dbpedia:Bob_Marley'</a>) for
mentions within natural language texts. While the above example (to keep it
simple) shows only a single suggestion, in practice one needs to distinguish
between three different cases - that also imply different interaction needs for
users:</p>
<ol>
<li><strong>No suggestion</strong>: This indicates that a named entity was
recognized during natural language processing, but no matching entity was found
within the knowledge base. In this case users might want to<ul>
<li>manually search the knowledge base for an entity. The Apache Stanbol
Entityhub sites endpoint can be used to implement this feature by sending a
"GET http://{host}:{port}/entityhub/sites/find?name={name}" (see the WebUI of
your Stanbol instance for the detailed documentation).</li>
@@ -235,7 +235,7 @@
</li>
</ol>
<p>The required data for the described interaction patterns are available
within the enhancement results as follows:</p>
-<p>The following assumes {text-annotation} - the URI of the current '<a
href="/enhancer/enhancementstructure.html#fisetextannotation">fise:TextAnnotation</a>'
- as context</p>
+<p>The following assumes {text-annotation} - the URI of the current '<a
href="enhancer/enhancementstructure.html#fisetextannotation">fise:TextAnnotation</a>'
- as context</p>
<ol>
<li>Query for/iterate over all entity suggestions: The suggestions for
{text-annotation} can be acquired by using "?entityAnnotation dc:relation
{text-annotation}"<ul>
<li>only results with the 'rdf:type' 'fise:EntityAnnotation' should be
processed. However, typically all results will be of that type anyway.</li>