Author: buildbot
Date: Wed Jul 4 07:35:18 2012
New Revision: 824416
Log:
Staging update by buildbot for stanbol
Modified:
websites/staging/stanbol/trunk/content/ (props changed)
websites/staging/stanbol/trunk/content/stanbol/docs/trunk/enhancementusage.html
Propchange: websites/staging/stanbol/trunk/content/
------------------------------------------------------------------------------
--- cms:source-revision (original)
+++ cms:source-revision Wed Jul 4 07:35:18 2012
@@ -1 +1 @@
-1357117
+1357123
Modified:
websites/staging/stanbol/trunk/content/stanbol/docs/trunk/enhancementusage.html
==============================================================================
---
websites/staging/stanbol/trunk/content/stanbol/docs/trunk/enhancementusage.html
(original)
+++
websites/staging/stanbol/trunk/content/stanbol/docs/trunk/enhancementusage.html
Wed Jul 4 07:35:18 2012
@@ -82,7 +82,7 @@
</div>
<div id="content">
<h1 class="title">Making use of Apache Stanbol Enhancements</h1>
- <p>This document describes how to implement client side, i.e. user
interface components by using the <a
href="enhancer/engines/enhancementstructure.html">enhancement results</a>
returned by the <a href="enhancer">Apache Stanbol Enhancer</a>. It does so by
using three different scenarios:</p>
+ <p>This document describes how to implement client side, i.e. user
interface components by using the <a
href="enhancer/enhancementstructure.html">enhancement results</a> returned by
the <a href="enhancer">Apache Stanbol Enhancer</a>. It does so by using three
different scenarios:</p>
<ul>
<li><strong>Entity Tagging</strong> - replacing text based tags such as "Bob
Marley" with entities - <a
href="http://dbpedia.org/resource/Bob_Marley">dbpedia:Bob_Marley</a> - to improve
content search and categorization. As added value this can also be used for
mashups with already available information about linked entities and search
engine optimization by <a
href="http://schema.org/docs/datamodel.html">including metadata</a> of tagged
entities within the content.</li>
<li><strong>Entity Disambiguation</strong> - enhance the entity tagging
experience with explicit support for disambiguation between different suggested
entities. This allows users to explicitly link to Paris (Texas), Bob Marley
(Comedian) or any other entity whose label is shared with other entities.</li>
@@ -93,7 +93,7 @@
<p>Entity tagging is about suggesting defined entities, instead of plain
strings, to users for tagging their documents. The difference is easy to
explain. Let's assume a blogger uses the tag "Bob Marley" to tag a blog entry.
Tagging is all about structuring content. By tagging it with "Bob Marley" she
can easily find all documents that use that tag. However, most likely she would
also want to create a category of documents about reggae music and would like
documents tagged with "Bob Marley" to be part of that group. </p>
<p>While the knowledge that "Bob Marley" is related to reggae music might be
obvious for the blogger as a person it can not be known by the blogging tool
she uses. Typically the only way to solve this is that the blogger tags the
document with both tags.</p>
<p>Entity tagging tries to work around that by linking documents with entities
defined by a knowledge base. The fact that Bob Marley is related to reggae
music is nothing novel. <a href="http://dbpedia.org">DBpedia</a>, the Wikipedia
database, does know that and a lot more about the entity <a
href="http://dbpedia.org/resource/Bob_Marley">dbpedia:Bob_Marley</a>. If the blogger
tags her document with "dbpedia:Bob_Marley", she does not only tag it with "Bob
Marley" but also with all the other contextual information provided by DBpedia
- including the fact that Bob Marley was a reggae artist.</p>
-<p>But this does not only work with famous people, big cities, etc. Nowadays
the Web <a href="http://linked-data.org">links data</a> of different domains.
However, this is not only about the Web - it works even better if you use
entities relevant to yourself and/or your working environment (products,
articles, customers, etc).</p>
+<p>But this does not only work with famous people, big cities, etc. Nowadays
the Web <a href="http://linkeddata.org">links data</a> of different domains.
However, this is not only about the Web - it works even better if you use
entities relevant to yourself and/or your working environment (products,
articles, customers, etc).</p>
<h3 id="suggest-entities-with-the-apache-stanbol-enhancer">Suggest entities
with the Apache Stanbol Enhancer</h3>
<p>Requesting the Apache Stanbol Enhancer to analyze a text requires sending a
POST request as defined by the <a href="enhancer/enhancerrest.html">RESTful
API</a>.</p>
<div class="codehilite"><pre><span class="n">curl</span> <span
class="o">-</span><span class="n">X</span> <span class="n">POST</span> <span
class="o">-</span><span class="n">H</span> <span class="s">"Accept:
application/rdf+xml"</span> <span class="o">-</span><span
class="n">H</span> <span class="s">"Content-type: text/plain"</span>
<span class="o">\</span>
@@ -125,25 +125,25 @@
<li>Dereferencing suggested entities: If the suggested entity is available via
the Apache Stanbol Entityhub the {entity-annotation} does have the
'entityhub:site' property. The value of this property is the name of the
referenced site of the Entityhub. To dereference the entity a GET request to
"{stanbol-root-URL}/entityhub/site/{site-name}/entity?id={entity}" needs to be
used. The "Accept" header of the request needs to be set to the desired RDF
serialization (e.g. "application/rdf+json").</li>
</ol>
<h3 id="process-content-categorizations">Process Content Categorizations</h3>
-<p>'<a
href="enhancer/enhancementstructure.html#fisetopicannotation">fise:TopicAnnotation</a>'
instances are used to formally represent categories assigned to the parsed
Content. The main difference between extracted entities and assigned categories
is that extracted entities do have one or more explicit mentions within the
text while assigned categories are suggested based on the document as a whole -
typically they are not explicitly mentioned in the text.</p>
+<p><a
href="enhancer/enhancementstructure.html#fisetopicannotation">fise:TopicAnnotation</a>
instances are used to formally represent categories assigned to the parsed
Content. The main difference between extracted entities and assigned categories
is that extracted entities do have one or more explicit mentions within the
text while assigned categories are suggested based on the document as a whole -
typically they are not explicitly mentioned in the text.</p>
<p>Typically, an entity tagging UI will want to distinguish between categories
and entities because:</p>
<ul>
<li>categories are used to group content (e.g. blog posts about work and
private things)</li>
<li>entities are used to search/suggest blog posts about specific topics (e.g.
a blog about some feature implemented for "Apache Solr", a nice event in the
"Sternbräu" in "Salzburg")</li>
</ul>
-<p>The usage of '<a
href="enhancer/enhancementstructure.html#fisetopicannotation">fise:TopicAnnotation</a>'
is similar to an EntityAnnotation. Both annotation types use the exact same
properties ('fise:entity-reference', 'fise:entity-label', 'fise:entity-type',
'fise:confidence', 'entityhub:site'). The only difference is that one needs to
iterate over '{topic-annotation} rdf:type fise:TopicAnnotation'. So typically
clients will want to use the exact same code to process {entity-annotation} and
{topic-annotation} instances.</p>
+<p>The usage of <a
href="enhancer/enhancementstructure.html#fisetopicannotation">fise:TopicAnnotation</a>
is similar to an EntityAnnotation. Both annotation types use the exact same
properties ('fise:entity-reference', 'fise:entity-label', 'fise:entity-type',
'fise:confidence', 'entityhub:site'). The only difference is that one needs to
iterate over '{topic-annotation} rdf:type fise:TopicAnnotation'. So typically
clients will want to use the exact same code to process {entity-annotation} and
{topic-annotation} instances.</p>
<p>The next section describes an improved version of entity tagging
that allows users to: (1) accept/decline a spotted entity and then
(2) select one of several suggested entities.</p>
<h2 id="entity-tagging-with-disambiguation-support">Entity tagging with
disambiguation support</h2>
<p>Entity disambiguation is required if an entity detected in the analyzed
text can refer to different entities. The following figure shows an example
where "Bob Marley" is detected as a person in the text; however, there are two
possible matches within the controlled vocabulary.</p>
<p><img alt="Entity Disambiguation" src="enhancer/es_entitydisambiguation.png"
title="Bob Marley as spotted in the text may refer to two different
persons in DBpedia.org" /></p>
-<p>The fact that one entity detected in the text - represented by a '<a
href="enhancer/enhancementstructure.html#fisetextannotation">fise:TextAnnotation</a>' -
may have multiple suggested entities - represented by the two
'fise:EntityAnnotation's - has a negative impact on <a
href="#entity-tagging">entity tagging</a> interfaces that suggest tags based on
'fise:EntityAnnotation's. This is because such an interface would show in the
above case two suggestions: (1) for <a
href="http:dbpedia.org/resource/Bob_Marley">'dbpedia:Bob_Marley'</a> and (2)
for <a
href="http://dbpedia.org/resource/Bob_Marley_%28comedian%29">dbpedia:Bob_Marley_(comedian)</a>.
So even if the user wants to tag this content with "Bob Marley", she will need
to reject at least one of the two suggestions.</p>
+<p>The fact that one entity detected in the text - represented by a '<a
href="enhancer/enhancementstructure.html#fisetextannotation">fise:TextAnnotation</a>' -
may have multiple suggested entities - represented by the two
'fise:EntityAnnotation's - has a negative impact on <a
href="#entity-tagging-use-tags-to-relate-you-content-to-persons-places-events">entity
tagging</a> interfaces that suggest tags based on 'fise:EntityAnnotation's.
This is because such an interface would show in the above case two suggestions:
(1) for <a
href="http://dbpedia.org/resource/Bob_Marley">'dbpedia:Bob_Marley'</a> and (2)
for <a
href="http://dbpedia.org/resource/Bob_Marley_%28comedian%29">dbpedia:Bob_Marley_(comedian)</a>.
So even if the user wants to tag this content with "Bob Marley", she will need
to reject at least one of the two suggestions.</p>
<p>Adding explicit support for entity disambiguation to an entity tagging user
interface can solve this problem by grouping suggested entities by the
'fise:TextAnnotation's they are suggested for. </p>
<h3 id="grouping-suggested-entities">Grouping suggested Entities</h3>
<p>The goal of an entity tagging UI with disambiguation support is to show
only a single tag suggestion for all entities suggested for the same section in
the analyzed text. To solve this, we need to follow the link between
'fise:EntityAnnotation' and 'fise:TextAnnotation'.</p>
<p>There are several options on how to achieve this. We present a solution
that iterates over the 'fise:EntityAnnotation's.</p>
<ol>
<li>Iterate over all '<a
href="enhancer/enhancementstructure.html#fiseentityannotation">fise:EntityAnnotation</a>'
instances. This refers to all resources such as "{entity-annotation} rdf:type
fise:EntityAnnotation". <ul>
-<li>For more information on how to collect information for extracted entities
see the <a href="#process-suggested-entities">corresponding section</a> in the <a
href="#entity-tagging">entity tagging</a> interface.</li>
+<li>For more information on how to collect information for extracted entities
see the <a href="#process-suggested-entities">corresponding section</a> in the <a
href="#entity-tagging-use-tags-to-relate-you-content-to-persons-places-events">entity
tagging</a> interface.</li>
</ul>
</li>
<li>Retrieve the '<a
href="enhancer/enhancementstructure.html#fisetextannotation">fise:TextAnnotation</a>'
referenced by processed 'fise:EntityAnnotation's. For this, we retrieve the
value(s) of the 'dc:relation' property.</li>
@@ -152,7 +152,7 @@
</ul>
</li>
<li>Suggest tags based on 'fise:TextAnnotation's - keys in the mapping created
in step (3).<ul>
-<li>Allow users to easily accept the Entity with the highest rank - <a
href="http:dbpedia.org/resource/Bob_Marley">'dbpedia:Bob_Marley'</a> in the
above example. This is especially useful if the confidence of the first
suggestion is high (e.g. >= 0.8) and considerably higher than the confidence
values of other options.</li>
+<li>Allow users to easily accept the Entity with the highest rank - <a
href="http://dbpedia.org/resource/Bob_Marley">'dbpedia:Bob_Marley'</a> in the
above example. This is especially useful if the confidence of the first
suggestion is high (e.g. >= 0.8) and considerably higher than the confidence
values of other options.</li>
<li>Provide users with the possibility to inspect further suggested options -
to disambiguate between different options.</li>
</ul>
</li>
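The grouping steps in this hunk could be sketched roughly as follows. This is a minimal Python sketch under stated assumptions, not Stanbol client code: it assumes the enhancement results were already parsed into plain dictionaries, and the keys "entity", "confidence" and "relations" as well as the function name group_by_text_annotation are hypothetical stand-ins for 'fise:entity-reference', 'fise:confidence' and 'dc:relation'. The confidence values and the "urn:text-annotation:1" identifier are made up for illustration.

```python
from collections import defaultdict


def group_by_text_annotation(entity_annotations):
    """Map each TextAnnotation id to its suggested entities, best first."""
    groups = defaultdict(list)
    for annotation in entity_annotations:
        # "relations" holds the value(s) of dc:relation (assumed key name)
        for text_annotation in annotation["relations"]:
            groups[text_annotation].append(annotation)
    for suggestions in groups.values():
        # Highest confidence first, so index 0 is the default suggestion
        suggestions.sort(key=lambda a: a["confidence"], reverse=True)
    return dict(groups)


# Illustrative data only: two suggestions for the same spotted mention
annotations = [
    {
        "entity": "http://dbpedia.org/resource/Bob_Marley_%28comedian%29",
        "confidence": 0.4,
        "relations": ["urn:text-annotation:1"],
    },
    {
        "entity": "http://dbpedia.org/resource/Bob_Marley",
        "confidence": 0.9,
        "relations": ["urn:text-annotation:1"],
    },
]

grouped = group_by_text_annotation(annotations)
best = grouped["urn:text-annotation:1"][0]["entity"]
print(best)
```

A UI built on such a mapping would show one tag suggestion per key and offer the remaining list entries as alternatives for disambiguation.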
@@ -212,9 +212,9 @@
<li>Applications that want to differentiate between different types of
extracted entities (e.g. applying different stylesheets for persons,
organizations and places) can use the value of the 'dc:type' for that purpose.
See the section for <a href="#fisetextannotation">fise:TextAnnotation</a> for
detailed information. </li>
</ul>
<h3 id="interact-with-suggested-entities">Interact with suggested entities</h3>
-<p>This section explains how users might want to interact with
extracted/suggested entities. Extracted entities are represented by '<a
href="enhancer/enhancementstructure.html#fiseentityannotation">fise:EntityAnnotation</a>'s.
Those EntityAnnotations are linked with the <a
href="enhancer/enhancementstructure.html#fisetextannotation">TextAnnotation</a>
(occurrences) and to the entity of the used knowledge base. The following
figure shows an example for an EntityAnnotation that suggests the entity <a
href="http:dbpedia.org/resource/Bob_Marley">'dbpedia:Bob_Marley'</a> for the
TextAnnotation used in the example of the previous section.</p>
+<p>This section explains how users might want to interact with
extracted/suggested entities. Extracted entities are represented by '<a
href="enhancer/enhancementstructure.html#fiseentityannotation">fise:EntityAnnotation</a>'s.
Those EntityAnnotations are linked with the <a
href="enhancer/enhancementstructure.html#fisetextannotation">TextAnnotation</a>
(occurrences) and to the entity of the used knowledge base. The following
figure shows an example for an EntityAnnotation that suggests the entity <a
href="http://dbpedia.org/resource/Bob_Marley">'dbpedia:Bob_Marley'</a> for the
TextAnnotation used in the example of the previous section.</p>
<p><img alt="'fise:EntityAnnotation' example"
src="enhancer/es_entityannotation.png" title="This example shows an
EntityAnnotation that suggests the entity 'dbpedia:Bob_Marley' for the
TextAnnotation" /></p>
-<p>The main purpose of <a
href="enhancer/enhancementstructure.html#fiseentityannotation">EntityAnnotation</a>s
is to suggest entities (e.g. <a
href="http:dbpedia.org/resource/Bob_Marley">'dbpedia:Bob_Marley'</a>) for
mentions within natural language texts. While the above example (to keep it
simple) shows only a single suggestion, in practice one needs to distinguish
between three different cases - that also imply different interaction needs for
users:</p>
+<p>The main purpose of <a
href="enhancer/enhancementstructure.html#fiseentityannotation">EntityAnnotation</a>s
is to suggest entities (e.g. <a
href="http://dbpedia.org/resource/Bob_Marley">'dbpedia:Bob_Marley'</a>) for
mentions within natural language texts. While the above example (to keep it
simple) shows only a single suggestion, in practice one needs to distinguish
between three different cases - that also imply different interaction needs for
users:</p>
<ol>
<li><strong>No suggestion</strong>: This indicates that a named entity was
recognized during natural language processing, but no matching entity was found
within the knowledge base. In this case users might want to<ul>
<li>manually search the knowledge base for an entity. The Apache Stanbol
Entityhub sites endpoint can be used to implement this feature by sending a
"GET http://{host}:{port}/entityhub/sites/find?name={name}" (see the WebUI of
your Stanbol instance for the detailed documentation).</li>
@@ -245,7 +245,7 @@
<li>Visualize suggestions: EntityAnnotations do provide some basic information
about the suggested entity that can be used for visualization. Most importantly,
the URI of the suggested entity is available as the value of
'fise:entity-reference'. Additionally, the label and the types of the entity
are included.</li>
<li>Retrieving additional information about referenced entities: While the
EntityAnnotation includes some basic information some users might want to
retrieve all available information of referenced entities - to dereference the
entity:<ul>
<li>As this is a rather common use case the <a href="">EntityLinkingEngine</a>
and <a href="">KeywordLinkingEngine</a> are by default configured to include
information of Entities within the EnhancementResults. So users that use those
EnhancementEngines will not need to dereference Entities as that information
is already available within the enhancement results.</li>
-<li>If a 'fise:EntityAnnotation' has the 'entityhub:site' property, entities
can be dereferenced by using the Apache Stanbol Entityhub (see the section for
<a href="#fiseentityannotation">fise:EntityAnnotation</a> for details)</li>
+<li>If a 'fise:EntityAnnotation' has the 'entityhub:site' property, entities
can be dereferenced by using the Apache Stanbol Entityhub (see the section for
<a
href="enhancer/enhancementstructure.html#fiseentityannotation">fise:EntityAnnotation</a>
for details)</li>
<li>In all other cases the URI of the suggested entity needs to be used for
dereferencing. If the referenced entity is part of the <a
href="http://linkeddata.org/">linked data</a> cloud, this is often possible by
following the <a href="http://www.w3.org/TR/cooluris/">CoolURI</a> conventions -
basically sending a "GET {entity-uri}" request with the header
"Accept: application/rdf+json".</li>
</ul>
</li>
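The Entityhub dereferencing request described in the documentation text above could be built roughly like this. This is a minimal Python sketch that only constructs the request and does not send it; the function name entityhub_request, the root URL "http://localhost:8080" and the site name "dbpedia" are assumptions for illustration, while the URL pattern and the "Accept" value are taken from the text.

```python
from urllib.parse import quote
from urllib.request import Request


def entityhub_request(stanbol_root, site, entity_uri,
                      accept="application/rdf+json"):
    """Build (but do not send) a GET request that dereferences an entity."""
    # Pattern from the text:
    # {stanbol-root-URL}/entityhub/site/{site-name}/entity?id={entity}
    url = "{}/entityhub/site/{}/entity?id={}".format(
        stanbol_root.rstrip("/"), site, quote(entity_uri, safe=""))
    return Request(url, headers={"Accept": accept}, method="GET")


# Assumed local Stanbol instance and referenced site name
req = entityhub_request("http://localhost:8080", "dbpedia",
                        "http://dbpedia.org/resource/Bob_Marley")
print(req.get_method(), req.full_url)
```

Passing the resulting object to urllib.request.urlopen would perform the actual call. The site name should come from the 'entityhub:site' value of the annotation rather than being hard-coded.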