svn commit: r819826 - in /websites/staging/stanbol/trunk/content: ./ stanbol/docs/trunk/enhancer/

buildbot Fri, 01 Jun 2012 00:03:26 -0700

Author: buildbot
Date: Fri Jun  1 07:02:53 2012
New Revision: 819826

Log:
Staging update by buildbot for stanbol


Added:
    websites/staging/stanbol/trunk/content/stanbol/docs/trunk/enhancer/es_entityannotation.png   (with props)
    websites/staging/stanbol/trunk/content/stanbol/docs/trunk/enhancer/es_textannotation.png   (with props)
Modified:
    websites/staging/stanbol/trunk/content/   (props changed)
    websites/staging/stanbol/trunk/content/stanbol/docs/trunk/enhancer/contentitemfactory.html
    websites/staging/stanbol/trunk/content/stanbol/docs/trunk/enhancer/enhancementstructure.html

Propchange: websites/staging/stanbol/trunk/content/
------------------------------------------------------------------------------
--- cms:source-revision (original)
+++ cms:source-revision Fri Jun  1 07:02:53 2012
@@ -1 +1 @@
-1344659
+1344995

Modified: websites/staging/stanbol/trunk/content/stanbol/docs/trunk/enhancer/contentitemfactory.html
==============================================================================
--- websites/staging/stanbol/trunk/content/stanbol/docs/trunk/enhancer/contentitemfactory.html (original)
+++ websites/staging/stanbol/trunk/content/stanbol/docs/trunk/enhancer/contentitemfactory.html Fri Jun  1 07:02:53 2012
@@ -139,7 +139,7 @@
 <ul>
 <li>StreamSource: A ContentSource wrapping an InputStream. Multiple calls to 
#getStream() are not be supported and will cause IllegalStateExceptions. Calls 
to #getData() will load the contents of the stream to an in memory.</li>
 <li>ByteArraySource: A ContentSource implementation that uses a byte array to 
store represent the content. All constructors take the byte array representing 
the content as parameter. Calls to #getData() MUST NOT copy the byte array to 
avoid duplications.</li>
-<li>StringSource: A ContentSource implementation that directly allows to parse 
a String instance. The constructors convert the passed String to an byte array 
by using the passed Charset. UTF-8 is used as default. This implementation is 
based on the ByteArraySource.</li>
+<li>StringSource: A ContentSource implementation that directly allows to parse 
a String instance. The constructors convert the passed String to an byte array 
by using the passed Charset. UTF-8 is used as default. This implementation is 
based on the ByteArraySource. </li>
 </ul>
 <h3 id="contentreference">ContentReference</h3>
 <p>This interface allows to describe content that is not yet locally 
available. The Stanbol Enhancer will dereference the content when automatically 
when needed.</p>

Modified: websites/staging/stanbol/trunk/content/stanbol/docs/trunk/enhancer/enhancementstructure.html
==============================================================================
--- websites/staging/stanbol/trunk/content/stanbol/docs/trunk/enhancer/enhancementstructure.html (original)
+++ websites/staging/stanbol/trunk/content/stanbol/docs/trunk/enhancer/enhancementstructure.html Fri Jun  1 07:02:53 2012
@@ -78,210 +78,7 @@
   
   <div id="content">
     <h1 class="title">Stanbol Enhancement Structure</h1>
-    <p>This document specifies the Structure used by the Stanbol Enhancer 
encodes features extracted form the parsed <a 
href="contentitem.html">ContentItem</a>. The Enhancement Structure is based on 
<a href="http://www.w3.org/TR/rdf-primer/">RDF</a> technology and defined as <a 
href="http://www.w3.org/2004/OWL/">OWL</a> ontology. </p>
-<p>Its two main purposes are to facilitate the:</p>
-<ol>
-<li>Interoperability between EnhancementEngines: The design of the Stanbol 
Enhancer is based on the processing of an <a 
href="contentitem.html">ContentItem</a> by multiple <a 
href="engines">EnhancementEngine</a>s in an <a 
href="chains">EnhancementChain</a>. Together with the ContentItem API the 
EnhancementStructure is the key enabler for the cooperation of the different 
engines. It ensures that enhancements created by one engine can be consumed by 
the following engines (e.g. the first engine detects the language of the parsed 
text; the second consumes the language to select the correct NER (named entity 
recognition) model and create enhancements describing Named Entities contained 
in the text; the third Engine consumes those Named Entity annotations and 
creates suggestions for Entities part of an controlled vocabulary).</li>
-<li>
-<p>Consumption of extracted Features: The knowledge structure standardized by 
this Ontology aims to allow users to consume/process the features extracted 
from the parsed content. This includes things like:</p>
-<ul>
-<li>list all suggested Entities (accept/reject Tags)</li>
-<li>list all suggested Topics (content classification)</li>
-<li>group Entity suggestion based on detected "Named Entities" (disambiguation 
support)</li>
-<li>show the occurrence of detected Entities within the analyzed text (similar 
to spell checker UIs)</li>
-</ul>
-<p>The last section of this document provides a more detailed look at those 
usage scenarios.</p>
-</li>
-</ol>
-<p>This document follows a similar structure. While the first part goes into 
the detail of the Stanbol Enhancement Structure as integral part of the Stanbol 
Enhancer the second part focuses on the consumption of the 
EnhancementResults.</p>
-<p>While the first part is intended to be read by Developers that want to 
extend the Stanbol Enhancer (e.g. implement there own <a 
href="engines">EnhancementEngine</a>s) the target audience of the second part 
are typical users of the Stanbol Enhancer.</p>
-<h1 id="part-1-the-stanbol-enhancement-structure">PART 1: The Stanbol 
Enhancement Structure</h1>
-<p>The Stanbol Enhancement Structure is a central part of the <a 
href="index.html">Stanbol Enhancer</a> architecture as it represents the 
binding element between the <a href="contentitem.html">ContentItem</a> analyzed 
by the the <a href="engines">EnhancementEngine</a>s as configured by an <a 
href="chains">EnhancementChain</a>. Together with the <a href="content 
item.html#content-parts">ContentParts</a> it represents the state that is 
constantly updated during the enhancement process.</p>
-<p>The following graphic provides an overview on how the EnhancementStructure 
is used by the Stanbol Enhancer to formally represent the enhancement 
results.</p>
-<p><img alt="EnhancementStructure Overview" src="enhancementstructure.png" 
title="Overview of the Stanbol Enhancement Structure showing 'Bob Marley' 
recognized as Person within the parsed Text with two suggested Entities 'Bob 
Marley' the musician and 'Bob Marley' the comedian" /></p>
-<p>The above figure shows </p>
-<ul>
-<li>A <a href="contentitem.html">ContentItem</a> with a single plain text <a 
href="content item.html#content-parts">ContentParts</a> containing the text 
"Apache Stanbol can detect famous entities such as Paris or Bob Marley!"</li>
-<li>Three Enhancements: One TextAnnotation describing "Bob Marley" as 
Named-Entity as extracted by the NER (NamedEntityRecognition) engine and two 
EntityAnnotation that suggest different Entities from <a 
href="http://dbpedia.org">DBpedia.org</a>.</li>
-<li>Two referenced Entities: Both <a 
href="http://dbpedia.org/resource/Bob-Marley">dbpedia:Bob_Marley</a> and <a 
href="http://dbpedia.org/resource/Bob_Marley_%28comedian%29">dbpedia:Bob_Marley_(comedian)</a>
 are part of <a href="http://dbpedia.org">DBpedia.org</a> and referenced by 
fise:EntityAnnotations created by instance of the the <a 
href="engines/namedentitytaggingengine.html">NamedEntityLinging engine</a> 
configured to link with <a href="http://dbpedia.org">DBpedia.org</a></li>
-<li>An <a href="chains">EnhancementChain</a> with four <a 
href="engines">EnhancementEngine</a>s. However only the enhancements of the 
later two are shown in the figure.</li>
-</ul>
-<p>The bold relations within the figure are central as they show how the 
EnhancementStructure is used to formally specify that the mention "Bob Marley" 
within the analyzed text is believed to represent the Entity <a 
href="http://dbpedia.org/resource/Bob-Marley">dbpedia:Bob_Marley</a>. However 
it is also stated that there is a disambiguation with an other person <a 
href="http://dbpedia.org/resource/Bob_Marley_%28comedian%29">dbpedia:Bob_Marley_(comedian)</a>.</p>
-<p>The dashed relations are also important as they are used to formally 
describe the extraction context: which EnhancementEngine has extracted a 
feature from what ContentItem. If even more contextual information are needed, 
users can combine those information with the <a 
href="executionmetadata.html">ExecutionMetadata</a> collected during the 
enhancement process.</p>
-<h2 id="general-information">General Information</h2>
-<p><strong>Used Namespaces</strong></p>
-<p>This provides the list of namespaces used/referenced by the Enhancement 
Structure</p>
-<ul>
-<li><strong>fise</strong> (<em>http://fise.iks-project.eu/ontology/</em>): 
This is the main namespace of the currently used Enhancement Structure. All 
custom concepts and properties are defined using this namespace. (*)</li>
-<li><strong>enhancer</strong> 
(<em>http://stanbol.apache.org/ontology/enhancer/enhancer#</em>): This is the 
main namespace of the Stanbol Enhancer defining concepts such as ContentItem, 
EnhancementEngine, EnhancementChain …</li>
-<li>
-<dl>
-<dt><strong>entityhub</strong> 
(<em>http://stanbol.apache.org/ontology/entityhub/entityhub#</em>)</dt>
-<dd>This is the main namespace of the Stanbol Entityhub component. </dd>
-</dl>
-</li>
-<li><strong>dc</strong> (<em>http://purl.org/dc/terms/</em>): The Dublin Core 
terms standard is also heavily used by the Stanbol Enhancement Structure. 
Especially to encode metada data, but also to encode relations between 
extracted information (fise:Enhancement's)</li>
-<li><strong>dppedia-ont</strong> (<em>http://dbpedia.org/ontology/</em>): 
Concepts of this Ontology are used to describe the types of "Named Entities" 
detected in parsed content.</li>
-<li><strong>skos</strong> (<em>http://www.w3.org/2004/02/skos/core#</em>): The 
SKOS standard is preferable used to describe entries of Thesauri or more 
generally any type of controlled vocabularies.</li>
-<li><strong>rdf</strong> 
(<em>http://www.w3.org/1999/02/22-rdf-syntax-ns#</em>)</li>
-<li>in addition <a href="engines">EnhancementEngine</a>s are free to add/use 
properties of any additional Ontology (e.g. when adding the rdf:type's of 
suggested Entities).</li>
-</ul>
-<p><em>(*) Historical side note: FISE was the name of the Stanbol Enhancer 
before its <a href="TODO: add link">incubation to Apache</a>. The Enhancement 
Structure does still use the original namespace for compatibility 
reasons.</em></p>
-<p><strong>About Expressiveness:</strong></p>
-<p>All Stanbol Ontologies are encoded using OWL but restrict itself to basic 
features. Users need to be aware that not all rules defined in this 
documentation are formally expressed within the Ontology. However all the 
stated rules are validated by the <a 
href="http://svn.apache.org/repos/asf/incubator/stanbol/trunk/enhancer/generic/test/src/main/java/org/apache/stanbol/enhancer/test/helper/EnhancementStructureHelper.java">EnhancementStructureHelper</a>
 UnitTest utility part of the "org.apache.stanbol.enhancer.test" module. This 
ensures that EnhancementEngine implementation that validate there enhancement 
using this utility comply to this specification.</p>
-<p><strong>About Reasoning:</strong></p>
-<p>Apache Stanbol assumes the users will have no reasoning support. Because of 
that EnhancementEngines are required to materialize information that would be 
otherwise only available by reasoning (e.g. it is required that they add both 
"fise:TextAnnotation" and "fise:Enhancement" as "rdf:type"s when writing a 
TextAnnotation).</p>
-<h2 id="core-concepts">Core Concepts</h2>
-<p>The main concept of the Stanbol Enhancement Structure is the 
"fise:Enhancement". It is used as base concept for all annotation types and 
defines the generic properties every enhancement MUST provide (e.g. creator, 
creation date, extracted-from, confidence). On top of the "fise:Enhancement" 
three specific annotations types are defined:</p>
-<ul>
-<li>TextAnnotation: To describe features with there occurrence within the 
parsed Text</li>
-<li>EntityAnnotation: To suggest (linked) Entities with features detected 
within the content</li>
-<li>TopicAnnotation: To classify (link) the parsed content along topics</li>
-</ul>
-<h3 id="fiseenhancement">fise:Enhancement</h3>
-<p>Every feature extracted by an <a href="engines">EnhancementEngine</a> that 
is expressed using the Stanbol Enhancement Structure needs to be represented as 
a RDF resource with the "rdf:type" "fise:Enhancement".</p>
-<p>Enhancements use <a 
href="http://dublincore.org/documents/dcmi-terms/">Dublin Core terms</a> to 
provide metadata about their creation:</p>
-<ul>
-<li><strong>dc:creator</strong> <em>(required, single)</em>: The <a 
href="engines">EnhancementEngine</a> that created the Enhancement. Currently 
the full qualified name of the Java Class implementing the engine is used as 
String values. In future version this will change to the relative URL of the 
EnhancementEngine (e.g. "/enhancer/engine/{engine-name}")</li>
-<li><strong>dc:created</strong> <em>(required, single)</em>: The UTF date/time 
when the enhancement was created by the EnhancementEngine.</li>
-<li><strong>dc:contributor</strong> <em>(optional, multiple)</em>: Additional 
<a href="engines">EnhancementEngine</a> that contributed to the 
Enhancement.</li>
-<li><strong>dc:modified</strong> <em>(optional, single)</em>: The last change 
to a given enhancement.</li>
-</ul>
-<p>The following properties provide information about the enhancement</p>
-<ul>
-<li><strong>fise:extracted-from</strong> <em>(required, single)</em>: The URI 
of the "enhancer:ContentItem" the feature was extracted. EnhancementEngines 
need to use the UriRef returned by ContentItem#getUri() as value.</li>
-<li><strong>fise:confidence</strong> <em>(optional, single, range: 0 &lt;= 
confidence &lt;= 1)</em>: The confidence of the enhancement as floating point 
number. NOTE that while this uses a floating point number as value users should 
not treat values to be on a rational scale - meaning that an enhancement with a 
confidence of 0.4 is NOT half as good as one with 0.8!</li>
-<li><strong>dc:relation</strong> <em>(optional, multiple)</em>: Specifies that 
the current fise:Enhancement has a relation to an other fise:Enhancement. 
Values need to be resources of the "rdf:type" "fise:Enhancement".</li>
-<li><strong>dc:requires</strong> <em>(optional, multiple)</em>: Specifies that 
the current fise:Enhancement depends on an other fise:Enhancement. This is a 
stronger version of using "dc:relation" and should indicate that if one of the 
required enhancements is declined/removed this also affects this one. Values 
need to be resources of the "rdf:type" "fise:Enhancement". NOTE also that 
Dublin Core terms defines dc:requires as an sub-property of dc:relation.</li>
-</ul>
-<h3 id="fisetextannotation">fise:TextAnnotation</h3>
-<p>TextAnnotations are used to select portions parsed textual content by using 
the following properties:</p>
-<ul>
-<li><strong>fise:start</strong> <em>(optional, single)</em>: The start 
character position within the plain text version of the parsed content. Note 
that the plain text version can be retrieved by using the <a 
href="enhancerrest.thml#multi-part-contentitem-support">multi-part content item 
support</a> of the Stanbol Enhancer RESTful API.</li>
-<li><strong>fise:end</strong> <em>(required of fise:start is present, 
single)</em>: The end character position. This MUST only be present of 
"fise:start" is also defined.</li>
-<li><strong>fise:selected-text</strong> <em>(optional, single)</em>: The text 
selected by the TextAnnotation. This MUST be the same as the text from index 
"fise:start" to "fise:end" within the plain text version of the parsed 
content.</li>
-<li><strong>fise:selection-context</strong> <em>(required if 
fise:selected-text is present, single)</em>: The selection context such as the 
current sentence or a fixed number of characters/word before and after the 
selected text. This MUST be present if "fise:selected-text" is defined.</li>
-<li><strong>dc:type</strong> <em>(optional,single)</em>: The nature of the 
selected part of the text (e.g. dbpedia-ont:Person, Organization, 
dbpedia-ont:Place for Named Entities; dc:LinguisticSystem for language 
annotations; skos:Concept for abstract things incl. categorizations). Note that 
dc:type values are just recommendations. Users are free to use different as the 
recommended one. As an example the <a 
href="engines/keywordlinkingengine.html">KeywordLinkingEngine</a> allows users 
to configure dc:type mappings.</li>
-</ul>
-<p>As hinted by the description of the above properties their usage depends on 
the size of the selected part of the text.</p>
-<ul>
-<li>selection of the whole Document: This is the default and MUST BE assumed 
if non of the start/end/selected-text/selection-context properties is 
present</li>
-<li>selection of a part (e.g. chapter, sentence): The preferred way is to 
define start/end positions. selected-text and selection-context are inefficient 
for bigger section as they would duplicate those sections of the content with 
the RDF graph as literals.</li>
-<li>Selection of words, word-phrases: In this case it is highly recommended to 
define start/end as well as selected-text/selection-context. Especially the 
selected-text and selection-context are important to calculate the exact 
position of an enhancement in non-plain-text content (e.g. HTML fragments).</li>
-</ul>
-<p>NOTE: In future version TextAnnotations might switch to a Model that 
uses</p>
-<ul>
-<li>fise:selection-prefix: some words/characters before the selected 
section.</li>
-<li>fise:selection-head: the first few word/characters of a the selected 
section within the text. Alternative to fise:selected-text in case bigger 
sections of the parsed content need to be selected.</li>
-<li>fise:selection-tail: the last few words/characters of a selected section. 
To be used together with fise:selection-head.</li>
-<li>fise:selection-suffix: some words/characters after the selected 
section.</li>
-</ul>
-<h3 id="fiseentityannotation">fise:EntityAnnotation</h3>
-<p>EntityAnnotations are used to suggest/link entities recognized within the 
Text. While fise:TextAnnotations are used for representing the recognition(s) 
(occurrence(s) within the content) the EntityAnnotation provides information 
about the referenced Entity.</p>
-<ul>
-<li><strong>fise:entity-reference</strong> <em>(required, single)</em>: The 
URI of the referenced entity. In cases several URIs are defined as equal (e.g. 
by "owl:sameAs") EnhancementEngines need to choose one of the URIs and include 
the according "owl:sameAs" in the enhancement results</li>
-<li><strong>fise:entity-label</strong> <em>(required, single)</em>: The label 
of the linked entity. While entities may define multiple labels (e.g. for 
different languages, alternate/preferred …) EnhancementEngines are required 
to only include a single - the best fitting - label.</li>
-<li><strong>fise:entity-type</strong> <em>(optional, multiple)</em>: The types 
of the linked entity. Usually this is the list of rdf:types. However there 
might be situations where other Resources are used as types. </li>
-<li><strong>dc:relation</strong> <em>(required, multiple)</em>: The 
dc:relation property is required for entity annotations. Typically values are 
"fise:TextAnnotation"s this EntityAnnotation is a suggestion for.</li>
-<li><strong>entityhub:site</strong> (optional, single)_: The name of the 
Entityhub ReferencedSite managing the the suggested Entity. If this property is 
present users can dereference the suggested Entity with a GET request to 
"{stanbol}/entityhub/site/{site-name}/entity?id={entity}" where {site-name} is 
the value of this property and {entity} is the value of the 
"fise:entity-reference" property. 
-    NOTE: the values "local" and "entityhub" need to be treated separately. In 
those cases the GET request need to use 
"{stanbol}/entityhub/entity?id={entity}".</li>
-</ul>
-<h3 id="fisetopicannotation">fise:TopicAnnotation</h3>
-<p>TopicAnnotation are used to categorize/classify the parsed content along 
some categorization system. This is done by suggesting/linking Topics of that 
categorization system for (possible parts) of the parsed content. A 
"fise:TextAnnotation" is used to select the part of the content where the 
linked topics apply.</p>
-<ul>
-<li><strong>fise:entity-reference</strong> <em>(required, single)</em>: The 
URI of the topic.</li>
-<li><strong>fise:entity-label</strong> <em>(required, single)</em>: The human 
readable label of the topic. While topics may define multiple labels (e.g. for 
different languages) EnhancementEngines are required to only include a single - 
the best fitting - label.</li>
-<li><strong>fise:entity-type</strong> <em>(optional, multiple)</em>: It is 
best practice to use <a href="http://www.w3.org/2004/02/skos/">SKOS</a> for 
modeling hierarchical classification systems. If this recommendation is 
followed than the value of fise:entity-type will be "skos:Concept". However 
users are free to also use different types with "fise:TopicAnnotation"s. </li>
-<li><strong>dc:relation</strong> <em>(required, multiple)</em>: The 
dc:relation property is required for topic annotations. It refers to the 
fise:TextAnnotation specifying the part of the text this topic is applied 
to.</li>
-<li><strong>entityhub:site</strong> (optional, single)_: The name of the 
Entityhub ReferencedSite managing the the suggested Entity. If this property is 
present users can dereference the suggested Entity with a GET request to 
"{stanbol}/entityhub/site/{site-name}/entity?id={entity}" where {site-name} is 
the value of this property and {entity} is the value of the 
"fise:entity-reference" property. 
-    NOTE: the values "local" and "entityhub" need to be treated separately. In 
those cases the GET request need to use 
"{stanbol}/entityhub/entity?id={entity}".</li>
-</ul>
-<h1 id="part-2-using-the-stanbol-enhancement-structure">Part 2: Using the 
Stanbol Enhancement Structure</h1>
-<h2 id="entity-tagging">Entity Tagging</h2>
-<p>TODO: Work in progress</p>
-<h2 id="entity-disambiguation">Entity Disambiguation</h2>
-<p>TODO: Work in progress</p>
-<h2 id="occurrence-based-annotation">Occurrence based Annotation</h2>
-<p>This describes a user interface similar to one of a spell/grammar checker. 
But instead of marking misspelled words entities recognized within the text are 
suggested to the user. The following figure shows such an interface as 
implemented by the <a href="http://hallojs.org">hallo.js</a> combined with the 
<a href="https://github.com/szabyg/annotate.js">annotate.js</a> plugin (see the 
<a href="http://hallojs.org/annotate.html">Demo here</a> <small>(last accessed 
2012-05-30)</small> - click in the Text and press the "annotate" button).</p>
-<p><img alt="Occurrence based Annotation UI" 
src="hallo-annotate_scrrenshot.png" title="hallo.js with the annotate.js plugin 
used to implement an text occurrence based annotation UI" /></p>
-<p>To implement user interfaces like that one needs to (1) show occurrences of 
extracted features within the text and (2) let the user interact with suggested 
entities.</p>
-<h3 id="visualise-occurrences-of-extracted-features">Visualise Occurrences of 
extracted features</h3>
-<p>The occurrence of extracted features are represented by instances of the 
concept 'fise:TextAnnotation'. However not all TextAnnotations are of interest 
for this use case as they are also used for other things (e.g. annotating the 
language of the parsed content).</p>
-<p>Typically the following steps are required to correctly show extracted 
features within the content.</p>
-<ol>
-<li>Query for/iterate over 'fise:TextAnnotation's of the enhancement 
results.<ul>
-<li>it is important to only use TextAnnotations that define a 
'fise:selected-text' property. TextAnnotations that do not define this property 
usually select whole sections or even the document as a whole. Those are not of 
interest for this use case.</li>
-</ul>
-</li>
-<li>Determine the exact occurrence of the TextAnnoations<ul>
-<li>in case of plain text content this can be easily done by using the values 
of 'fise:start' and 'fise:end'</li>
-<li>in case the content includes additional markup the char indexes of 
'fise:start'/'fise:end' will not match. In such cases the preferred way is to 
first search the occurrence of'fise:selection-context' and thann the occurrence 
of 'fise:selected-text' within.</li>
-</ul>
-</li>
-<li>Retrieve the suggestions ('fise:TextAnnoation' instances) for a given 
TextAnnotation. For that one needs to search for "?suggestion dc:relation 
{text-annotation}" where '{text-annotation}' refers to the URI of the current 
TextAnnotation. Note that:<ul>
-<li>Not every TextAnnotation will have suggestions</li>
-<li>One and the same suggestion might be linked with several 
TextAnnotations.</li>
-</ul>
-</li>
-</ol>
-<p>The following SPARQL query could be used to select all the required 
information. However the use of SPARQL is optional as the required information 
can be also easily retrieved by other means (e.g. the filtered Iteratros as 
typically provided by RDF frameworks). </p>
-<div class="codehilite"><pre><span class="nb">select</span> <span 
class="o">*</span> 
-<span class="n">from</span> <span class="p">{</span>
-    <span class="p">?</span><span class="n">textAnnotation</span> <span 
class="n">rdfs:type</span> <span class="n">fise:TextAnnotation</span>
-    <span class="p">?</span><span class="n">textAnnotation</span> <span 
class="n">fise:selected</span><span class="o">-</span><span 
class="n">text</span> <span class="p">?</span><span class="n">selected</span>
-    <span class="p">?</span><span class="n">textAnnotation</span> <span 
class="n">fise:selection</span><span class="o">-</span><span 
class="n">context</span> <span class="p">?</span><span class="n">context</span>
-    <span class="p">?</span><span class="n">textAnnotation</span> <span 
class="n">fise:start</span> <span class="p">?</span><span 
class="n">startIndex</span>
-    <span class="p">?</span><span class="n">textAnnotation</span> <span 
class="n">fise:end</span> <span class="p">?</span><span 
class="n">endIndex</span>
-    <span class="p">?</span><span class="n">textAnnotation</span> <span 
class="n">dc:type</span> <span class="p">?</span><span class="n">nature</span>
-    <span class="n">optional</span> <span class="p">{</span> 
-        <span class="p">?</span><span class="n">suggestions</span> <span 
class="n">dc:relation</span> <span class="p">?</span><span 
class="n">textAnnotation</span> 
-    <span class="p">}</span>
-<span class="p">}</span>
-</pre></div>
-
-
-<p>Additionally:</p>
-<ul>
-<li>The value of the 'dc:type' is well suited to select different style 
sheets. See the section for <a 
href="#fisetextannotation">fise:TextAnnotation</a> for detailed 
information.</li>
-<li>Note hat one </li>
-</ul>
-<h3 id="interact-with-suggested-entities">Interact with suggested Entities</h3>
-<p>In principle there are three different cases</p>
-<ol>
-<li><strong>No suggestion</strong>: This indicates that a Named Entity was 
recognized during natural language processing, but to matching Entity was found 
within the knowledge base. In this case users might want to<ul>
-<li>manually search the knowledge base for an Entity. The Stanbol Entityhub 
Sites Endpoint can be used to implement this feature by sending a "GET 
http://{host}:{port}/entityhub/sites/find?name={name}" (see the WebUI of your 
Stanbol instance for the detailed documentation).</li>
-<li>Create a new Entity based on the current TextAnnotation. In this case the 
'fise:selected-text' should be suggested as 'rdfs:label' and the 'dc:type' 
value could be used for the 'rdf:type'. New Entities can be added to the 
knowledge base by sending a "POST http://{host}:{port}/entityhub/entity" with 
the RDF data of the Entity as content (see the WebUI of your Stanbol instance 
for the detailed documentation).</li>
-</ul>
-</li>
-<li><strong>Distinct suggestion</strong>: This means that there is only a 
single suggestion with a high 'fise:confidence'. Also multiple suggestions 
where the first one as a high confidence and additional suggestions come with 
low confidence values may fit this description. In such situations <ul>
-<li>the UI might want to automatically accept the suggestion</li>
-<li>allow users to show additional suggestion on request.</li>
-<li>undo automatic acceptance of the suggestion.</li>
-</ul>
-</li>
-<li><strong>Ambiguous Suggestions</strong>: This situation is satisfied if 
multiple entities are suggested with a medium to high 'fise:confidence'. This 
also applies to situations where there is no suggestion with an high 
'fise:confidence' value. In those cases typically the user must provide 
additional input by<ul>
-<li>selecting the correct entity</li>
-<li>rejecting all suggestions</li>
-<li>also manually searching and/or creating a new Entity as described for (1) 
would be possible interaction</li>
-</ul>
-</li>
-</ol>
-<p>The required data for for the described interaction patters are available 
within the enhancement results as follows:</p>
-<p>The following assumes {text-annotation} - the URI of the current 
'fise:TextAnnotation' - as context</p>
-<ol>
-<li>Query for/iterate over all entity suggestions: The suggestions for 
{text-annotation} can be acquired by using "?entityAnnotation dc:relation 
{text-annotation}<ul>
-<li>only results with the the 'rdf:type' 'fise:EntityAnnotation' should be 
processed. However typically all results will be any way of that type.</li>
-<li>the 'fise:confidence' property represents the confidence of the suggestion 
in the range FROM 0 (very uncertain) TO 1 (very certain). Note that the 
'fise:confidence' value is optional - so there might be EntityAnnotations 
without confidence information. However all <a 
href="engines/list.html">EnhancementEngines managed by the Stanbol 
community</a> do provide confidence information.</li>
-</ul>
-</li>
-<li>Visualize suggestions: EntityAnnotations do provide some basic information 
about the suggested Entity that can be used for visualization. Most important 
the URI of the suggested entity as value of 'fise:referenced-entity'. 
Additional the label and the types of the Entity are included.</li>
-<li>Retrieving additional information about referenced Entities: While the 
EntityAnnotation includes some basic information some users might want to 
retrieve all available information of referenced Entities - to dereference the 
Entity:<ul>
-<li>As this is a rather common use case the <a href="">EntityLinkingEngine</a> 
and <a href="">KeywordLinkingEngine</a> are by default configured to include 
information of Entities within the EnhancementResults. So users that use those 
EnhancementEngines will not need to dereference Entities as those information 
are already available within the enhancement results.</li>
-<li>If a 'fise:EntityAnnotation' has the 'entityhub:site' property Entities 
can be dereferenced by using the Stanbol Entityhub (see the section for <a 
href="#fiseentityannotation">fise:EntityAnnotation</a> for details)</li>
-<li>In all other cases the URI of the suggested entity need to be used for 
dereferencing. If the referenced Entity is part of the <a 
href="http://linkeddata.org/">Linked Data</a> cloud this is often possible by 
the <a href="http://www.w3.org/TR/cooluris/">CoolURI</a> - basically sending a 
"GET -h "Accept: application/json+rdf" {entity-uri}".</li>
-</ul>
-</li>
-</ol>
+    
   </div>
   
   <div id="footer">

Added: websites/staging/stanbol/trunk/content/stanbol/docs/trunk/enhancer/es_entityannotation.png
==============================================================================
Binary file - no diff available.

Propchange: websites/staging/stanbol/trunk/content/stanbol/docs/trunk/enhancer/es_entityannotation.png
------------------------------------------------------------------------------
    svn:mime-type = image/png

Added: websites/staging/stanbol/trunk/content/stanbol/docs/trunk/enhancer/es_textannotation.png
==============================================================================
Binary file - no diff available.

Propchange: websites/staging/stanbol/trunk/content/stanbol/docs/trunk/enhancer/es_textannotation.png
------------------------------------------------------------------------------
    svn:mime-type = image/png
