ses_annotationontology.html

buildbot Sun, 11 Dec 2011 14:15:45 -0800

Author: buildbot
Date: Sun Dec 11 22:15:14 2011
New Revision: 800097

Log:
Staging update by buildbot


Modified:
    
websites/staging/stanbol/trunk/content/stanbol/docs/trunk/enhancer/ses_annotationontology.html

Modified: 
websites/staging/stanbol/trunk/content/stanbol/docs/trunk/enhancer/ses_annotationontology.html
==============================================================================
--- 
websites/staging/stanbol/trunk/content/stanbol/docs/trunk/enhancer/ses_annotationontology.html
 (original)
+++ 
websites/staging/stanbol/trunk/content/stanbol/docs/trunk/enhancer/ses_annotationontology.html
 Sun Dec 11 22:15:14 2011
@@ -83,10 +83,10 @@
 <p><img alt="Example of annotation on a whole document with AO" 
src="http://annotation-ontology.googlecode.com/svn/trunk/images/Document%20Annotation%20-%20AO%20Annotation%20Ontology%20-%20by%20Paolo%20Ciccarese.png"
 title="Example of annotation on a whole document with AO" /></p>
 <p>Image Credit: Annotation-Ontology <a 
href="http://annotation-ontology.googlecode.com/svn/trunk/images/Document%20Annotation%20-%20AO%20Annotation%20Ontology%20-%20by%20Paolo%20Ciccarese.png">Link</a></p>
 </blockquote>
-<h2 id="stanbol_enhancement_strucutre">Stanbol Enhancement Strucutre</h2>
-<p>The following sections describe how the Stanbol Enhancement Structure can 
utilize the Annotation-Ontology to encode knowledge extracted form analyzed 
Content Items.</p>
+<h2 id="stanbol_enhancement_structure">Stanbol Enhancement Structure</h2>
+<p>The following sections describe how the Stanbol Enhancement Structure can 
utilize the Annotation-Ontology to encode knowledge extracted from analyzed 
Content Items.</p>
 <h3 id="contentitems">ContentItems</h3>
-<p>Within the FISE Enhancement Structure the enhanced ContentItems where only 
referenced by the <strong>fise:extracted-form</strong> property. There was no 
specification on how to further define properties of the ContentItem. The 
Annotation-Ontology defines a much richer vocabulary for that.</p>
+<p>Within the FISE Enhancement Structure the enhanced ContentItems where only 
referenced by the <strong>fise:extracted-from</strong> property. There was no 
specification on how to further define properties of the ContentItem. The 
Annotation-Ontology defines a much richer vocabulary for that.</p>
 <p>First an most important the Annotation-Ontology distinguished between 
the:</p>
 <ul>
 <li><strong>Annotated Document</strong>: This is the Document that is 
annotated</li>
@@ -101,18 +101,18 @@
 <p>The Content Adapter pattern was suggested to be used to convert parsed 
documents to different Content Formats such as extracting the Plain Text of 
parsed HTML or PDF documents.</p>
 <p>The possibility to distinguish between the <em>Annotated Document</em> and 
the <em>Source Document</em> nicely supports this, because while Enhancement 
Engines can state that an Annotation is about the <em>Annotated Document</em> 
they can still state the exact <em>Source Document</em> that was used for 
processing. This allows e.g. to clearly state that the indexes of a text 
selection are based on the plain text version of the <em>Annotated 
Document</em>. </p>
 <h3 id="content_selectors">Content Selectors</h3>
-<p>The FISE Enhancement Structure defined a single "Content Selector" the 
<em>FISE Text Annotation</em>. The Annotation-Ontology uses a much richer 
Structure that even provides the possibility to extensions for defining 
specific selections different content types.</p>
-<p>With the Annotation-Ontology each Selector can link to both a the 
<em>Annotated Document</em> and the <em>Source Document</em>. In the following 
an Example for an Image Selection</p>
+<p>The FISE Enhancement Structure defined a single "Content Selector" the 
<em>FISE Text Annotation</em>. The Annotation-Ontology uses a much richer 
Structure that even provides the possibility to extensions for defining 
specific selections on different content types.</p>
+<p>With the Annotation-Ontology each Selector can link to both the 
<em>Annotated Document</em> and the <em>Source Document</em>. In the following 
an Example for an Image Selection</p>
 <blockquote>
 <p><img alt="Image Selector" 
src="http://annotation-ontology.googlecode.com/svn/trunk/images/Image%20InitEndCorner%20Selector%20-%20AO%20Annotation%20Ontology%20-%20by%20Paolo%20Ciccarese.png"
 title="Image Selector Example" /></p>
 <p>Image Credits: Annotation-Ontology <a 
href="http://annotation-ontology.googlecode.com/svn/trunk/images/Image%20InitEndCorner%20Selector%20-%20AO%20Annotation%20Ontology%20-%20by%20Paolo%20Ciccarese.png">Link</a>.</p>
 </blockquote>
 <h4 id="text_selectors">Text Selectors</h4>
-<p>The currently used FISE TextAnnotation differs form text selects of the 
Annotation-Ontology mainly in that, that it defines bot the actual annotation 
AND the selection within the text. Therefore when adopting the "Anootation 
-&gt; Seletor" model or the Annotation-Ontology all Annotation related 
properties of the FISE TextAnnotation must be separated from the properties 
describing the selection.</p>
+<p>The currently used FISE TextAnnotation differs from text selects of the 
Annotation-Ontology mainly in that it defines both the actual annotation AND 
the selection within the text. Therefore when adopting the "Annotation -&gt; 
Selector" model or the Annotation-Ontology all Annotation related properties of 
the FISE TextAnnotation must be separated from the properties describing the 
selection.</p>
 <p>The Annotation-Ontology defines two text selectors: (1) the 
"OffsetRangeSelector" that uses char offset within the text to define a 
selection and (2) the "PrefixPostfixSelector" that uses a prefix, suffix and 
the selected text to define the selection based on the context. The Stanbol 
Enhancer currently uses both (context and offset) to define selection. However 
currently only single property "context" is used instead of the prefix, suffix 
model of the "PrefixPostfixSelector". In general the prefix, postfix based 
context definition as used by the Annotation-Ontology is better, because is 
allows to uniquely determine the selected part of the text even if the selected 
text appears multiple times within a given context. With the currently used 
model it is not possible to do that if the selected text appears several times 
in the provided context. </p>
-<p>The suggestion is to keep both (offset and context) based definition of 
text selection but switch to the prefix, suffix model for defining the context 
. Therefore stanbol:TextSelector will be defined as sub-class of both 
"OffsetRangeSelector" and "PrefixPostfixSelector".</p>
+<p>The suggestion is to keep both (offset and context) based definition of 
text selection but switch to the prefix, suffix model for defining the context. 
Therefore stanbol:TextSelector will be defined as sub-class of both 
"OffsetRangeSelector" and "PrefixPostfixSelector".</p>
 <h4 id="multi_media_selectors_and_the_media_fragments_standard">Multi Media 
Selectors and the Media Fragments Standard</h4>
-<p>The <a href="http://www.w3.org/2008/WebVideo/Fragments/">Media Fragments 
Working Group</a> of the W3C is currently working on a Recommendion on how to 
encode Fragments of Resources within so called <a 
href="http://www.w3.org/2008/WebVideo/Fragments/WD-media-fragments-spec/">Media 
Fragments URIs</a>.</p>
+<p>The <a href="http://www.w3.org/2008/WebVideo/Fragments/">Media Fragments 
Working Group</a> of the W3C is currently working on a Recommendation on how to 
encode Fragments of Resources within so called <a 
href="http://www.w3.org/2008/WebVideo/Fragments/WD-media-fragments-spec/">Media 
Fragments URIs</a>.</p>
 <p>This specification defines how to encode the <a 
href="http://www.w3.org/2008/WebVideo/Fragments/WD-media-fragments-spec/#naming-time">Temporal</a>,
 <a 
href="http://www.w3.org/2008/WebVideo/Fragments/WD-media-fragments-spec/#naming-space">Spatial</a>,
 <a 
href="http://www.w3.org/2008/WebVideo/Fragments/WD-media-fragments-spec/#naming-track">Track</a>
 and <a 
href="http://www.w3.org/2008/WebVideo/Fragments/WD-media-fragments-spec/#naming-id">ID</a>
 dimensions within Document URIs but also defines processing rules (e.g. for 
Browsers) and the semantics.</p>
 <p>The proposal here is to use this specification for encoding selections 
within multi media files within the Annotation-Ontology. This will most likely 
require the definition of an MediaFragmentSelector as extension.</p>
 <h3 id="annotations">Annotations</h3>
@@ -123,9 +123,9 @@
 <li>The Annotation-Resource may be linked to an Selector with the 
<strong>ao:context</strong> property. If no such link is present the 
Annotation-Resource is about the whole Document. It is also possible to link 
multiple Selectors with an annotation.</li>
 <li>Each Annotation-Resource MUST BE linked to the <em>Annotated Document</em> 
by using the <strong>ao:annotatesResource</strong> property. The <em>Source 
Document</em> can be referenced by using the 
<strong>ao:onSourceDocument</strong>. It is also possible to link multiple 
Documents with an annotation.</li>
 </ul>
-<p>The following sub-sections will provide an overview how Text Annotations , 
Entity Annotations and Category Annotations as used by Stanbol can be expressed 
using the Annotation-Ontology</p>
+<p>The following sub-sections will provide an overview how Text Annotations, 
Entity Annotations and Category Annotations as used by Stanbol can be expressed 
using the Annotation-Ontology</p>
 <h4 id="text_annotations">Text Annotations</h4>
-<p>Text Annotations are Annotations as typically created by NER (Named Entity 
Recognition) engines. Such Annotations select a part of a Text and assign an 
type (Person, Organization, Place ...) to that.</p>
+<p>Text Annotations are Annotations as typically created by NER (Named Entity 
Recognition) engines. Such Annotations select a part of a Text and assign a 
type (Person, Organization, Place ...) to that.</p>
 <p>The text selection can be expressed by using an "PrefixPostfixSelector". 
The type and the confidence of the detected named entity need to be properties 
of the Annotation class.</p>
 <div class="codehilite"><pre><span class="err">stanbol:TextAnnotation</span> 
<span class="err">rdfs:subClassOf</span> <span class="err">ao:Annotation</span>
 <span class="err">stanbol:TextAnnotation</span> <span 
class="err">stanbol:named-entity-type</span> <span 
class="err">{schema:Perosn,</span> <span 
class="err">schema:Organization,</span> <span class="err">schema:Place,</span> 
<span class="err">…}</span>
@@ -155,16 +155,16 @@
 <p>Expressing the same based on the Annotation-Ontology would be possible 
by</p>
 <ul>
 <li>An Annotation Set that links to the following Annotations (by the 
<em>ao:item</em> property):</li>
-<li>An TextAnnotaion uses an stanbol:TextSelector to define the actual 
selected position of the selected text within the document</li>
+<li>A TextAnnotation uses a stanbol:TextSelector to define the actual selected 
position of the selected text within the document</li>
 <li>One EntityAnnotation (extends ao:Qualifier) per suggested Entities.</li>
 <li>In addition the Annotation Set also includes metadata such the the Engine 
that created the suggestions</li>
 </ul>
 <p><strong>OPTIONS</strong></p>
 <ul>
-<li>Allow multiple TextAnnotations: This would allow to suggest the same set 
of Entities to all TextAnnotations. However it would make it also more 
difficult to express if a user would except an suggestion for on TextAnnotation 
but reject the same for an other. In addition Users might even accept different 
suggestions for different included TextAnnotation. (see also <em>Coreference 
Suggestions</em>)</li>
+<li>Allow multiple TextAnnotations: This would allow to suggest the same set 
of Entities to all TextAnnotations. However it would make it also more 
difficult to express if a user would accept a suggestion for one TextAnnotation 
but reject the same for an other. In addition Users might even accept different 
suggestions for different included TextAnnotation. (see also <em>Coreference 
Suggestions</em>)</li>
 </ul>
 <h4 id="category_suggestions">Category Suggestions</h4>
-<p>Typically categorizations can provide more than a single Category. So 
grouping such suggestions within an AnnotationSet gives Users the possibility 
to accept/reject one or more of such suggestions. In addition it would also 
allow to distinguish sets of categorizations calculated based on disjoint sets 
of categories (e.g. a categorization based on a UserProfile with a 
categorization based on general topics or a spatial categorization.)</p>
+<p>Typically categorizations can provide more than a single Category. So 
grouping such suggestions within an AnnotationSet gives Users the possibility 
to accept/reject one or more of such suggestions. In addition it would also 
allow to distinguish sets of categorizations calculated based on disjoint sets 
of categories (e.g. a categorization based on a UserProfile with a 
categorization based on general topics or a spatial categorization).</p>
 <h4 id="coreference_suggestion">Coreference Suggestion</h4>
 <p>This would allow to link several Text Annotations to suggest a co-reference 
between those two. This kind of AnnotationSet is expected to be used by NLP 
(Natural Language Processing) frameworks that can detect co-references. It 
might be also of interest for Engines that suggest Entities but keep an 
Annotation Context and therefore want to link persons only referred by the 
given or family name to an other occurrence that uses both.</p>
 <p>The type of the coreference could be captured by an special property of 
this annotation set type.</p>
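
Note on the Text Selectors hunk above: the page argues that a prefix/suffix context can pin down a selection even when the selected text occurs several times in the surrounding context, which a single "context" string cannot. The following is only a minimal Python sketch of that argument, not Stanbol code. The class name TextSelector, its field names, and both helper functions are hypothetical and chosen for illustration; they are not taken from the Annotation-Ontology or from Stanbol.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TextSelector:
    """Hypothetical stand-in for a selector that carries both an
    offset range and a prefix/exact/suffix context."""
    start: int
    end: int
    prefix: str
    exact: str
    suffix: str


def make_selector(text, start, end, window):
    """Build a selector; 'window' is the number of context chars kept on each side."""
    return TextSelector(
        start=start,
        end=end,
        prefix=text[max(0, start - window):start],
        exact=text[start:end],
        suffix=text[end:end + window],
    )


def count_in_context(context, exact):
    """Single-context model: count how often 'exact' occurs in the context string."""
    count, pos = 0, context.find(exact)
    while pos != -1:
        count += 1
        pos = context.find(exact, pos + 1)
    return count


def resolve_by_context(text, sel):
    """Prefix/suffix model: start offsets where prefix + exact + suffix matches."""
    needle = sel.prefix + sel.exact + sel.suffix
    hits, pos = [], text.find(needle)
    while pos != -1:
        hits.append(pos + len(sel.prefix))
        pos = text.find(needle, pos + 1)
    return hits


text = "Paris Hilton visited Paris in May."
second = text.index("Paris", 1)          # offset of the second "Paris"
sel = make_selector(text, second, second + len("Paris"), window=21)

# A single context string around the selection contains "Paris" twice,
# so it cannot say which occurrence was meant.
context = sel.prefix + sel.exact + sel.suffix
ambiguous = count_in_context(context, sel.exact)

# Splitting the context into prefix and suffix resolves to exactly one offset.
resolved = resolve_by_context(text, sel)
```

In this sketch the offsets (start, end) and the context (prefix, suffix) are stored side by side, mirroring the suggestion to keep both kinds of definition, so a consumer could check one against the other.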

svn commit: r800097 - /websites/staging/stanbol/trunk/content/stanbol/docs/trunk/enhancer/ses_annotationontology.html