Author: rwesten
Date: Mon Nov 28 07:20:49 2011
New Revision: 1206998
URL: http://svn.apache.org/viewvc?rev=1206998&view=rev
Log:
some additions (TextSelecor, more to AnnotationSets
Modified:
incubator/stanbol/site/trunk/content/stanbol/docs/trunk/enhancer/ses_annotationontology.mdtext
Modified:
incubator/stanbol/site/trunk/content/stanbol/docs/trunk/enhancer/ses_annotationontology.mdtext
URL:
http://svn.apache.org/viewvc/incubator/stanbol/site/trunk/content/stanbol/docs/trunk/enhancer/ses_annotationontology.mdtext?rev=1206998&r1=1206997&r2=1206998&view=diff
==============================================================================
---
incubator/stanbol/site/trunk/content/stanbol/docs/trunk/enhancer/ses_annotationontology.mdtext
(original)
+++
incubator/stanbol/site/trunk/content/stanbol/docs/trunk/enhancer/ses_annotationontology.mdtext
Mon Nov 28 07:20:49 2011
@@ -1,11 +1,6 @@
Title: The Stanbol Enhancement Structure (PROPOSAL)
-Please NOTE: This is a proposal for the future version of the Enhancement
Structure used by the Stanbol Enhancer.
-
-**NOTES:**
-
-* This **DOES NOT** describe the Enhancement Structure used by the current
version of the Stanbol Enhancer!
-* There is also an [older Proposal](stanbolenhancementstructure.html) that
might still contain some information that are not yet contained in this version.
+Please NOTE: This is a proposal for the future version of the Enhancement
Structure used by the Stanbol Enhancer. This **DOES NOT** describe the
Enhancement Structure used by the current version of the Stanbol Enhancer!
## Background
@@ -85,9 +80,11 @@ With the Annotation-Ontology each Select
#### Text Selectors
-The "PrefixPostfixSelector" as defined by the Text-Annotation Ontology differs
from the currently used FISE Text Annotation. It does not define the character
indexes and uses prefix and postfix instead of the surrounding context.
+The currently used FISE TextAnnotation differs from the text selectors of the
Annotation-Ontology mainly in that it defines both the actual annotation
AND the selection within the text. Therefore, when adopting the "Annotation ->
Selector" model of the Annotation-Ontology, all annotation-related properties of
the FISE TextAnnotation must be separated from the properties describing the
selection.
+
+The Annotation-Ontology defines two text selectors: (1) the
"OffsetRangeSelector", which uses character offsets within the text to define a
selection, and (2) the "PrefixPostfixSelector", which uses a prefix, a suffix and
the selected text to define the selection based on its context. The Stanbol
Enhancer currently uses both (context and offset) to define a selection. However,
currently only a single property "context" is used instead of the prefix/suffix
model of the "PrefixPostfixSelector". In general, the prefix/suffix based
context definition used by the Annotation-Ontology is better, because it
uniquely determines the selected part of the text even if the selected
text appears multiple times within a given context. With the currently used
model this is not possible if the selected text appears several times
in the provided context.
-Regarding backward compatibility The suggestion is to adopt the
"PrefixPostfixSelector" but keep the start and end positions of the current
Text Annotation. The prefix/posfix model of the "PrefixPostfixSelector" is
definitely better than the used context of the FISE Text Annotation, because it
allows to clearly identify the selected text even if it occurs several times in
a given context.
+The suggestion is to keep both the offset-based and the context-based definition
of the text selection, but to switch to the prefix/suffix model for defining the
context. Therefore stanbol:TextSelector will be defined as a sub-class of both
"OffsetRangeSelector" and "PrefixPostfixSelector".
#### Multi Media Selectors and the Media Fragments Standard
@@ -115,10 +112,17 @@ Text Annotations are Annotations as typi
The text selection can be expressed by using a "PrefixPostfixSelector". The
type and the confidence of the detected named entity need to be properties of
the Annotation class.
+ stanbol:TextAnnotation rdfs:subClassOf ao:Annotation
+ stanbol:TextAnnotation stanbol:named-entity-type {schema:Person,
schema:Organization, schema:Place, …}
+
+
#### Entity Annotations
Entity Annotations are similar to "Qualifier" annotations as defined by the
Annotation-Ontology. The *ao:hasTopic* relation is used to link the annotation
with the related topic.
+ stanbol:EntityAnnotation rdfs:subClassOf aot:Qualifier, ao:Annotation
+
+
#### Category Annotations
Category Annotations are typically about the whole or a specific section of
a Document. Normal Selectors can be used for defining the categorized Section.
If no Selector is present the categorization applies to the whole document. The
"Qualifier" annotation could also be used as a base class for categorizations.
@@ -146,12 +150,22 @@ With the FISE Enhancement Structure this
Expressing the same based on the Annotation-Ontology would be possible by
* An Annotation Set that links to the following Annotations (by the *ao:item*
property):
-* An TextAnnotaion including the PrefixPostfixSelector selector defining the
actual position of the selected text within the document
+* A TextAnnotation that uses a stanbol:TextSelector to define the actual
position of the selected text within the document
* One EntityAnnotation (extends ao:Qualifier) per suggested Entity.
* In addition the Annotation Set also includes metadata such as the Engine
that created the suggestions
+**OPTIONS**
+
+* Allow multiple TextAnnotations: This would allow the same set of Entities
to be suggested for all TextAnnotations. However, it would also make it more
difficult to express that a user accepts a suggestion for one TextAnnotation but
rejects the same suggestion for another. In addition, Users might even accept
different suggestions for the different included TextAnnotations. (see also
*Coreference Suggestions*)
+
#### Category Suggestions
Typically categorizations can provide more than a single Category. So grouping
such suggestions within an AnnotationSet gives Users the possibility to
accept/reject one or more of such suggestions. In addition it would also allow
to distinguish sets of categorizations calculated based on disjoint sets of
categories (e.g. a categorization based on a UserProfile with a categorization
based on general topics or a spatial categorization.)
+#### Coreference Suggestion
+
+This would allow several Text Annotations to be linked in order to suggest a
co-reference between them. This kind of AnnotationSet is expected to be used by
NLP (Natural Language Processing) frameworks that can detect co-references. It
might also be of interest for Engines that suggest Entities but keep an
Annotation Context and therefore want to link persons referred to only by their
given or family name to another occurrence that uses both.
+
+The type of the coreference could be captured by a special property of this
annotation set type.
+
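The "Text Selectors" hunk in this change argues that a prefix/suffix selector can pin down a selection that a single "context" string leaves ambiguous. The following is a minimal Python sketch of that argument, not Stanbol code: the `TextSelector` class, its field names and the helper functions are hypothetical and only mirror the idea of combining an offset range with prefix, selected text and suffix.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class TextSelector:
    """Hypothetical stand-in for a selector holding offsets plus prefix/suffix."""
    start: int   # character offset where the selection begins
    end: int     # character offset just past the selection
    prefix: str  # text immediately before the selection
    exact: str   # the selected text itself
    suffix: str  # text immediately after the selection


def make_selector(text: str, start: int, end: int, window: int = 10) -> TextSelector:
    """Build a selector, taking up to `window` characters on each side."""
    return TextSelector(
        start=start,
        end=end,
        prefix=text[max(0, start - window):start],
        exact=text[start:end],
        suffix=text[end:end + window],
    )


def find_by_context(text: str, context: str, exact: str) -> List[int]:
    """Single-context model: every occurrence of `exact` inside `context` is a candidate."""
    hits = []
    ctx_pos = text.find(context)
    while ctx_pos != -1:
        pos = context.find(exact)
        while pos != -1:
            hits.append(ctx_pos + pos)
            pos = context.find(exact, pos + 1)
        ctx_pos = text.find(context, ctx_pos + 1)
    return hits


def find_by_prefix_suffix(text: str, sel: TextSelector) -> List[int]:
    """Prefix/suffix model: match prefix + exact + suffix as one string."""
    needle = sel.prefix + sel.exact + sel.suffix
    hits = []
    pos = text.find(needle)
    while pos != -1:
        hits.append(pos + len(sel.prefix))
        pos = text.find(needle, pos + 1)
    return hits


text = "Paris Hilton visited Paris in May."
sel = make_selector(text, 21, 26)  # the second "Paris"

print(find_by_context(text, text, "Paris"))  # [0, 21] -> ambiguous
print(find_by_prefix_suffix(text, sel))      # [21]    -> unique
```

Keeping the offsets alongside the prefix and suffix, as the proposal suggests, means a consumer can use the offsets directly when the text is unchanged and fall back to the prefix/suffix match otherwise.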