Author: rwesten
Date: Fri Jun 1 07:02:34 2012
New Revision: 1344995
URL: http://svn.apache.org/viewvc?rev=1344995&view=rev
Log:
added two new figures, updated 'Occurrence based Annotation' section
Added:
incubator/stanbol/site/trunk/content/stanbol/docs/trunk/enhancer/es_entityannotation.png
(with props)
incubator/stanbol/site/trunk/content/stanbol/docs/trunk/enhancer/es_textannotation.png
(with props)
Modified:
incubator/stanbol/site/trunk/content/stanbol/docs/trunk/enhancer/contentitemfactory.mdtext
incubator/stanbol/site/trunk/content/stanbol/docs/trunk/enhancer/enhancementstructure.mdtext
Modified:
incubator/stanbol/site/trunk/content/stanbol/docs/trunk/enhancer/contentitemfactory.mdtext
URL:
http://svn.apache.org/viewvc/incubator/stanbol/site/trunk/content/stanbol/docs/trunk/enhancer/contentitemfactory.mdtext?rev=1344995&r1=1344994&r2=1344995&view=diff
==============================================================================
---
incubator/stanbol/site/trunk/content/stanbol/docs/trunk/enhancer/contentitemfactory.mdtext
(original)
+++
incubator/stanbol/site/trunk/content/stanbol/docs/trunk/enhancer/contentitemfactory.mdtext
Fri Jun 1 07:02:34 2012
@@ -75,7 +75,7 @@ The following implementations of this in
* StreamSource: A ContentSource wrapping an InputStream. Multiple calls to
#getStream() are not supported and will cause IllegalStateExceptions. Calls
to #getData() will load the contents of the stream into memory.
* ByteArraySource: A ContentSource implementation that uses a byte array to
represent the content. All constructors take the byte array representing
the content as parameter. Calls to #getData() MUST NOT copy the byte array to
avoid duplications.
-* StringSource: A ContentSource implementation that directly allows to parse a
String instance. The constructors convert the passed String to an byte array by
using the passed Charset. UTF-8 is used as default. This implementation is
based on the ByteArraySource.
+* StringSource: A ContentSource implementation that directly allows to parse a
String instance. The constructors convert the passed String to an byte array by
using the passed Charset. UTF-8 is used as default. This implementation is
based on the ByteArraySource.
### ContentReference
Modified:
incubator/stanbol/site/trunk/content/stanbol/docs/trunk/enhancer/enhancementstructure.mdtext
URL:
http://svn.apache.org/viewvc/incubator/stanbol/site/trunk/content/stanbol/docs/trunk/enhancer/enhancementstructure.mdtext?rev=1344995&r1=1344994&r2=1344995&view=diff
==============================================================================
---
incubator/stanbol/site/trunk/content/stanbol/docs/trunk/enhancer/enhancementstructure.mdtext
(original)
+++
incubator/stanbol/site/trunk/content/stanbol/docs/trunk/enhancer/enhancementstructure.mdtext
Fri Jun 1 07:02:34 2012
@@ -153,12 +153,16 @@ To implement user interfaces like that o
### Visualise Occurrences of extracted features
-The occurrence of extracted features are represented by instances of the
concept 'fise:TextAnnotation'. However not all TextAnnotations are of interest
for this use case as they are also used for other things (e.g. annotating the
language of the parsed content).
+The occurrences of extracted features are represented by instances of the
concept 'fise:TextAnnotation'. The next figure shows how TextAnnotations
describe the occurrence of a recognized feature in the parsed text.
+
+
+
+Applications that want to visualize extracted features will need to
follow/implement the following steps:
Typically the following steps are required to correctly show extracted
features within the content.
1. Query for/iterate over 'fise:TextAnnotation's of the enhancement results.
- * it is important to only use TextAnnotations that define a
'fise:selected-text' property. TextAnnotations that do not define this property
usually select whole sections or even the document as a whole. Those are not of
interest for this use case.
+ * it is important to use only TextAnnotations that define a
'fise:selected-text' property. TextAnnotations that do not define this property
usually select whole sections or even the document as a whole. While such
TextAnnotations are important (e.g. for annotating the language of the text),
they are of no interest for this use case and therefore need to be ignored.
2. Determine the exact occurrence of the TextAnnotations
* in case of plain text content this can be easily done by using the
values of 'fise:start' and 'fise:end'
* in case the content includes additional markup the char indexes of
'fise:start'/'fise:end' will not match. In such cases the preferred way is to
first search the occurrence of 'fise:selection-context' and then the occurrence
of 'fise:selected-text' within.
@@ -182,14 +186,17 @@ The following SPARQL query could be used
}
}
-Additionally:
+_Tips and Tricks:_
-* The value of the 'dc:type' is well suited to select different style sheets.
See the section for [fise:TextAnnotation](#fisetextannotation) for detailed
information.
-* Note hat one
+* Applications that want to differentiate between different types of extracted
entities (e.g. applying different stylesheets for persons, organizations and
places) can use the value of the 'dc:type' for that purpose. See the section
for [fise:TextAnnotation](#fisetextannotation) for detailed information.
### Interact with suggested Entities
-In principle there are three different cases
+This section explains how users might want to interact with extracted/suggested
Entities. Extracted Entities are represented by 'fise:EntityAnnotation's. Those
EntityAnnotations are linked with the TextAnnotations (occurrences) and the
Entity of the used knowledge base. The following figure shows an example of an
EntityAnnotation that suggests the Entity
['dbpedia:Bob_Marley'](http://dbpedia.org/resource/Bob_Marley) for the
TextAnnotation used in the example of the previous section.
+
+[Figure es_entityannotation.png: EntityAnnotation suggesting 'dbpedia:Bob_Marley' for the TextAnnotation]
+
+The main purpose of EntityAnnotations is to suggest Entities (e.g.
['dbpedia:Bob_Marley'](http://dbpedia.org/resource/Bob_Marley)) for mentions
within natural language texts. While the above example (to keep it simple)
shows only a single suggestion, in practice one needs to distinguish between
three different cases that also imply different interaction needs for users:
1. __No suggestion__: This indicates that a Named Entity was recognized during
natural language processing, but no matching Entity was found within the
knowledge base. In this case users might want to
* manually search the knowledge base for an Entity. The Stanbol Entityhub
Sites Endpoint can be used to implement this feature by sending a "GET
http://{host}:{port}/entityhub/sites/find?name={name}" (see the WebUI of your
Stanbol instance for the detailed documentation).
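Reviewer note: the "GET .../entityhub/sites/find?name={name}" request mentioned above could be assembled as in the following minimal Python sketch. The function name `build_find_url` and the host and port values used in the usage note are hypothetical and only serve as an illustration. The sketch only builds the URL; sending the request and handling the response are left out because the response format is not described in this diff.

```python
from urllib.parse import urlencode


def build_find_url(host: str, port: int, name: str) -> str:
    """Build the Entityhub 'find' URL for a manual entity search.

    Follows the pattern quoted in the documentation text:
    http://{host}:{port}/entityhub/sites/find?name={name}
    """
    # urlencode escapes spaces and special characters in the search term
    query = urlencode({"name": name})
    return f"http://{host}:{port}/entityhub/sites/find?{query}"
```

For example, `build_find_url("localhost", 8080, "Bob Marley")` yields a URL with the search term encoded as `name=Bob+Marley`. The host `localhost` and port `8080` are assumed values for a local instance, not taken from the commit.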
Added:
incubator/stanbol/site/trunk/content/stanbol/docs/trunk/enhancer/es_entityannotation.png
URL:
http://svn.apache.org/viewvc/incubator/stanbol/site/trunk/content/stanbol/docs/trunk/enhancer/es_entityannotation.png?rev=1344995&view=auto
==============================================================================
Binary file - no diff available.
Propchange:
incubator/stanbol/site/trunk/content/stanbol/docs/trunk/enhancer/es_entityannotation.png
------------------------------------------------------------------------------
svn:mime-type = application/octet-stream
Added:
incubator/stanbol/site/trunk/content/stanbol/docs/trunk/enhancer/es_textannotation.png
URL:
http://svn.apache.org/viewvc/incubator/stanbol/site/trunk/content/stanbol/docs/trunk/enhancer/es_textannotation.png?rev=1344995&view=auto
==============================================================================
Binary file - no diff available.
Propchange:
incubator/stanbol/site/trunk/content/stanbol/docs/trunk/enhancer/es_textannotation.png
------------------------------------------------------------------------------
svn:mime-type = application/octet-stream
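Reviewer note: step 2 of the 'Visualise Occurrences of extracted features' section in the diff above (use 'fise:start'/'fise:end' for plain text, otherwise search for 'fise:selection-context' and then for 'fise:selected-text' within it) might be sketched in Python as follows. The function `locate_occurrence` and its parameters are hypothetical names for values an application would read from a TextAnnotation. The sketch assumes that 'fise:end' is an exclusive offset and that the selection context appears verbatim in the content. The second assumption can fail when markup interrupts the context itself.

```python
from typing import Optional, Tuple


def locate_occurrence(
    content: str,
    selected_text: str,
    start: Optional[int] = None,
    end: Optional[int] = None,
    selection_context: Optional[str] = None,
) -> Optional[Tuple[int, int]]:
    """Return (begin, end) offsets of selected_text in content, or None."""
    # Plain text case: trust the offsets only if they select the expected text
    if start is not None and end is not None:
        if content[start:end] == selected_text:
            return (start, end)
    # Markup case: find the selection context first, then the text within it
    if selection_context:
        context_pos = content.find(selection_context)
        inner_pos = selection_context.find(selected_text)
        if context_pos >= 0 and inner_pos >= 0:
            begin = context_pos + inner_pos
            return (begin, begin + len(selected_text))
    return None
```

Checking the offsets against the selected text before using them means the same function can serve both plain text and marked-up content, at the cost of one extra string comparison.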