Author: rwesten
Date: Wed May 30 11:28:40 2012
New Revision: 1344200

URL: http://svn.apache.org/viewvc?rev=1344200&view=rev
Log:
first work on PART 2: Consuming the Stanbol Enhancement Structure

Added:
    
incubator/stanbol/site/trunk/content/stanbol/docs/trunk/enhancer/hallo-annotate_scrrenshot.png
   (with props)
Modified:
    
incubator/stanbol/site/trunk/content/stanbol/docs/trunk/enhancer/enhancementstructure.mdtext

Modified: 
incubator/stanbol/site/trunk/content/stanbol/docs/trunk/enhancer/enhancementstructure.mdtext
URL: 
http://svn.apache.org/viewvc/incubator/stanbol/site/trunk/content/stanbol/docs/trunk/enhancer/enhancementstructure.mdtext?rev=1344200&r1=1344199&r2=1344200&view=diff
==============================================================================
--- 
incubator/stanbol/site/trunk/content/stanbol/docs/trunk/enhancer/enhancementstructure.mdtext
 (original)
+++ 
incubator/stanbol/site/trunk/content/stanbol/docs/trunk/enhancer/enhancementstructure.mdtext
 Wed May 30 11:28:40 2012
@@ -135,4 +135,78 @@ TopicAnnotation are used to categorize/c
 
 # Part 2: Using the Stanbol Enhancement Structure
 
-TODO: Work in progress
\ No newline at end of file
+## Entity Tagging
+
+TODO: Work in progress
+
+## Entity Disambiguation
+
+TODO: Work in progress
+
+## Occurrence based Annotation
+
+This describes a user interface similar to that of a spell or grammar checker. Instead of marking misspelled words, it suggests entities recognized within the text to the user. The following figure shows such an interface as implemented by [hallo.js](http://hallojs.org) combined with the [annotate.js](https://github.com/szabyg/annotate.js) plugin.
+
+![Occurrence based Annotation UI](hallo-annotate_scrrenshot.png "hallo.js with the annotate.js plugin used to implement a text occurrence based annotation UI")
+
+To implement such a user interface, one needs to acquire the following information from the enhancements returned by the Stanbol Enhancer.
+
+__Showing the Occurrences within the Text__
+
+This describes how to obtain the information needed to visualize extracted Entities within the text.
+
+1. Query for/iterate over the 'fise:TextAnnotation's of the enhancement results.
+    * It is important to only use TextAnnotations that define a 'fise:selected-text' property. TextAnnotations that do not define this property usually select whole sections or even the document as a whole. Those are not of interest for this use case.
+2. Determine the exact occurrence of the TextAnnotations
+    * For plain text content this can easily be done by using the values of 'fise:start' and 'fise:end'.
+    * If the content includes additional markup, the char indexes of 'fise:start'/'fise:end' will not match. In such cases the preferred way is to first search for the occurrence of 'fise:selection-context' and then for the occurrence of 'fise:selected-text' within it.
+3. Retrieve suggested Entities for a given TextAnnotation. For that one needs to search for "?suggestion dc:relation {text-annotation}", where '{text-annotation}' refers to the URI of the current TextAnnotation.
+    * Note that there will be TextAnnotations with no suggestions.
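Step 2 above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `locate_occurrence` and its parameters are hypothetical, and the values of 'fise:selected-text', 'fise:selection-context', 'fise:start' and 'fise:end' are assumed to have already been read from the enhancement results.

```python
def locate_occurrence(content, selected_text, context=None, start=None, end=None):
    """Return the (start, end) character offsets of selected_text in content, or None.

    Hypothetical helper: the arguments correspond to the values of
    fise:selected-text, fise:selection-context, fise:start and fise:end.
    """
    # Plain text: use fise:start/fise:end if they really delimit the selected text.
    if start is not None and end is not None and content[start:end] == selected_text:
        return start, end
    # Markup shifted the indexes: find the selection context first,
    # then the selected text within that context.
    if context:
        context_pos = content.find(context)
        relative_pos = context.find(selected_text)
        if context_pos >= 0 and relative_pos >= 0:
            begin = context_pos + relative_pos
            return begin, begin + len(selected_text)
    # Last resort: the first occurrence anywhere in the content.
    pos = content.find(selected_text)
    return (pos, pos + len(selected_text)) if pos >= 0 else None
```

Note that this sketch only works if the selection context appears verbatim in the content. If markup also interrupts the context itself, the tags would have to be stripped or skipped before searching.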
+
+The following SPARQL query could be used to select all the required information. However, the use of SPARQL is optional, as the required information can also easily be retrieved by other means (e.g. the filtered Iterators typically provided by RDF frameworks).
+
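+    :::sparql
+    PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
+    PREFIX dc: <http://purl.org/dc/terms/>
+    PREFIX fise: <http://fise.iks-project.eu/ontology/>
+    # select all TextAnnotations that mark an occurrence in the text
+    SELECT *
+    WHERE {
+        ?textAnnotation rdf:type fise:TextAnnotation .
+        ?textAnnotation fise:selected-text ?selected .
+        ?textAnnotation fise:selection-context ?context .
+        ?textAnnotation fise:start ?startIndex .
+        ?textAnnotation fise:end ?endIndex .
+        ?textAnnotation dc:type ?nature .
+        # suggestions are optional: not every TextAnnotation has one
+        OPTIONAL { ?suggestions dc:relation ?textAnnotation }
+    }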
+
+Additionally:
+
+* The value of the 'dc:type' is well suited to select different style sheets. 
See the section for [fise:TextAnnotation](#fisetextannotation) for detailed 
information.
+* The UI might need to differentiate between TextAnnotations with/without 
suggestions.
+
+__Processing Suggested Entities__
+
+In principle there are three different cases:
+
+1. No suggestion: This indicates that a Named Entity was recognized during natural language processing, but no matching Entity was found within the knowledge base. In this case users might want to
+    * manually search the knowledge base for an Entity. The Stanbol Entityhub Sites Endpoint can be used to implement this feature by sending a "GET http://{host}:{port}/entityhub/sites/find?name={name}" (see the WebUI of your Stanbol instance for the detailed documentation).
+    * create a new Entity based on the current TextAnnotation. In this case the 'fise:selected-text' should be suggested as 'rdfs:label' and the 'dc:type' value could be used for the 'rdf:type'. New Entities can be added to the knowledge base by sending a "POST http://{host}:{port}/entityhub/entity" with the RDF data of the Entity as content (see the WebUI of your Stanbol instance for the detailed documentation).
+2. Distinct suggestion: This means that there is only a single suggestion with a high 'fise:confidence'. Multiple suggestions where the first one has a high confidence and the additional suggestions come with low confidence values may also fit this description. In such situations the UI might want to
+    * automatically accept the suggestion
+    * allow users to show additional suggestions on request
+    * allow users to undo the automatic acceptance of the suggestion
+3. Ambiguous suggestions: This is the case if multiple entities are suggested with a medium to high 'fise:confidence'. In those cases the user typically must provide additional input by
+    * selecting the correct entity
+    * rejecting all suggestions
+    * manually searching for and/or creating a new Entity, as described for (1)
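The three cases above can be told apart from the 'fise:confidence' values of the suggestions alone. The following is a minimal sketch, not part of Stanbol: the function name `classify_suggestions` and the thresholds `high=0.8` and `low=0.3` are assumptions chosen for illustration, since the text only speaks of "high", "medium" and "low" confidence.

```python
def classify_suggestions(confidences, high=0.8, low=0.3):
    """Classify the suggestions of one TextAnnotation.

    confidences: fise:confidence values of all suggestions; None stands for
    a suggestion without confidence information and is treated as 0.0.
    The thresholds are illustrative assumptions, not values defined by Stanbol.
    Returns "none", "distinct" or "ambiguous".
    """
    ranked = sorted((c if c is not None else 0.0 for c in confidences), reverse=True)
    if not ranked:
        return "none"  # case 1: no Entity was suggested
    top_is_high = ranked[0] >= high
    rivals = [c for c in ranked[1:] if c > low]
    if top_is_high and not rivals:
        return "distinct"  # case 2: one clear winner
    return "ambiguous"  # case 3: the user has to decide
```

A single suggestion with a low confidence is reported as "ambiguous" here, so that the UI asks the user instead of accepting it automatically. Whether that is the desired behaviour is a design choice of the UI.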
+
+The data required for the described interaction patterns are available within the enhancement results as follows:
+
+The following assumes {text-annotation} - the URI of the current 
'fise:TextAnnotation' - as context
+
+1. Query for/iterate over all entity suggestions: The suggestions for {text-annotation} can be acquired by using "?entityAnnotation dc:relation {text-annotation}".
+    * Only results with the 'rdf:type' 'fise:EntityAnnotation' should be processed. However, typically all results will be of that type anyway.
+    * The 'fise:confidence' property represents the confidence of the suggestion in the range from 0 (very uncertain) to 1 (very certain). Note that the 'fise:confidence' value is optional, so there might be EntityAnnotations without confidence information. However, all [EnhancementEngines managed by the Stanbol community](engines/list.html) do provide confidence information.
+2. Visualize suggestions: EntityAnnotations provide some basic information about the suggested Entity that can be used for visualization. Most important is the URI of the suggested entity, given as the value of 'fise:entity-reference'. Additionally, the label and the types of the Entity are included.
+3. Retrieving additional information about referenced Entities: While the EntityAnnotation includes some basic information, some users might want to retrieve all available information about referenced Entities, that is, to dereference the Entity:
+    * As this is a rather common use case, the [EntityLinkingEngine]() and [KeywordLinkingEngine]() are by default configured to include information about Entities within the EnhancementResults. Users of those EnhancementEngines will not need to dereference Entities, as that information is already available within the enhancement results.
+    * If a 'fise:EntityAnnotation' has the 'entityhub:site' property, Entities can be dereferenced by using the Stanbol Entityhub (see the section for [fise:EntityAnnotation](#fiseentityannotation) for details).
+    * In all other cases the URI of the suggested entity needs to be used for dereferencing. If the referenced Entity is part of the [Linked Data](http://linkeddata.org/) cloud, this is often possible by following the [Cool URIs](http://www.w3.org/TR/cooluris/) conventions: basically sending a "GET {entity-uri}" request with the header "Accept: application/rdf+json".

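As a rough illustration of the two HTTP requests mentioned in the text above (searching the Entityhub and dereferencing an Entity URI), the following Python sketch only builds the requests and does not send them. The helper names are hypothetical. The find URL follows the pattern quoted in the text, and the media type `application/rdf+json` is an assumption about what the remote server supports.

```python
from urllib.parse import urlencode
from urllib.request import Request


def entityhub_find_url(host, port, name):
    """Build the URL for GET http://{host}:{port}/entityhub/sites/find?name={name}."""
    return f"http://{host}:{port}/entityhub/sites/find?{urlencode({'name': name})}"


def dereference_request(entity_uri, accept="application/rdf+json"):
    """Build a GET request that asks for an RDF representation of the Entity.

    The Accept value is an assumption; other RDF serializations such as
    application/rdf+xml may be needed, depending on the server.
    """
    return Request(entity_uri, headers={"Accept": accept}, method="GET")
```

Passing the result of `dereference_request(...)` to `urllib.request.urlopen` would send the request. Error handling and parsing of the returned RDF are left out of this sketch.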
Added: 
incubator/stanbol/site/trunk/content/stanbol/docs/trunk/enhancer/hallo-annotate_scrrenshot.png
URL: 
http://svn.apache.org/viewvc/incubator/stanbol/site/trunk/content/stanbol/docs/trunk/enhancer/hallo-annotate_scrrenshot.png?rev=1344200&view=auto
==============================================================================
Binary file - no diff available.

Propchange: 
incubator/stanbol/site/trunk/content/stanbol/docs/trunk/enhancer/hallo-annotate_scrrenshot.png
------------------------------------------------------------------------------
    svn:mime-type = application/octet-stream

