Rupert Westenthaler created STANBOL-617:
-------------------------------------------

             Summary: Define how TopicEnhancements are written to the Enhancement Structure
                 Key: STANBOL-617
                 URL: https://issues.apache.org/jira/browse/STANBOL-617
             Project: Stanbol
          Issue Type: Bug
            Reporter: Rupert Westenthaler
            Assignee: Rupert Westenthaler
            Priority: Minor
             Fix For: 0.10.0-incubating


In the future, three Enhancement Engines will annotate Topics extracted from 
analyzed ContentItems:

* Topic Engine
* Zemanta Engine
* CELI Classification Engine (See STANBOL-583)

While all three annotate Topics in a very similar way, there are some small 
variations that need to be aligned to make it easier for users to consume 
those annotations.

A Topic Annotation is a special type of annotation that is very similar to a 
fise:EntityAnnotation. The following listing shows the expected triples:

(1)    ?topicAnno rdf:type fise:TopicAnnotation
(2)    ?topicAnno fise:entity-reference ?topic-uri
(3)    ?topicAnno fise:entity-label ?topic-label
(4)    ?topicAnno fise:entity-type ?topic-type
(5)    ?topicAnno dc:relation ?textAnno

(6)    ?textAnno rdf:type fise:TextAnnotation
(7)    ?textAnno fise:start ?sectionStartPos
(8)    ?textAnno fise:end ?sectionEndPos

(1,3,5,6) are required.
(2) defines the URI of the assigned Topic. It might not be available if the 
Topic has only a label and has not been formally assigned a unique ID.
(4) defines the type of the Topic. It is strongly suggested to use 
skos:Concept as the type.
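
As a rough illustration of the structure above, the following Python sketch 
builds the triples for a single topic as plain tuples of prefixed names. The 
function name, the urn:example: identifiers and the tuple representation are 
assumptions made for this example and are not part of the Stanbol API. It also 
assumes that (5) links the TopicAnnotation to the TextAnnotation described by 
(6)-(8).

```python
def topic_annotation_triples(topic_ann, text_ann, label, topic_uri=None,
                             topic_type="skos:Concept"):
    """Return the triples for one topic as (subject, predicate, object) tuples.

    Hypothetical helper: identifiers are plain strings, not real RDF nodes.
    """
    triples = [
        (topic_ann, "rdf:type", "fise:TopicAnnotation"),  # (1) required
        (topic_ann, "fise:entity-label", label),          # (3) required
        (topic_ann, "dc:relation", text_ann),             # (5) required
        (text_ann, "rdf:type", "fise:TextAnnotation"),    # (6) required
    ]
    if topic_uri is not None:
        # (2) only present when the topic has a formally assigned URI
        triples.append((topic_ann, "fise:entity-reference", topic_uri))
    if topic_type is not None:
        # (4) skos:Concept is the suggested default
        triples.append((topic_ann, "fise:entity-type", topic_type))
    return triples


# A topic that only has a label, so (2) is omitted.
triples = topic_annotation_triples(
    "urn:example:topic-annotation-1",
    "urn:example:text-annotation-1",
    "Sports",
)
for s, p, o in triples:
    print(s, p, o)
```

Passing topic_uri adds triple (2); leaving it out models a topic that is only 
known by its label.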

(6,7,8) link the fise:TopicAnnotation with the text. (7,8) are required if a 
topic needs to be assigned to a sub-section of the analyzed content. 
NOTE: fise:selected-text and fise:selection-context are not used in this 
example because those texts could become very large for bigger sections. We 
still need to define a better way to express the context for TextAnnotations 
that select whole sections of the parsed content.
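
One possible reading of the note above is that a consumer can recover the 
section text from the offsets instead of having it copied into the annotation. 
The sketch below is a minimal Python illustration under the assumption that 
fise:start is an inclusive and fise:end an exclusive character offset into the 
content; this issue does not state that convention. The function names and the 
urn:example: identifier are hypothetical.

```python
def section_text_annotation(text_ann, start, end):
    """Build triples (6)-(8) for a section, without fise:selected-text."""
    if start < 0 or end < start:
        raise ValueError("expected 0 <= start <= end")
    return [
        (text_ann, "rdf:type", "fise:TextAnnotation"),  # (6)
        (text_ann, "fise:start", start),                # (7)
        (text_ann, "fise:end", end),                    # (8)
    ]


def selected_section(content, triples, text_ann):
    """Recover the annotated section from the content using the offsets."""
    values = {p: o for s, p, o in triples if s == text_ann}
    return content[values["fise:start"]:values["fise:end"]]


content = "Intro paragraph. Match report: the home team won."
start = content.index("Match report")
triples = section_text_annotation(
    "urn:example:text-annotation-1", start, len(content)
)
print(selected_section(content, triples, "urn:example:text-annotation-1"))
```

The annotation stays small regardless of the section length, at the cost of 
requiring access to the original content when the text is needed.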

As far as I know, the TopicEngine already follows this approach. The 
ZemantaEngine and the CELI Classification Engine need to be adapted (as part of 
this issue) to conform to the defined structure.
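
When adapting an engine, a small check along the following lines could flag 
output that misses the required triples (1,3,5,6). This is only a sketch: the 
function name, the tuple representation and the sample engine output are 
assumptions for illustration, and it reads (5) as a link to a resource typed 
as fise:TextAnnotation.

```python
def missing_requirements(triples, topic_ann):
    """List the required triples that are absent for one topic annotation."""
    props = {}
    for s, p, o in triples:
        props.setdefault((s, p), set()).add(o)

    problems = []
    if "fise:TopicAnnotation" not in props.get((topic_ann, "rdf:type"), set()):
        problems.append("(1) rdf:type fise:TopicAnnotation")
    if not props.get((topic_ann, "fise:entity-label")):
        problems.append("(3) fise:entity-label")
    related = props.get((topic_ann, "dc:relation"), set())
    if not related:
        problems.append("(5) dc:relation")
    # (6): at least one related resource must be typed as a TextAnnotation
    if not any(
        "fise:TextAnnotation" in props.get((r, "rdf:type"), set())
        for r in related
    ):
        problems.append("(6) rdf:type fise:TextAnnotation")
    return problems


# Hypothetical engine output that only provides the type and a topic URI.
engine_output = [
    ("urn:example:ta-1", "rdf:type", "fise:TopicAnnotation"),
    ("urn:example:ta-1", "fise:entity-reference", "urn:example:topic/sports"),
]
print(missing_requirements(engine_output, "urn:example:ta-1"))
```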




        
