TextAnnotations should use PlainLiterals instead of TypedLiterals for the 
selected-text and context
---------------------------------------------------------------------------------------------------

                 Key: STANBOL-509
                 URL: https://issues.apache.org/jira/browse/STANBOL-509
             Project: Stanbol
          Issue Type: Improvement
          Components: Enhancer
            Reporter: Rupert Westenthaler
            Assignee: Rupert Westenthaler
            Priority: Minor


Currently, all EnhancementEngines that create TextAnnotations use TypedLiterals 
of the type xsd:string for the values of the fise:selected-text and 
fise:selection-context properties. However, both values are natural-language 
text, so it would be better to use PlainLiterals and to add the language 
detected for the parsed content.

Example:

Parsed content: "The Stanbol enhancer can detect famous cities such as Paris 
and people such as Bob Marley."
Detected language: "en"
Text annotations: "Paris" and "Bob Marley"

Currently, the selection context and the selected text would be represented like this:

    <fise:selection-context rdf:datatype="http://www.w3.org/2001/XMLSchema#string">The Stanbol enhancer can detect famous cities such as Paris and people such as Bob Marley.</fise:selection-context>
    <fise:selected-text rdf:datatype="http://www.w3.org/2001/XMLSchema#string">Bob Marley</fise:selected-text>

After this issue is resolved, the same information would be represented like this:

    <fise:selection-context xml:lang="en">The Stanbol enhancer can detect famous cities such as Paris and people such as Bob Marley.</fise:selection-context>
    <fise:selected-text xml:lang="en">Bob Marley</fise:selected-text>
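
The difference between the two forms can be illustrated with a small, 
library-free sketch. This is not Stanbol or Clerezza code: the `Literal` class 
and the `to_rdfxml` helper are hypothetical names introduced only to show that 
a literal carries either a datatype (typed literal) or a language tag (plain 
literal with language), and how each ends up in the RDF/XML shown above.

```python
from dataclasses import dataclass
from typing import Optional
from xml.sax.saxutils import escape, quoteattr

XSD_STRING = "http://www.w3.org/2001/XMLSchema#string"


@dataclass(frozen=True)
class Literal:
    """Hypothetical stand-in for an RDF literal (not a Stanbol class)."""
    lexical_form: str
    datatype: Optional[str] = None   # set for typed literals
    language: Optional[str] = None   # set for plain literals with a language

    def __post_init__(self):
        # In this sketch a literal has a datatype or a language tag, never both.
        if self.datatype is not None and self.language is not None:
            raise ValueError("a literal cannot have both datatype and language")


def to_rdfxml(prop: str, lit: Literal) -> str:
    """Render one property element in the style of the examples above."""
    if lit.datatype is not None:
        attr = f" rdf:datatype={quoteattr(lit.datatype)}"
    elif lit.language is not None:
        attr = f" xml:lang={quoteattr(lit.language)}"
    else:
        attr = ""
    return f"<{prop}{attr}>{escape(lit.lexical_form)}</{prop}>"


current = Literal("Bob Marley", datatype=XSD_STRING)   # representation today
proposed = Literal("Bob Marley", language="en")        # representation proposed here

print(to_rdfxml("fise:selected-text", current))
print(to_rdfxml("fise:selected-text", proposed))
```

In a real engine the literals would be created through the RDF API that 
Stanbol uses rather than rendered by hand; the sketch only makes the 
datatype-versus-language distinction explicit.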

Advantages:

* The suggested representation is more in line with the semantic meaning.
* Engines that consume text selections could use the language provided by the 
current TextAnnotation. This would make it possible to search correctly for 
entities in documents that contain parts in multiple languages.
    * Such engines could still use the language annotation of the document as a 
fallback if the TextAnnotation provides no language (backward compatibility).
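
The fallback described in the last point could look roughly like the following 
sketch. The function name `effective_language` and its parameters are 
assumptions made for illustration; they are not part of the Stanbol API.

```python
from typing import Optional


def effective_language(annotation_lang: Optional[str],
                       document_lang: Optional[str]) -> Optional[str]:
    """Pick the language a consuming engine should use for a text selection.

    Prefer the language tag found on the TextAnnotation literal. If the
    annotation has none (for example, it was written by an older engine that
    still uses xsd:string), fall back to the document-level language.
    Returns None when neither is known.
    """
    if annotation_lang:
        return annotation_lang
    return document_lang


# Annotation carries its own language: it wins over the document language.
print(effective_language("fr", "en"))
# Annotation without a language tag: the document language is used instead.
print(effective_language(None, "en"))
```

Treating an empty language tag the same as a missing one is a choice made in 
this sketch, so that literals without a usable tag also fall back to the 
document language.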



