TextAnnotations should use PlainLiterals instead of TypedLiterals for the
selected-text and context
---------------------------------------------------------------------------------------------------
Key: STANBOL-509
URL: https://issues.apache.org/jira/browse/STANBOL-509
Project: Stanbol
Issue Type: Improvement
Components: Enhancer
Reporter: Rupert Westenthaler
Assignee: Rupert Westenthaler
Priority: Minor
Currently all EnhancementEngines that create TextAnnotations use TypedLiterals
of the type xsd:string for the values of the fise:selected-text and
fise:selection-context properties. However, both values are in fact natural
language text, so it would be better to use PlainLiterals and also add the
language detected for the parsed content.
Example:
Parsed content: "The Stanbol enhancer can detect famous cities such as Paris
and people such as Bob Marley."
Detected language: "en"
Text Annotations: "Paris" and "Bob Marley"
Currently the selection context and the selected text would be represented as:
<fise:selection-context
rdf:datatype="http://www.w3.org/2001/XMLSchema#string">The Stanbol enhancer can
detect famous cities such as Paris and people such as Bob
Marley.</fise:selection-context>
<fise:selected-text
rdf:datatype="http://www.w3.org/2001/XMLSchema#string">Bob
Marley</fise:selected-text>
After this issue is resolved, the same information would be represented as:
<fise:selection-context xml:lang="en">The Stanbol enhancer can detect
famous cities such as Paris and people such as Bob
Marley.</fise:selection-context>
<fise:selected-text xml:lang="en">Bob Marley</fise:selected-text>
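The difference between the two literal kinds can be sketched in a few lines of
Java. This is only an illustration: LiteralSketch and its methods are
hypothetical names, not the RDF API that Stanbol engines actually use, and the
output is rendered in N-Triples style rather than RDF/XML to keep it short.

```java
/** Hypothetical stand-in for an RDF literal; NOT the Stanbol/Clerezza API. */
final class LiteralSketch {
    static final String XSD_STRING = "http://www.w3.org/2001/XMLSchema#string";

    final String lexicalForm;
    final String datatype; // null for plain literals
    final String language; // null if no language tag is set

    private LiteralSketch(String lexicalForm, String datatype, String language) {
        this.lexicalForm = lexicalForm;
        this.datatype = datatype;
        this.language = language;
    }

    /** Current behaviour: a typed literal of type xsd:string, no language. */
    static LiteralSketch typed(String text) {
        return new LiteralSketch(text, XSD_STRING, null);
    }

    /** Proposed behaviour: a plain literal carrying the detected language. */
    static LiteralSketch plain(String text, String language) {
        return new LiteralSketch(text, null, language);
    }

    /** Renders the literal in N-Triples style. */
    String toNTriples() {
        // Escape backslashes first, then double quotes.
        String quoted = "\"" + lexicalForm.replace("\\", "\\\\").replace("\"", "\\\"") + "\"";
        if (datatype != null) {
            return quoted + "^^<" + datatype + ">";
        }
        if (language != null) {
            return quoted + "@" + language;
        }
        return quoted;
    }
}
```

With this sketch, LiteralSketch.typed("Bob Marley") corresponds to the current
representation and LiteralSketch.plain("Bob Marley", "en") to the proposed one.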
Advantages:
* The suggested representation is more in line with the semantic meaning of
the values
* Engines that consume text selections could use the language provided by the
current TextAnnotation. This would make it possible to search correctly for
entities in documents containing parts in multiple languages.
* Such engines could still use the language annotation for the document as a
fallback if no language is provided by the TextAnnotation (backward
compatibility)
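
The fallback described in the last point could look roughly like the following
Java sketch. LanguageFallbackSketch and resolveLanguage are hypothetical names
used for illustration only. How the two language values are read from the
enhancement metadata is deliberately left out, so they are passed in as plain
strings (null or empty meaning "not available").

```java
import java.util.Optional;

/** Hypothetical helper; names are illustrative and not part of Stanbol. */
final class LanguageFallbackSketch {

    /**
     * Prefers the language attached to the TextAnnotation literal and falls
     * back to the language detected for the whole document.
     */
    static Optional<String> resolveLanguage(String annotationLanguage,
                                            String documentLanguage) {
        if (annotationLanguage != null && !annotationLanguage.isEmpty()) {
            return Optional.of(annotationLanguage);
        }
        // Older TextAnnotations (typed literals) carry no language at all.
        if (documentLanguage != null && !documentLanguage.isEmpty()) {
            return Optional.of(documentLanguage);
        }
        return Optional.empty();
    }
}
```

A consuming engine would then search for entities in the resolved language, or
without a language restriction if the result is empty.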