Improvements to the JSON serialization of Entityhub Representations
-------------------------------------------------------------------
Key: STANBOL-300
URL: https://issues.apache.org/jira/browse/STANBOL-300
Project: Stanbol
Issue Type: Improvement
Components: Entity Hub
Reporter: Rupert Westenthaler
Assignee: Rupert Westenthaler
Priority: Minor
1. Dates:
Internally java.util.Date is used, and the JSON serialization calls its
toString() method to generate the String representation. This needs to be
changed to produce valid ISO 8601 timestamps in UTC.
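A minimal sketch of the intended output format follows. It assumes the java.time API is available; the issue does not say which API the fix will use, and the class and method names here are hypothetical.

```java
import java.time.format.DateTimeFormatter;
import java.util.Date;

final class Iso8601Dates {

    /** Formats a java.util.Date as an ISO 8601 timestamp in UTC. */
    static String toIso8601Utc(Date date) {
        // An Instant is always in UTC; ISO_INSTANT appends the trailing 'Z'.
        return DateTimeFormatter.ISO_INSTANT.format(date.toInstant());
    }

    public static void main(String[] args) {
        Date epoch = new Date(0L);
        // Date.toString() depends on the default time zone of the JVM.
        System.out.println(epoch);
        // prints 1970-01-01T00:00:00Z regardless of the default time zone
        System.out.println(toIso8601Utc(epoch));
    }
}
```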
2. Use JSON data types:
Currently all values are serialized as Strings. Therefore String values are
used in JSON even for numbers (e.g. for float values "value": "0.53789765").
Native JSON types should be used where possible.
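One way to pick the JSON literal from the Java type of a value is sketched below. This is an illustration under assumptions, not the Entityhub code: the class and method names are hypothetical, and non-finite floating point numbers fall back to strings because JSON has no literal for them.

```java
final class JsonLiterals {

    /** Returns the JSON literal for a value, using native JSON types where possible. */
    static String toJsonLiteral(Object value) {
        if (value == null) {
            return "null";
        }
        if (value instanceof Boolean) {
            return value.toString(); // true / false
        }
        if (value instanceof Double || value instanceof Float) {
            double d = ((Number) value).doubleValue();
            // NaN and the infinities are not valid JSON numbers.
            if (Double.isNaN(d) || Double.isInfinite(d)) {
                return quote(value.toString());
            }
            return value.toString();
        }
        if (value instanceof Number) {
            return value.toString(); // unquoted JSON number
        }
        return quote(value.toString());
    }

    /** Wraps a string in double quotes and escapes the characters JSON requires. */
    static String quote(String s) {
        StringBuilder sb = new StringBuilder("\"");
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            if (c == '"' || c == '\\') {
                sb.append('\\').append(c);
            } else if (c < 0x20) {
                sb.append(String.format("\\u%04x", (int) c));
            } else {
                sb.append(c);
            }
        }
        return sb.append('"').toString();
    }

    public static void main(String[] args) {
        System.out.println("\"value\": " + toJsonLiteral(0.5));  // number, no quotes
        System.out.println("\"value\": " + toJsonLiteral("Ag")); // string, quoted
    }
}
```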
3. Entityhub value types and "xsd:datatype":
All values of Representations do have a "type" property in the JSON
serialization. This is used to encode the type of the value as internally used
by the Entityhub. The Entityhub distinguishes between Text (natural language),
Reference (links to other resources) and Value (all other types). For parsing
values this is not sufficient because clients cannot determine the exact
datatype of the values. Therefore the JSON serialization needs to be extended
to provide the xsd:datatype of values. The current "type" values are kept for
now for compatibility reasons, but their use is deprecated.
a. natural language texts and xsd:string values:
Internally the Entityhub distinguishes between string values and natural
language text. RDF serializations make the same distinction: natural language
texts have no "xsd:datatype" and an optional "xml:lang", while string values
have the "xsd:datatype" set to "xsd:string". RDF frameworks do not allow
setting both an xsd:datatype and an xml:lang on an RDF literal. Therefore the
JSON serialization of the Entityhub should follow the same rule.
Example for natural language texts: Natural language texts may have an
"xml:lang" but MUST NOT have an "xsd:datatype" value.
"http:\/\/www.w3.org\/2000\/01\/rdf-schema#label": [{
"type": "text",
"xml:lang": "en",
"value": "Natural language value"
}]
Example for a string value: The notation "Ag" for Silver in the periodic table
is clearly a String value and not natural language text.
"http:\/\/www.w3.org\/2004\/02\/skos\/core#notation": [{
"type": "value",
"xsd:datatype": "xsd:string",
"value": "Ag"
}]
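The rule that "xml:lang" and "xsd:datatype" never appear together can be sketched as two separate builders. The class and method names are hypothetical; only the JSON keys and values are taken from the examples above.

```java
import java.util.LinkedHashMap;
import java.util.Map;

final class TextOrStringValue {

    /** Natural language text: optional "xml:lang", never an "xsd:datatype". */
    static Map<String, Object> text(String value, String lang) {
        Map<String, Object> json = new LinkedHashMap<>();
        json.put("type", "text");
        if (lang != null) {
            json.put("xml:lang", lang);
        }
        json.put("value", value);
        return json;
    }

    /** Plain string value: "xsd:datatype" is xsd:string, never an "xml:lang". */
    static Map<String, Object> string(String value) {
        Map<String, Object> json = new LinkedHashMap<>();
        json.put("type", "value");
        json.put("xsd:datatype", "xsd:string");
        json.put("value", value);
        return json;
    }

    public static void main(String[] args) {
        System.out.println(text("Natural language value", "en"));
        System.out.println(string("Ag"));
    }
}
```

Keeping the two cases in separate methods means no caller can set both keys on the same value.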
b. References and "xsd:anyUri": The Entityhub allows both
"entityhub:reference" and "xsd:anyUri" to be used for references. Internally
both are handled by the Reference interface. Because of this the JSON
serialization should use "xsd:anyUri" as the "xsd:datatype".
"http:\/\/www.w3.org\/1999\/02\/22-rdf-syntax-ns#type": [{
"type": "reference",
"xsd:datatype": "xsd:anyUri",
"value": "http:\/\/www.w3.org\/2002\/07\/owl#Thing"
}]
c. Other supported types: The entityhub supports several (but not all)
xsd:datatypes. See the definitions in the
org.apache.stanbol.entityhub.servicesapi.defaults.NamespaceEnum for details.
All such values will have "type": "value" and "xsd:datatype": "{xsd-datatype}"
as properties.
"http:\/\/www.iks-project.eu\/ontology\/rick\/query\/score": [{
"type": "value",
"xsd:datatype": "xsd:float",
"value": "0.53789765"
}]
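To illustrate why clients need the "xsd:datatype", here is a hedged sketch of client-side decoding. The set of datatypes handled is an assumption chosen for illustration (the authoritative list is in NamespaceEnum), the class and method names are hypothetical, and the "xsd:anyUri" spelling follows the examples in this issue.

```java
import java.net.URI;
import java.time.Instant;

final class XsdValueParser {

    /** Converts the lexical form of a value to a Java object based on its xsd:datatype. */
    static Object parse(String xsdDatatype, String lexical) {
        if (xsdDatatype == null) {
            return lexical; // natural language text carries no xsd:datatype
        }
        switch (xsdDatatype) {
            case "xsd:string":
                return lexical;
            case "xsd:anyUri": // spelling as used in this issue
                return URI.create(lexical);
            case "xsd:boolean":
                // xsd:boolean allows "true"/"false" and "1"/"0"
                return "true".equals(lexical) || "1".equals(lexical);
            case "xsd:int":
                return Integer.valueOf(lexical);
            case "xsd:long":
                return Long.valueOf(lexical);
            case "xsd:float":
                return Float.valueOf(lexical);
            case "xsd:double":
                return Double.valueOf(lexical);
            case "xsd:dateTime":
                return Instant.parse(lexical); // expects an ISO 8601 instant in UTC
            default:
                return lexical; // unknown datatype: keep the lexical form
        }
    }

    public static void main(String[] args) {
        Object score = parse("xsd:float", "0.53789765");
        System.out.println(score.getClass().getSimpleName()); // prints Float
    }
}
```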