Improvements to the JSON serialization of Entityhub Representations
-------------------------------------------------------------------

                 Key: STANBOL-300
                 URL: https://issues.apache.org/jira/browse/STANBOL-300
             Project: Stanbol
          Issue Type: Improvement
          Components: Entity Hub
            Reporter: Rupert Westenthaler
            Assignee: Rupert Westenthaler
            Priority: Minor


1. Dates:

Internally java.util.Date is used, and the JSON serialization calls its
toString() method to generate the String representation. This needs to be
changed to produce valid ISO 8601 timestamps in UTC.
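
A minimal sketch of the intended date formatting, assuming Java 8 or later is
available. The class and method names (IsoDates, toIso8601Utc) are hypothetical
and not part of the Entityhub API:

```java
import java.time.format.DateTimeFormatter;
import java.util.Date;

final class IsoDates {
    private IsoDates() {}

    /** Formats a java.util.Date as an ISO 8601 timestamp in UTC ("Z" suffix). */
    static String toIso8601Utc(Date date) {
        // ISO_INSTANT always renders the instant in UTC, independent of the
        // JVM default time zone, which is what Date.toString() depends on.
        return DateTimeFormatter.ISO_INSTANT.format(date.toInstant());
    }
}
```

For example, IsoDates.toIso8601Utc(new Date(0L)) yields "1970-01-01T00:00:00Z".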

2. Use JSON data types:

Currently all values are serialized as Strings. Therefore String values are
used in JSON even for numbers (e.g. for float values "value": "0.53789765").
Native JSON types should be used where possible.
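
One way to picture this is a helper that picks the JSON literal based on the
Java type of the value. This is only a sketch under assumptions: the class
JsonLiterals is hypothetical, and an actual implementation would more likely
rely on the JSON library already in use:

```java
final class JsonLiterals {
    private JsonLiterals() {}

    /** Returns the JSON literal for a value, using native JSON types where possible. */
    static String toJsonLiteral(Object value) {
        if (value == null) {
            return "null";
        }
        if (value instanceof Boolean) {
            return value.toString(); // true / false without quotes
        }
        if (value instanceof Double || value instanceof Float) {
            double d = ((Number) value).doubleValue();
            // NaN and Infinity have no JSON number form, so fall back to a string.
            return (Double.isNaN(d) || Double.isInfinite(d))
                    ? quote(value.toString())
                    : value.toString();
        }
        if (value instanceof Number) {
            return value.toString(); // integers and decimals without quotes
        }
        return quote(value.toString());
    }

    /** Wraps a string in double quotes and escapes characters JSON requires. */
    static String quote(String s) {
        StringBuilder sb = new StringBuilder("\"");
        for (char c : s.toCharArray()) {
            if (c == '"' || c == '\\') {
                sb.append('\\').append(c);
            } else if (c < 0x20) {
                sb.append(String.format("\\u%04x", (int) c));
            } else {
                sb.append(c);
            }
        }
        return sb.append('"').toString();
    }
}
```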

3. Entityhub value types and "xsd:datatype":

All values of Representations have a "type" property in the JSON
serialization. It encodes the type of the value as used internally by the
Entityhub, which distinguishes between Text (natural language), Reference
(links to other resources) and Value (all other types). For parsing values
this is not sufficient, because clients cannot determine the exact datatype
of the values. Therefore the JSON serialization needs to be extended to
provide the xsd:datatype of values. The current "type" values are kept for
now for compatibility reasons, but their usage is deprecated.

a. natural language texts and xsd:string values:

Internally the Entityhub distinguishes between string values and natural
language text. This distinction is also possible in RDF serializations.
Natural language texts have no "xsd:datatype" and an optional "xml:lang".
For string values the "xsd:datatype" is set to "xsd:string". RDF frameworks
likewise do not allow setting both an xsd:datatype and an xml:lang on an RDF
literal. Therefore the JSON serialization of the Entityhub should follow the
same rule.

Example for natural language texts: Natural language texts may have an
"xml:lang" but MUST NOT have an "xsd:datatype" value.

    "http:\/\/www.w3.org\/2000\/01\/rdf-schema#label": [{
        "type": "text",
        "xml:lang": "en",
        "value": "Natural language value"
    }]

Example for a string value: The notation "Ag" for Silver in the periodic table 
is clearly a String value and not natural language text.

    "http:\/\/www.w3.org\/2004\/02\/skos\/core#notation": [{
        "type": "value",
        "xsd:datatype": "xsd:string",
        "value": "Ag"
    }]
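
The rule from the two examples above (either "xml:lang" or "xsd:datatype",
never both) could be enforced when building the value entries. The following
is a sketch only; the class ValueEntries and its methods are hypothetical
names, and plain Maps stand in for whatever JSON object type the serializer
actually uses:

```java
import java.util.LinkedHashMap;
import java.util.Map;

final class ValueEntries {
    private ValueEntries() {}

    /** Natural language text: optional "xml:lang", never an "xsd:datatype". */
    static Map<String, String> text(String value, String lang) {
        Map<String, String> entry = new LinkedHashMap<>();
        entry.put("type", "text");
        if (lang != null) {
            entry.put("xml:lang", lang);
        }
        entry.put("value", value);
        return entry;
    }

    /** Plain string value: "xsd:datatype" is xsd:string, never an "xml:lang". */
    static Map<String, String> string(String value) {
        Map<String, String> entry = new LinkedHashMap<>();
        entry.put("type", "value");
        entry.put("xsd:datatype", "xsd:string");
        entry.put("value", value);
        return entry;
    }
}
```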


b. References and "xsd:anyUri": The Entityhub allows both
"entityhub:reference" and "xsd:anyUri" for references. Internally both are
handled by the Reference interface. Because of this the JSON serialization
should use "xsd:anyUri" as "xsd:datatype".

    "http:\/\/www.w3.org\/1999\/02\/22-rdf-syntax-ns#type": [{
        "type": "reference",
        "xsd:datatype": "xsd:anyUri",
        "value": "http:\/\/www.w3.org\/2002\/07\/owl#Thing"
    }]

c. Other supported types: The entityhub supports several (but not all) 
xsd:datatypes. See the definitions in the 
org.apache.stanbol.entityhub.servicesapi.defaults.NamespaceEnum for details. 
All such values will have "type": "value" and "xsd:datatype": "{xsd-datatype}" 
as properties.

    "http:\/\/www.iks-project.eu\/ontology\/rick\/query\/score": [{
        "type": "value",
        "xsd:datatype": "xsd:float",
        "value": "0.53789765"
    }]
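
On the client side, the "xsd:datatype" property makes it possible to convert
the lexical value into a typed object. The sketch below illustrates the idea.
The class TypedValues is hypothetical, and the set of datatypes it handles is
an assumption chosen for illustration; the datatypes actually supported are
those defined by the Entityhub:

```java
import java.net.URI;
import java.time.Instant;

final class TypedValues {
    private TypedValues() {}

    /** Converts a lexical value to a Java object based on its xsd:datatype. */
    static Object decode(String xsdDatatype, String lexical) {
        if (xsdDatatype == null) {
            return lexical; // natural language text carries no datatype
        }
        switch (xsdDatatype) {
            case "xsd:float":
                return Float.valueOf(lexical);
            case "xsd:double":
                return Double.valueOf(lexical);
            case "xsd:int":
                return Integer.valueOf(lexical);
            case "xsd:long":
                return Long.valueOf(lexical);
            case "xsd:boolean":
                // xsd:boolean allows "true"/"false" as well as "1"/"0"
                return "true".equals(lexical) || "1".equals(lexical);
            case "xsd:anyUri":
                return URI.create(lexical);
            case "xsd:dateTime":
                // assumes ISO 8601 in UTC with a "Z" suffix
                return Instant.parse(lexical);
            default:
                return lexical; // xsd:string and unknown types stay strings
        }
    }
}
```

With the score example above, TypedValues.decode("xsd:float", "0.53789765")
returns a java.lang.Float instead of a String.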



--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

        
