On 12/05/14 18:26, Tim Harsch wrote:
According to the docs:
http://jena.apache.org/documentation/notes/typed-literals.html
These are all available as static member variables from
com.hp.hpl.jena.datatypes.xsd.XSDDatatype
<http://jena.apache.org/documentation/javadoc/jena/com/hp/hpl/jena/datatypes/xsd/XSDDatatype.html>.
Of these types, the following are registered as the default type to use to
represent certain Java classes:
Java class    xsd type
Float         float
Double        double
Integer       int
Long          long
Short         short
Byte          byte
BigInteger    integer
BigDecimal    decimal
Boolean       boolean
String        string
This is what I am seeing for xsd:short and xsd:byte. I'm puzzled by the
type from getValue.
CODE:
System.out.println("RDFDatatype: " + literal.getDatatype().toString());
System.out.println("Datatype URI: " + literal.getDatatypeURI());
System.out.println("getValue java class: " + ((Literal) literal).getValue().getClass());
OUTPUT:
RDFDatatype: Datatype[http://www.w3.org/2001/XMLSchema#byte -> class java.lang.Byte]
Datatype URI: http://www.w3.org/2001/XMLSchema#byte
getValue java class: class java.lang.Integer
RDFDatatype: Datatype[http://www.w3.org/2001/XMLSchema#short -> class java.lang.Short]
Datatype URI: http://www.w3.org/2001/XMLSchema#short
getValue java class: class java.lang.Integer
So, is this the expected behavior?
Yes, or at least that's the implemented behaviour and has been for some
time.
The getValue() code picks a Java datatype big enough for the actual
value out of Integer, Long and BigInteger.
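As a rough illustration of that "smallest type that fits" rule, here is a minimal sketch in plain Java. It is not Jena's actual source; the class name SmallestIntegerType and the method narrow are made up for this example, and it assumes the lexical form is a plain decimal integer.

```java
import java.math.BigInteger;

// Illustrative only: NOT Jena's implementation. All names are hypothetical.
class SmallestIntegerType {

    // Parse an integer lexical form and return it as the smallest of
    // Integer, Long or BigInteger that can hold the value.
    static Number narrow(String lexical) {
        BigInteger big = new BigInteger(lexical.trim());
        // bitLength() excludes the sign bit, so <= 31 means "fits in an int"
        // and <= 63 means "fits in a long".
        if (big.bitLength() <= 31) {
            return Integer.valueOf(big.intValue());
        }
        if (big.bitLength() <= 63) {
            return Long.valueOf(big.longValue());
        }
        return big;
    }

    public static void main(String[] args) {
        System.out.println(narrow("5").getClass());                   // class java.lang.Integer
        System.out.println(narrow("2147483648").getClass());          // class java.lang.Long
        System.out.println(narrow("9223372036854775808").getClass()); // class java.math.BigInteger
    }
}
```

Note that the declared datatype (xsd:byte, xsd:short, ...) plays no part in this choice, which is why a byte-sized value still comes back as an Integer.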
Arguably it would be better if it round tripped so that a java short
would become an xsd:short and would return a Short from getValue.
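Until then, a caller who wants the declared Java type back can convert on their own side. The following is only a sketch under the assumption that getValue() returns some Number for these datatypes, as in the output above; XsdNarrowing and toDeclaredType are hypothetical names, not part of Jena.

```java
// Hypothetical caller-side helper, not part of Jena.
class XsdNarrowing {

    static final String XSD = "http://www.w3.org/2001/XMLSchema#";

    // Convert the Number returned by getValue() to the Java class that
    // matches the literal's declared datatype URI. Anything other than
    // xsd:byte and xsd:short is returned unchanged.
    static Number toDeclaredType(Number value, String datatypeURI) {
        if ((XSD + "byte").equals(datatypeURI)) {
            return Byte.valueOf(value.byteValue());
        }
        if ((XSD + "short").equals(datatypeURI)) {
            return Short.valueOf(value.shortValue());
        }
        return value;
    }
}
```

It would be called along the lines of toDeclaredType((Number) literal.getValue(), literal.getDatatypeURI()). The conversion is safe only because a valid xsd:byte or xsd:short literal is already within range; byteValue() and shortValue() would silently truncate anything larger.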
The issue is largely historical. Partly it's that the code was developed
while the RDF datatype handling was still in flux. Partly it's
convenience - a lot of people use xsd:integer (i.e. arbitrary size) in
their RDF (because that's what you get in Turtle if you use number
syntax) but expect them to be Integers in java "unless they are too
big". Round-tripping from java was never a requirement. Having once
implemented it that way, we would have a backward compatibility issue if we
wanted to change it.
I suspect that changing it so that short and byte round tripped would be
OK. But equally I suspect that dropping the truncation of smaller
BigIntegers to Integers would cause problems.
This might be something to revisit in any future Jena 3, though it doesn't
seem like much of a priority - xsd:byte and xsd:short don't seem to be
very much used in RDF in the wild.
Dave