Hi Tim,

On 13/05/14 18:53, Tim Harsch wrote:
Thanks Dave.  Makes sense.   Why though does RDFDatatype say the class
would be Byte and would be Short ?  I guess there is no code that consults
RDFDatatype to ask what the type should be before creating it.   Is this
just an inconsistency in the API?  Or a bug in the code?

Arguably an insufficiently clear javadoc.

The issue is that the TypeMapper, which tells you what datatype to use when *encoding* a java type, is currently initialized from the getJavaClass() for those datatypes. We wanted people to be able to use shorts and bytes in java and still get them encoded appropriately.

Which is why the javadoc for RDFDatatype#getJavaClass says:

"""
If this datatype is used as the cannonical representation for a particular java datatype then return that java type, otherwise returns null.
"""

I.e. it records the java to xsd mapping, which is not the same as the xsd to java mapping if we don't enforce strict round tripping.

In fact the type mapper allows you to register types directly, which allows us to have a many-to-one map from java class to RDF datatype. So the use of getJavaClass is not really necessary and arguably confusing in a world without round tripping guarantees.
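
To make the two directions concrete, here is a minimal sketch in plain Java. It is not Jena code: the class name, the maps and the "xsd:" strings are all hypothetical stand-ins, and the decode table simply mirrors the behaviour reported in this thread (byte and short values coming back as Integer; the xsd:int entry is my assumption).

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical illustration only; none of these names exist in Jena.
final class MappingDirectionSketch {
    // Encoding direction: which datatype to use for a given Java class.
    private static final Map<Class<?>, String> JAVA_TO_XSD = new HashMap<>();
    // Decoding direction: which Java class a value of that datatype comes back as.
    private static final Map<String, Class<?>> XSD_TO_JAVA = new HashMap<>();

    static {
        JAVA_TO_XSD.put(Byte.class, "xsd:byte");
        JAVA_TO_XSD.put(Short.class, "xsd:short");
        JAVA_TO_XSD.put(Integer.class, "xsd:int");

        // Assumption: small integer values all decode to Integer.
        XSD_TO_JAVA.put("xsd:byte", Integer.class);
        XSD_TO_JAVA.put("xsd:short", Integer.class);
        XSD_TO_JAVA.put("xsd:int", Integer.class);
    }

    static String datatypeFor(Object value) {
        return JAVA_TO_XSD.get(value.getClass());
    }

    static Class<?> valueClassFor(String datatype) {
        return XSD_TO_JAVA.get(datatype);
    }

    // True only if encoding and then decoding gives back the same Java class.
    static boolean roundTrips(Object value) {
        String datatype = datatypeFor(value);
        return datatype != null && valueClassFor(datatype) == value.getClass();
    }

    public static void main(String[] args) {
        Object s = Short.valueOf((short) 7);
        System.out.println(datatypeFor(s) + " round trips: " + roundTrips(s));
        Object i = Integer.valueOf(7);
        System.out.println(datatypeFor(i) + " round trips: " + roundTrips(i));
    }
}
```

The point of keeping two separate maps is that nothing forces one to be the inverse of the other, which is exactly the situation getJavaClass() makes look simpler than it is.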

Dave

On Tue, May 13, 2014 at 12:51 AM, Dave Reynolds <dave.e.reyno...@gmail.com> wrote:

On 12/05/14 18:26, Tim Harsch wrote:

According to the docs:
http://jena.apache.org/documentation/notes/typed-literals.html

These are all available as static member variables from
com.hp.hpl.jena.datatypes.xsd.XSDDatatype
<http://jena.apache.org/documentation/javadoc/jena/com/hp/hpl/jena/datatypes/xsd/XSDDatatype.html>.

Of these types, the following are registered as the default type to use to
represent certain Java classes:
    Java class    xsd type
    Float         float
    Double        double
    Integer       int
    Long          long
    Short         short
    Byte          byte
    BigInteger    integer
    BigDecimal    decimal
    Boolean       Boolean
    String        string

This is what I am seeing for xsd:short and xsd:byte.  I'm puzzled by the
type from getValue.

CODE:

System.out.println( "RDFDatatype: " + literal.getDatatype().toString() );
System.out.println( "Datatype URI: " + literal.getDatatypeURI() );
System.out.println( "getValue java class: " + ((Literal)literal).getValue().getClass() );

OUTPUT:

RDFDatatype: Datatype[http://www.w3.org/2001/XMLSchema#byte -> class java.lang.Byte]
Datatype URI: http://www.w3.org/2001/XMLSchema#byte
getValue java class: class java.lang.Integer
RDFDatatype: Datatype[http://www.w3.org/2001/XMLSchema#short -> class java.lang.Short]
Datatype URI: http://www.w3.org/2001/XMLSchema#short
getValue java class: class java.lang.Integer

So, is this the expected behavior?


Yes, or at least that's the implemented behaviour and has been for some
time.

The getValue() code picks a Java datatype big enough for the actual value
out of Integer, Long and BigInteger.
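
As a rough illustration of that "big enough" selection, here is a plain-Java sketch. It is not the actual Jena implementation; the class and method names are hypothetical, and it only covers integer lexical forms.

```java
import java.math.BigInteger;

// Hypothetical illustration only; this is not how Jena is written.
final class NarrowingSketch {
    // Returns the smallest of Integer, Long or BigInteger that can hold the value.
    static Number narrow(String lexicalForm) {
        BigInteger value = new BigInteger(lexicalForm.trim());
        // bitLength() excludes the sign bit, so <= 31 means "fits in a 32-bit int".
        if (value.bitLength() <= 31) {
            return Integer.valueOf(value.intValue());
        }
        if (value.bitLength() <= 63) {
            return Long.valueOf(value.longValue());
        }
        return value;
    }

    public static void main(String[] args) {
        String[] samples = { "100", "2147483648", "9223372036854775808" };
        for (String sample : samples) {
            System.out.println(sample + " -> " + narrow(sample).getClass().getName());
        }
    }
}
```

Note that the declared datatype plays no part in this sketch: a value that was written as xsd:byte or xsd:short still comes back as an Integer, which matches the output shown above.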

Arguably it would be better if it round tripped so that a java short would
become an xsd:short and would return a Short from getValue.

The issue is largely historical. Partly it's that the code was developed
while the RDF datatype handling was still in flux. Partly it's convenience
- a lot of people use xsd:integer (i.e. arbitrary size) in their RDF
(because that's what you get in Turtle if you use number syntax) but expect
them to be Integers in java "unless they are too big". Round-tripping from
java was never a requirement. Having once implemented it that way we
created a backward compatibility issue if we wanted to change it.

I suspect that changing so that short and byte round tripped would be OK.
But equally I suspect that dropping the truncation of smaller BigIntegers
to Integers would cause problems.

This might be something to revisit in any future Jena 3, though it doesn't
seem like much of a priority - xsd:byte or xsd:short don't seem to be very
much used in RDF in the wild.

Dave
