Well, the RDF specifications fundamentally define RDF literals to be the following:
* a lexical form, being a Unicode [UNICODE<https://www.w3.org/TR/rdf11-concepts/#bib-UNICODE>] string, which should be in Normal Form C [NFC<https://www.w3.org/TR/rdf11-concepts/#bib-NFC>],

https://www.w3.org/TR/rdf11-concepts/#section-Graph-Literal

So, you are effectively forced to use some sort of string-based encoding to represent any literal, whether or not the underlying datatype is truly binary data.

Now in principle you could define a custom implementation of the LiteralLabel interface that stores the value as true binary, i.e. byte[], and only materialises it into a string encoding when absolutely necessary. This could then be used to create instances via NodeFactory.create(LiteralLabel).

However, data into and out of the system is generally going to be via an RDF serialisation, which again will require string encoding or decoding as appropriate. And the parsers don’t really care about datatypes, so your custom implementation wouldn’t get used.

Thus, whether a custom LiteralLabel would actually gain you anything depends on how the data is coming into the system and how you consume it. If the data is coming in via some programmatic means that isn’t parsing serialised RDF then maybe, but I don’t think it would gain you much.

For spatial indexing, the general approach of a GeoSPARQL implementation is to build the spatial index up-front, so you’d only pay the cost of the string-to-binary decoding once, when the index is first built from the RDF data. The spatial index is going to convert the incoming geo-data into its own internal index structures that will be very efficient to access, at which point whether the binary data was originally string encoded is irrelevant.
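To make the "keep the bytes, only build the string when asked" idea concrete, here is a minimal sketch in plain Java. It deliberately does not touch the Jena API: BinaryValueSketch and all of its methods are hypothetical names for illustration only, and wiring something like it into a LiteralLabel is left out. The hex form follows the xsd:hexBinary convention of two hex digits per byte.

```java
import java.util.Arrays;
import java.util.Base64;

/** Hypothetical holder (not a Jena class): stores raw bytes, derives the lexical form lazily. */
public class BinaryValueSketch {
    private static final char[] HEX = "0123456789ABCDEF".toCharArray();

    private final byte[] bytes;
    private String hexLexicalForm; // built on first request, then cached

    public BinaryValueSketch(byte[] bytes) {
        this.bytes = bytes.clone(); // defensive copy so the value stays immutable
    }

    /** Decode a hex lexical form, i.e. what a parser would hand over. */
    public static BinaryValueSketch fromHex(String lexical) {
        if (lexical.length() % 2 != 0) {
            throw new IllegalArgumentException("odd number of hex digits");
        }
        byte[] out = new byte[lexical.length() / 2];
        for (int i = 0; i < out.length; i++) {
            // Character.digit is lenient (accepts lower case); fine for a sketch
            int hi = Character.digit(lexical.charAt(2 * i), 16);
            int lo = Character.digit(lexical.charAt(2 * i + 1), 16);
            if (hi < 0 || lo < 0) {
                throw new IllegalArgumentException("not a hex digit in byte " + i);
            }
            out[i] = (byte) ((hi << 4) | lo);
        }
        return new BinaryValueSketch(out);
    }

    /** Direct binary access: no string encoding involved. */
    public byte[] bytes() {
        return bytes.clone();
    }

    /** Upper-case hex string, materialised only when a serialiser needs it. */
    public String hexLexicalForm() {
        if (hexLexicalForm == null) {
            char[] chars = new char[bytes.length * 2];
            for (int i = 0; i < bytes.length; i++) {
                int v = bytes[i] & 0xFF;
                chars[2 * i] = HEX[v >>> 4];
                chars[2 * i + 1] = HEX[v & 0x0F];
            }
            hexLexicalForm = new String(chars);
        }
        return hexLexicalForm;
    }

    /** Base64 alternative: roughly 4 characters per 3 bytes instead of 2 per byte. */
    public String base64LexicalForm() {
        return Base64.getEncoder().encodeToString(bytes);
    }

    public static void main(String[] args) {
        // Illustrative bytes only: byte-order flag and geometry type of a WKB point
        byte[] wkbPrefix = {0x01, 0x01, 0x00, 0x00, 0x00};
        BinaryValueSketch value = new BinaryValueSketch(wkbPrefix);
        System.out.println(value.hexLexicalForm());
        System.out.println(value.base64LexicalForm());
        // Round trip through the string form, as parsing serialised RDF would do
        BinaryValueSketch parsed = fromHex(value.hexLexicalForm());
        System.out.println(Arrays.equals(parsed.bytes(), wkbPrefix));
    }
}
```

The point of the sketch is only that the decode in fromHex is a one-off cost per literal; anything built programmatically via the constructor never pays it unless the value is serialised.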
Regards,

Rob Vesse

From: Nicholas Car <n...@kurrawong.net>
Date: Wednesday, 3 May 2023 at 23:22
To: users@jena.apache.org <users@jena.apache.org>
Subject: Re: Binary literals

I see Base64 is an XSD option too, but I’m most interested in “true” binary, as opposed to binary-as-text options, and whether any exist!

Nick

On Thu, May 4, 2023 at 8:13 am, Nicholas Car <n...@kurrawong.net> wrote:

> Dear Jena users,
>
> How can I store binary literals in RDF and in Jena/Fuseki?
>
> There is xsd:hexBinary for arbitrary binary data but is there a better/more
> efficient/another way to store binary literals in Jena?
>
> The reason I ask is that a future version of GeoSPARQL might want to include
> WKB - Well-Known Binary - as a geometry format option. We would hope this can
> be efficiently accessed by a spatial index so we want to know how to handle
> perhaps a custom data type, perhaps geo:wkbLiteral, and how best to store
> this in Jena, perhaps not as hex text.
>
> Thanks, Nick