My 2 cents: Base64 might be preferable to hex encoding since it is inherently 
more compact (4 characters per 3 bytes, versus 2 characters per byte for hex)
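
To make the size difference concrete, here is a minimal sketch using only the Python standard library. The payload is an arbitrary stand-in for a binary geometry, not real WKB, and the function name encoded_sizes is made up for illustration.

```python
import base64


def encoded_sizes(payload: bytes) -> tuple[int, int, int]:
    """Return (raw bytes, hex characters, Base64 characters) for a payload."""
    hex_form = payload.hex()                               # 2 chars per byte
    b64_form = base64.b64encode(payload).decode("ascii")   # 4 chars per 3 bytes
    return len(payload), len(hex_form), len(b64_form)


# Arbitrary 3072-byte stand-in payload (an assumption, not real WKB).
payload = bytes(range(256)) * 12
raw_len, hex_len, b64_len = encoded_sizes(payload)
print(raw_len, hex_len, b64_len)  # 3072 6144 4096
```

So the hex lexical form is twice the size of the raw bytes, while Base64 is about one third larger than the raw bytes.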

Rob

From: Nicholas Car <n...@kurrawong.net>
Date: Thursday, 4 May 2023 at 10:58
To: users@jena.apache.org <users@jena.apache.org>
Subject: Re: Binary literals
Hi Rob,

Thanks for this: it is pretty much as I thought!

I think we will be able to cater for WKB then in GeoSPARQL 1.3 with just hex 
encoding of the value and ^^geo:wkbLiteral and then, as you say, implementers, 
like Jena-geosparql, can just read the hex into their spatial indexes one-time.
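
As a rough illustration of that one-time read, here is a minimal Python sketch (standard library only) that decodes hex-encoded WKB points into an in-memory lookup once. It handles only the WKB Point layout (1 byte order flag, a 4-byte geometry type, two 8-byte doubles); the function names, the subject IRIs and the plain dict standing in for a spatial index are all made up for illustration, and none of this is how Jena-geosparql actually does it.

```python
import struct


def point_to_wkb_hex(x: float, y: float) -> str:
    """Encode a 2D point as little-endian WKB and return its hex lexical form."""
    # Byte order 1 = little-endian, geometry type 1 = Point.
    return struct.pack("<BIdd", 1, 1, x, y).hex().upper()


def wkb_hex_to_point(lexical: str) -> tuple[float, float]:
    """Decode the hex lexical form of a WKB Point back into (x, y)."""
    raw = bytes.fromhex(lexical)
    endian = "<" if raw[0] == 1 else ">"
    geom_type, x, y = struct.unpack(endian + "Idd", raw[1:21])
    if geom_type != 1:
        raise ValueError("this sketch only handles WKB Point")
    return x, y


# Hypothetical (subject, lexical form) pairs, as might be read from RDF data.
literals = {
    "ex:a": point_to_wkb_hex(1.0, 2.0),
    "ex:b": point_to_wkb_hex(153.02, -27.47),
}

# The hex decoding happens once here; later lookups never touch the strings.
index = {subject: wkb_hex_to_point(lex) for subject, lex in literals.items()}
print(index["ex:a"])  # (1.0, 2.0)
```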

I see little value in this other than meeting an allowed data type in the 
Simple Features standard, then again, I see little value in KML and other 
existing, allowed, formats too!

Cheers, Nick




------- Original Message -------
On Thursday, May 4th, 2023 at 18:30, Rob @ DNR <rve...@dotnetrdf.org> wrote:


> Well, the RDF specifications fundamentally define RDF literals to be the 
> following:
>
> * a lexical form, being a Unicode [UNICODE] 
> <https://www.w3.org/TR/rdf11-concepts/#bib-UNICODE> string, which should be 
> in Normal Form C [NFC] <https://www.w3.org/TR/rdf11-concepts/#bib-NFC>,
>
> https://www.w3.org/TR/rdf11-concepts/#section-Graph-Literal
>
> So, you are effectively forced to use some sort of string-based encoding of 
> the binary data to represent any literal, whether or not the underlying 
> datatype is truly binary data.
>
> Now in principle you could define a custom implementation of the LiteralLabel 
> interface that stores the value as true binary, i.e. byte[], and only 
> materialises it into a string encoding when absolutely necessary. This could 
> then be used to create instances via NodeFactory.create(LiteralLabel).
>
> However, data into and out of the system is generally going to be via an RDF 
> serialisation, which again will require string encoding or decoding as 
> appropriate. And the parsers don’t really care about datatypes so your custom 
> implementation wouldn’t get used. Thus, whether a custom LiteralLabel would 
> actually gain you anything would depend on how the data is coming into the 
> system and how you consume it. If the data is coming in via some programmatic 
> means that isn’t parsing serialised RDF then maybe but I don’t think it would 
> gain you much.
>
> For spatial indexing generally the approach of a GeoSPARQL implementation is 
> to build the spatial index up-front so you’d only pay the cost of the string 
> to binary decoding once when the index was first built from the RDF data. The 
> spatial index is going to convert the incoming geo-data into its own internal 
> index structures that will be very efficient to access, at which point 
> whether the binary data was originally string encoded is irrelevant.
>
> Regards,
>
> Rob Vesse
>
> From: Nicholas Car n...@kurrawong.net
>
> Date: Wednesday, 3 May 2023 at 23:22
> To: users@jena.apache.org users@jena.apache.org
>
> Subject: Re: Binary literals
> I see Base64 is an XSD option too, but I’m most interested in “true” binary, 
> as opposed to binary-as-text options, and whether any exist!
>
> Nick
>
> On Thu, May 4, 2023 at 8:13 am, Nicholas Car <n...@kurrawong.net> wrote:
>
> > Dear Jena users,
> >
> > How can I store binary literals in RDF and in Jena/Fuseki?
> >
> > There is xsd:hexBinary for arbitrary binary data but is there a better/more 
> > efficient/another way to store binary literals in Jena?
> >
> > The reason I ask is that a future version of GeoSPARQL might want to 
> > include WKB - Well-Known Binary - as a geometry format option. We would 
> > hope this can be efficiently accessed by a spatial index so we want to know 
> > how to handle perhaps a custom data type, perhaps geo:wkbLiteral, and how 
> > best to store this in Jena, perhaps not as hex text.
> >
> > Thanks, Nick
