Thank you, it's less than I hoped for but certainly more than I can ask for, Andy :)
In short, I'd like to get the xsd:dateTime scan out of the SPARQL filter and perform a more efficient range query via a date index, similar to the jena-spatial implementation. I am going to take a look at DateRangeField and see how it performs relative to a standard SPARQL filter range query.

Best,
Marco

On Tue, Feb 27, 2018 at 5:21 PM, Andy Seaborne <[email protected]> wrote:
>
> On 27/02/18 11:41, Marco Neumann wrote:
>>
>> Hi Andy, (I presume you wrote the following below) could you please
>> elaborate on the significance of this contribution in TDB?
>
>
> Hi Marco,
>
> For certain XSD datatypes, the value is stored in the NodeId (64 bits, minus
> the datatype indicator - 56 bits for TDB1, up to 62 bits for TDB2 for
> xsd:doubles) itself. It is faster to get the node back out of the database.
>
> If the value does not fit in the bits available, the long form is used. In the
> long form, the NodeId is a pointer into the node table and the node is
> stored as the lexical form+datatype (TDB1: in text; TDB2 in binary / RDF
> Thrift). This applies to strings and URIs.
>
>>
>> "The xsd:dateTime and xsd:date ranges cover about 8000 years from year
>> zero with a precision down to 1 millisecond. Timezone information is
>> retained to an accuracy of 15 minutes with special timezones for Z and
>> for no explicit timezone."
>
>
> That's the limit for xsd:dateTime in 56 bits.
>
>>
>> https://jena.apache.org/documentation/tdb/architecture.html#inline-values
>>
>> does this give us enhanced temporal access methods via TDB that are
>> exposed as property functions in SPARQL?
>
>
> What exactly are you looking for here? Range queries or a database you can
> view at a point in time? ("Temporal database" can mean either.)
>
> You get the same SPARQL filter capabilities but the inline form is faster
> (measurable and by quite a lot) because it does not go to the node table.
> Despite caching of the node table, it is still faster to get nodes out of
> the DB from the inline form (and I'd like to go faster still).
>
> Point-in-time database.
>
> Not possible in TDB1.
> Possible (but not exposed) in TDB2. TDB2 never forgets!
>
>> In particular I'd be interested in range queries on xsd:dateTime here
>> and the possible use of DateRangeField (SOLR) along jena-spatial.
>
>
> Range queries - it would be possible to start in the right place for a range
> scan because the values are in sorted order under this design.
>
> Insert complexity for the different datatypes possible - it might need a
> "this is a value centric database" flag so e.g. integers, whether xsd:short
> or xsd:??? are stored as binary integers losing the datatype.
>
> In TDB1, that's true, TDB2 does keep the original datatype. Both are valid
> choices for different use cases.
>
> Hope that answers your questions,
>
> Andy
>
>>
>>
>> Best,
>> Marco
>>
>>
>>
>

-- 
---
Marco Neumann
KONA
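
The range-scan remark in the thread (values in sorted order let a scan "start in the right place") can be illustrated with a small sketch. This is not TDB or Solr code: it assumes timestamps are reduced to epoch milliseconds and kept in a sorted in-memory list, and the names `to_millis`, `filter_scan` and `range_scan` are hypothetical helpers made up for this example.

```python
from bisect import bisect_left
from datetime import datetime, timedelta, timezone

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def to_millis(lexical):
    """Convert an ISO 8601 timestamp to epoch milliseconds.

    The string must carry an explicit UTC offset such as +00:00.
    Integer division avoids floating-point rounding.
    """
    return (datetime.fromisoformat(lexical) - EPOCH) // timedelta(milliseconds=1)


def filter_scan(values, lo, hi):
    """Baseline: test every value, the way a filter over a full scan would."""
    return [v for v in values if lo <= v < hi]


def range_scan(sorted_values, lo, hi):
    """Binary-search to the first value >= lo, then stop before hi.

    Only valid when sorted_values is in ascending order.
    """
    start = bisect_left(sorted_values, lo)
    end = bisect_left(sorted_values, hi)
    return sorted_values[start:end]


# Hypothetical sample data, sorted once to stand in for an ordered index.
timestamps = [
    "2018-02-27T11:41:00+00:00",
    "2018-02-27T17:21:00+00:00",
    "2017-12-31T23:59:59+00:00",
    "2018-03-01T00:00:00+00:00",
]
index = sorted(to_millis(t) for t in timestamps)

# Half-open range covering February 2018.
lo = to_millis("2018-02-01T00:00:00+00:00")
hi = to_millis("2018-03-01T00:00:00+00:00")
hits = range_scan(index, lo, hi)
```

Both functions return the same values for a sorted input; the difference is that `range_scan` locates its start and end with two binary searches instead of comparing every entry, which is the property an ordered date index would rely on.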
