Re: Inline Values and XSD Time Series

Marco Neumann Wed, 28 Feb 2018 09:53:31 -0800

Thank you, it's less than I hoped for but certainly more than I
can ask for, Andy :)


In short, I'd like to get the xsd:dateTime scan out of the SPARQL
filter and perform a more efficient range query via a date index, similar
to the jena-spatial implementation.

I am going to take a look at DateRangeField and see how it performs
relative to a standard SPARQL filter range query.
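
To make the comparison concrete, here is a minimal Python sketch of the two
access patterns: a filter that tests every row versus a lookup that jumps
into a sorted date index. It is not Jena, TDB or Solr code; the names
filter_scan and DateIndex are hypothetical and only illustrate the idea.

```python
from bisect import bisect_left, bisect_right
from datetime import datetime, timezone


def filter_scan(rows, lo, hi):
    """Conceptually what a FILTER does: test every (timestamp, subject) row."""
    return [row for row in rows if lo <= row[0] <= hi]


class DateIndex:
    """Hypothetical in-memory index: rows kept sorted by timestamp."""

    def __init__(self, rows):
        self._rows = sorted(rows, key=lambda row: row[0])
        self._keys = [row[0] for row in self._rows]

    def range(self, lo, hi):
        # Binary search for both ends, then return the contiguous slice.
        start = bisect_left(self._keys, lo)
        end = bisect_right(self._keys, hi)
        return self._rows[start:end]


def utc(year, month, day):
    """Small helper for building timezone-aware example timestamps."""
    return datetime(year, month, day, tzinfo=timezone.utc)


rows = [
    (utc(2018, 2, 27), "ex:obs2"),
    (utc(2017, 12, 31), "ex:obs1"),
    (utc(2018, 3, 1), "ex:obs3"),
]
index = DateIndex(rows)
february = index.range(utc(2018, 2, 1), utc(2018, 2, 28))
```

The scan touches every row, while the index does two binary searches and
then reads only the matching slice.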

best,
Marco


On Tue, Feb 27, 2018 at 5:21 PM, Andy Seaborne <[email protected]> wrote:
>
> On 27/02/18 11:41, Marco Neumann wrote:
>>
>> Hi Andy, (I presume you wrote the following below) could you please
>> elaborate on the significance of this contribution in TDB?
>
>
> Hi Marco,
>
> For certain XSD datatypes, the value is stored in the NodeId itself (64 bits
> minus the datatype indicator: 56 bits for TDB1, up to 62 bits in TDB2 for
> xsd:doubles). That makes it faster to get the node back out of the database.
>
> If the value does not fit in the bits available, the long form is used.  In
> the long form, the NodeId is a pointer into the node table and the node is
> stored as the lexical form+datatype (TDB1: in text; TDB2: in binary / RDF
> Thrift). The long form also applies to strings and URIs.
>
>>
>> "The xsd:dateTime and xsd:date ranges cover about 8000 years from year
>> zero with a precision down to 1 millisecond. Timezone information is
>> retained to an accuracy of 15 minutes with special timezones for Z and
>> for no explicit timezone."
>
>
> That's the limit for xsd:dateTime in 56 bits.
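
As a rough illustration of how such limits can fall out of a 56-bit budget,
the Python sketch below packs a dateTime into fixed-width bit fields. The
field widths are an assumption chosen to be consistent with the quoted
limits (13 bits of year gives 8192 years, 7 bits of timezone covers
15-minute steps plus two special codes); they are not taken from the TDB
source, and FIELDS, TZ_Z, TZ_NONE, encode_tz, pack and unpack are all
hypothetical names.

```python
# Hypothetical field layout, most significant field first. Widths sum to 56.
FIELDS = [
    ("tz", 7),       # timezone in 15-minute units, plus two special codes
    ("year", 13),    # 0..8191, i.e. about 8000 years from year zero
    ("month", 4),
    ("day", 5),
    ("hour", 5),
    ("minute", 6),
    ("millis", 16),  # milliseconds within the minute, 0..59999
]

TZ_Z = 0x7F     # assumed special code for "Z"
TZ_NONE = 0x7E  # assumed special code for "no explicit timezone"


def encode_tz(offset_minutes):
    """Map a UTC offset (-14:00..+14:00, 15-minute steps) to codes 0..112."""
    if offset_minutes % 15 != 0:
        raise ValueError("offset is finer than 15 minutes")
    code = offset_minutes // 15 + 56
    if not 0 <= code <= 112:
        raise ValueError("offset out of range")
    return code


def pack(values):
    """Pack a dict of field values into one integer below 2**56."""
    word = 0
    for name, width in FIELDS:
        value = values[name]
        if not 0 <= value < (1 << width):
            # In a real store this is where the long form would be used.
            raise ValueError(f"{name}={value} does not fit in {width} bits")
        word = (word << width) | value
    return word


def unpack(word):
    """Inverse of pack: recover the field values from the integer."""
    values = {}
    for name, width in reversed(FIELDS):
        values[name] = word & ((1 << width) - 1)
        word >>= width
    return values
```

A year beyond 8191 or an offset that is not a multiple of 15 minutes does
not fit this layout, which corresponds to the case where the long form
would be needed.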
>
>>
>> https://jena.apache.org/documentation/tdb/architecture.html#inline-values
>>
>> does this give us enhanced temporal access methods via TDB that are
>> exposed as property functions in SPARQL?
>
>
> What exactly are you looking for here? Range queries or a database you can
> view at a point in time? ("Temporal database" can mean either.)
>
> You get the same SPARQL filter capabilities but the inline form is faster
> (measurable and by quite a lot) because it does not go to the node table.
> Despite caching of the node table, it is still faster to get nodes out of
> the DB from the inline form (and I'd like to go faster still).
>
> Point-in-time database:
>
> Not possible in TDB1.
> Possible (but not exposed) in TDB2.  TDB2 never forgets!
>
>> In particular I'd be interested in range queries on xsd:dateTime here
>> and the possible use of DateRangeField (SOLR) alongside jena-spatial.
>
>
> Range queries - it would be possible to start in the right place for a range
> scan because the values are in sorted order under this design.
>
> There is added complexity from the different datatypes possible - it might
> need a "this is a value centric database" flag so that e.g. integers, whether
> xsd:short or xsd:???, are stored as binary integers, losing the datatype.
>
> In TDB1, that's true; TDB2 does keep the original datatype. Both are valid
> choices for different use cases.
>
> Hope that answers your questions,
>
>     Andy
>
>>
>>
>> Best,
>> Marco
>>
>>
>>
>



-- 


---
Marco Neumann
KONA
