I think you can exploit the fact that inlined dates are stored in sorted
order in the indexes. You would transform the filter into a graph pattern
operator that can do a range index scan. In this particular example, it
would probably use the POS index [1] to retrieve only the required triples,
without touching any unrelated triples during the scan.

select *
{
  ?s <http://purl.org/dc/elements/1.1/date>
     ("2011-03-03T00:00:00Z"^^xsd:dateTime < ?date <
      "2011-06-06T00:00:00Z"^^xsd:dateTime) .
}

This should also work for inequalities on other value types, with some
trickiness if you allow non-inlined values in your system (e.g. integers
wider than 56 bits).
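
To make the range scan idea concrete, here is a rough sketch in Python. It is not how TDB is implemented; the sorted list standing in for the POS index, the `inline_key` encoding (milliseconds since the epoch) and the `range_scan` helper are all assumptions made up for illustration. The only point is that when keys sort in date order, two binary searches bound the matching triples and nothing outside that slice is read.

```python
import bisect
from datetime import datetime, timedelta, timezone

DC_DATE = "http://purl.org/dc/elements/1.1/date"
EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def inline_key(dt):
    # Assumed encoding (hypothetical): whole milliseconds since the epoch,
    # so that integer order matches chronological order.
    return (dt - EPOCH) // timedelta(milliseconds=1)


def range_scan(pos_index, predicate, low, high):
    """Subjects whose object for `predicate` lies strictly between low and high.

    pos_index is a sorted list of (predicate, object_key, subject) tuples,
    a stand-in for a real POS index.
    """
    # A 2-tuple sorts before any 3-tuple sharing its prefix, so these probes
    # land on the first entry with key > low and the first with key >= high.
    start = bisect.bisect_left(pos_index, (predicate, inline_key(low) + 1))
    end = bisect.bisect_left(pos_index, (predicate, inline_key(high)))
    return [subject for _, _, subject in pos_index[start:end]]


def utc(year, month, day):
    return datetime(year, month, day, tzinfo=timezone.utc)


# Toy data: two dates sit exactly on the bounds and must be excluded.
pos_index = sorted([
    (DC_DATE, inline_key(utc(2011, 1, 15)), "s1"),
    (DC_DATE, inline_key(utc(2011, 3, 3)), "s2"),
    (DC_DATE, inline_key(utc(2011, 4, 1)), "s3"),
    (DC_DATE, inline_key(utc(2011, 6, 5)), "s4"),
    (DC_DATE, inline_key(utc(2011, 6, 6)), "s5"),
    ("http://example.org/other", 42, "s6"),
])

matches = range_scan(pos_index, DC_DATE, utc(2011, 3, 3), utc(2011, 6, 6))
print(matches)
```

The `+ 1` on the lower bound only works because the assumed keys are integers; a real encoding would need its own way to express an exclusive bound.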

-Stephen

[1] Of course, this assumes that there are fewer statements with dates
matching the filter than subjects with that predicate. It would be up to a
cost-based optimizer to decide whether the PSO index is more selective in
this case.
> -----Original Message-----
> From: Paolo Castagna [mailto:[email protected]]
> Sent: Wednesday, October 19, 2011 4:38 PM
> To: [email protected]
> Subject: On SPARQL queries with FILTER ( ?date < "..."^^xsd:dateTime )
>
> Hi,
> a query pattern I often see is filtering by some xsd:dateTime interval,
> for example:
>
> PREFIX xsd: <http://www.w3.org/2001/XMLSchema#>
> SELECT * {
> ?s <http://purl.org/dc/elements/1.1/date> ?date .
> FILTER ( ( ?date > "2011-03-03T00:00:00Z"^^xsd:dateTime ) &&
> ( ?date < "2011-06-06T00:00:00Z"^^xsd:dateTime ) )
> }
>
> Even with moderate-size stores this query can take quite a while to
> execute. I'd like to know if there is something I could do to speed up
> these kinds of queries.
>
> I understand that the xsd:dateTime value is encoded by the
> NodeTableInline. However, I am not sure this is exploited at query time,
> and I'd like to understand if there is something we could do to further
> improve the performance of queries similar to the one above.
>
> Thanks,
> Paolo