Glenn Proctor wrote:
> Hi
>
> I have a TDB instance (0.8.10) containing about 207m triples. I've run
> tdbstats and moved stats.opt into the appropriate place.
>
> I've noticed that running the same query multiple times in succession
> results in successively shorter query times, up to a point. For
> example, on an otherwise-idle TDB instance, the query
>
> SELECT ?facet ?val (COUNT(?val) as ?vc) WHERE { ?id a ?val . ?id
> ?facet ?val . } GROUP BY ?facet ?val ORDER BY DESC(?vc) LIMIT 25
>
> Takes 3707s, then 1424s, then 345s where it seems to stay for subsequent runs.

Hi Glenn,

I do not know your use case and, in particular, I do not know whether
you are trying to provide a faceted navigation UI on top of your TDB
store, but your query suggests that is the case.

If those times are in seconds, that is not going to give your users a
good experience. ;-)

I do not know if your store is mostly read-only, with only small and
infrequent updates, but if that is the case, you should really
consider putting a caching layer in front of your TDB store.
An experimental prototype Andy wrote is here:
- https://github.com/afs/LD-Access
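
To make the caching idea concrete, here is a minimal sketch in Python. It is not how LD-Access works; it only illustrates a read-through cache keyed by the query text. The names QueryCache and run_query are hypothetical, and run_query stands for whatever function actually sends the SPARQL query to your TDB store. It assumes the store is mostly read-only, so results can be kept until they expire or until you call invalidate() after an update.

```python
import time
from typing import Any, Callable, Dict, Tuple


class QueryCache:
    """Read-through cache for query results, keyed by the exact query text.

    Hypothetical helper for illustration only: `run_query` is any callable
    that executes a SPARQL query string against the store and returns its
    results.
    """

    def __init__(
        self,
        run_query: Callable[[str], Any],
        ttl_seconds: float = 3600.0,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._run_query = run_query
        self._ttl = ttl_seconds
        self._clock = clock
        # Maps query text -> (time the result was stored, result).
        self._entries: Dict[str, Tuple[float, Any]] = {}

    def query(self, sparql: str) -> Any:
        now = self._clock()
        entry = self._entries.get(sparql)
        if entry is not None and now - entry[0] < self._ttl:
            # Fresh cached result: the store is not touched.
            return entry[1]
        # Miss or expired entry: run the query and remember the result.
        result = self._run_query(sparql)
        self._entries[sparql] = (now, result)
        return result

    def invalidate(self) -> None:
        """Drop all cached results, e.g. after the store has been updated."""
        self._entries.clear()
```

With something like this in front of the store, only the first request for a given facet query pays the full cost. Later identical requests are served from memory until the entry expires or is invalidated.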

A completely different alternative would be to use something such as
Apache Solr or ElasticSearch alongside your TDB store; both support
faceted search (and they can be quite fast):
- http://wiki.apache.org/solr/SimpleFacetParameters
- http://www.elasticsearch.org/guide/reference/api/search/facets/
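
For a rough idea of what a facet request to Solr looks like, here is a small Python sketch that only builds the request URL. The parameters (facet, facet.field, facet.limit, facet.mincount) are the ones described on the Solr wiki page above. The base URL and the field name rdf_type are assumptions for illustration: you would first have to index your data into Solr with a field of your own choosing.

```python
from urllib.parse import urlencode


def solr_facet_url(base_url: str, facet_field: str, limit: int = 25) -> str:
    """Build a Solr select URL that returns only facet counts for one field.

    `base_url` and `facet_field` are placeholders: they depend entirely on
    how the Solr instance and its schema have been set up.
    """
    params = {
        "q": "*:*",               # match all documents
        "rows": 0,                # no documents wanted, only the counts
        "facet": "true",          # enable faceting
        "facet.field": facet_field,
        "facet.limit": limit,     # top N values, like LIMIT 25 in the query
        "facet.mincount": 1,      # skip values that never occur
        "wt": "json",
    }
    return base_url.rstrip("/") + "/select?" + urlencode(params)


if __name__ == "__main__":
    # Hypothetical Solr location and field name.
    print(solr_facet_url("http://localhost:8983/solr", "rdf_type"))
```

Solr returns facet values ordered by count by default when a facet limit is set, which roughly corresponds to the ORDER BY DESC(?vc) LIMIT 25 part of your query.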

None of these options work out of the box, though; some development
work is involved.

My 2 cents,
Paolo
>
> What's the reason for this initial improvement and subsequent tailing
> off - are the indexes being optimised with every query?
>
> Glenn.