Jens Lehmann wrote:
Hello Tim,

Tim Haynes wrote:
Jens Lehmann wrote:

Is there an easy way to query the number of triples in a graph
(http://geonames.org) or the whole triple store? That way I could see
how many there are and whether this number changes.
sparql SELECT count(*) from <http://geonames.org> WHERE {?s ?p ?o}

or

select count(*)
from DB.DBA.RDF_QUAD
where G = DB.DBA.RDF_MAKE_IID_OF_QNAME('http://geonames.org');

perhaps?

I already tried the first one before, but it takes some time to run
(hence the question). Finally, it outputs 68639869, which is 73% of the
total Geonames triple count. On the second invocation (after waiting 20
minutes), the same number is returned. But since the query is fast on
the second invocation, the result might be taken from cache.

Jens,

The result is not cached but the triples are. The query is no longer
disk bound, thus runs faster.

The SQL select is the one I would use, but IRI_TO_ID is quicker to
write than RDF_MAKE_IID_OF_QNAME; they're the same function.
If the triple count does not match the dataset, I would suspect the
loader rather than the triple store in this case.
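
As a rough sketch, the shorter IRI_TO_ID form of the count, plus a count
over the whole store, would look something like this when typed at the
isql prompt. The graph IRI is the one from the question; nothing here
has been run against this particular dataset.

```sql
-- Count the triples in one graph, using IRI_TO_ID instead of
-- RDF_MAKE_IID_OF_QNAME to turn the graph IRI into its internal ID.
select count(*)
from DB.DBA.RDF_QUAD
where G = iri_to_id('http://geonames.org');

-- Count every quad in the store, regardless of graph.
select count(*)
from DB.DBA.RDF_QUAD;
```

Running the first query before and after a load step shows whether the
number is still changing.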

Counting is slow when the data you count has to be read from disk. If
you need faster disk access, I recommend using more disks, spreading
them across different SATA buses (if you have SATA disks), and telling
Virtuoso to use striping. This way you'll have more disk I/O bandwidth
and parallelism.
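
For illustration only, a striped setup in virtuoso.ini might look
roughly like the fragment below. The segment size, mount points and
file names are made up for the example, and the exact parameters should
be checked against the documentation for your Virtuoso version.

```ini
[Database]
; Turn on striping; the [Striping] section then defines the database files.
Striping = 1

[Striping]
; One segment spread over two files on different disks (hypothetical paths).
; The "= q1" / "= q2" suffixes assign each file to its own I/O queue,
; so reads on the two disks can proceed in parallel.
Segment1 = 10G, /disk1/virtuoso/db-seg1-1.db = q1, /disk2/virtuoso/db-seg1-2.db = q2
```

The idea is that each stripe file sits on a separate physical disk, so a
large scan such as count(*) is not limited by a single device.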

Yrjänä

Kind regards,

Jens




--
Yrjana Rankka            | gh...@openlinksw.com
Developer, Virtuoso Team | http://www.openlinksw.com
                        | Making Technology Work For You



