I haven't used it for RDF storage, but the page for SWI-Prolog's Semantic Web library (www.swi-prolog.org/packages/semweb.html) claims it has been "actively used with up to 10 million triples, using approximately 1GB of memory." I wonder whether RAM is becoming faster and cheaper quickly enough to keep up with, or outpace, the growth of our databases of RDF triples - I suspect not.
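For a rough sense of what those SWI-Prolog figures imply, a back-of-envelope calculation (treating "1GB" as 2^30 bytes, which is an assumption) gives the per-triple memory cost:

```python
# Back-of-envelope: bytes per triple implied by the quoted figures.
# Assumes "1GB" means 2^30 bytes; the page may mean 10^9.
triples = 10_000_000
memory_bytes = 1 * 1024**3
bytes_per_triple = memory_bytes / triples
print(round(bytes_per_triple))  # roughly 107 bytes per triple
```

At around a hundred bytes per triple, a billion-triple dataset would need on the order of 100GB of RAM, which gives a feel for why the growth question matters.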

Ora Lassila wrote:
Matt,

what kind of an in-memory database do you use? I have done some preliminary
experiments with UniProt etc. data with about 2 million triples using our
OINK browser (built using the Wilbur toolkit). Performance was very
"interactive" (i.e., "snappy", notice my highly precise metrics here ;-) on
a 1.67 GHz PowerBook w/ 1 GB RAM.

I don't think 2M triples is a limit on the above configuration, I just
happened to use a dataset of such size. I will run bigger tests soon.

One should also take into account that in my experiments I was also running our
RDF(S) reasoner, which computes everything on demand; effectively, therefore,
there were more than 2M triples. One observation is that RDF graphs often tend
to have a higher fan-out going "backwards" than "forwards" (i.e., when
traversing arcs in the inverse direction); typical examples of such relations
are rdf:type and rdfs:subClassOf. OINK supports inverse traversal.
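The fan-out asymmetry and the need for inverse traversal can be sketched with a minimal in-memory triple store that keeps both a subject-keyed and an object-keyed index. This is a hypothetical illustration, not Wilbur's or OINK's actual data structures:

```python
from collections import defaultdict

class TripleStore:
    """Minimal in-memory triple store with forward and inverse indexes.
    Hypothetical sketch; not Wilbur/OINK's implementation."""

    def __init__(self):
        # subject -> predicate -> set of objects (forward index)
        self.spo = defaultdict(lambda: defaultdict(set))
        # object -> predicate -> set of subjects (inverse index)
        self.ops = defaultdict(lambda: defaultdict(set))

    def add(self, s, p, o):
        self.spo[s][p].add(o)
        self.ops[o][p].add(s)

    def forward(self, s, p):
        """Objects reachable from s via p."""
        return self.spo[s][p]

    def inverse(self, o, p):
        """Subjects pointing at o via p (inverse traversal)."""
        return self.ops[o][p]

store = TripleStore()
for inst in ("ex:a", "ex:b", "ex:c"):
    store.add(inst, "rdf:type", "ex:Protein")

# Forward fan-out from one instance is 1; inverse fan-out
# from the shared class is 3 -- the asymmetry described above.
print(len(store.forward("ex:a", "rdf:type")))         # 1
print(len(store.inverse("ex:Protein", "rdf:type")))   # 3
```

The price of supporting inverse traversal this way is roughly doubled index memory, which is one reason per-triple memory figures vary so much between stores.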

I'd like to know what kinds of datasets people are using, what kind of (RDF
triple store) implementations they are using, and what their observations
about performance are.

Regards,

    - Ora
