I haven't used it for RDF storage, but the page for SWI-Prolog's Semantic Web library (www.swi-prolog.org/packages/semweb.html) says it has been "actively used with up to 10 million triples, using approximately 1GB of memory." I wonder whether RAM is getting faster and cheaper quickly enough to keep up with, or outpace, the growth of our RDF triple databases - I suspect not.
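A quick back-of-envelope check of that figure (assuming "1GB" means 2^30 bytes):

```python
# Memory per triple implied by the SWI-Prolog semweb figure:
# ~10 million triples in approximately 1 GB of memory.
triples = 10_000_000
mem_bytes = 1 * 1024**3  # 1 GB

bytes_per_triple = mem_bytes / triples
print(f"{bytes_per_triple:.1f} bytes per triple")  # ~107.4
```

So roughly 100 bytes per triple - a useful rule of thumb when estimating whether a dataset will fit in RAM.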
Ora Lassila wrote:
Matt, what kind of in-memory database do you use? I have done some preliminary experiments with UniProt etc. data, about 2 million triples, using our OINK browser (built on the Wilbur toolkit). Performance was very "interactive" (i.e., "snappy"; notice my highly precise metrics here ;-) on a 1.67 GHz PowerBook with 1 GB RAM. I don't think 2M triples is a limit on that configuration; I just happened to use a dataset of that size. I will run bigger tests soon.

One should also take into account that in these experiments I was also running our RDF(S) reasoner, which computes everything on demand, so effectively there were more than 2M triples.

One observation is that RDF graphs often tend to have a higher fan-out going "backwards" than "forwards" (i.e., when traversing arcs in the inverse direction); typical examples of such relations are rdf:type and rdfs:subClassOf. OINK supports inverse traversal.

I'd like to know what kinds of datasets people are using, what kinds of (RDF triple store) implementations they are using, and what their observations about performance are.

Regards,

- Ora
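The fan-out asymmetry Ora describes can be illustrated with a toy triple store that keeps both a forward and an inverse index (a sketch only - not how Wilbur or OINK are actually implemented; all names here are made up):

```python
from collections import defaultdict

class ToyTripleStore:
    """Toy triple store with a forward (subject, predicate) -> objects
    index and an inverse (predicate, object) -> subjects index, so
    arcs can be traversed in either direction."""

    def __init__(self):
        self.spo = defaultdict(set)  # (s, p) -> {o}
        self.pos = defaultdict(set)  # (p, o) -> {s}

    def add(self, s, p, o):
        self.spo[(s, p)].add(o)
        self.pos[(p, o)].add(s)

    def objects(self, s, p):
        """Forward traversal: follow an arc from its subject."""
        return self.spo[(s, p)]

    def subjects(self, p, o):
        """Inverse traversal: find all subjects pointing at an object."""
        return self.pos[(p, o)]

store = ToyTripleStore()
# Many instances point "forward" to one class via rdf:type...
for i in range(1000):
    store.add(f"ex:item{i}", "rdf:type", "ex:Widget")
store.add("ex:Widget", "rdfs:subClassOf", "ex:Thing")

# Forward fan-out from any one subject is small (1 here)...
print(len(store.objects("ex:item0", "rdf:type")))
# ...but the inverse fan-out of rdf:type at the class is large (1000).
print(len(store.subjects("rdf:type", "ex:Widget")))
```

Without the second index, answering "all instances of ex:Widget" would require scanning every triple, which is exactly why inverse-traversal support matters for relations like rdf:type.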