Re: Writing a whole lot of RDF to TDB versus Jena

Andy Seaborne Sun, 22 Jan 2012 12:22:16 -0800

On 21/01/12 18:21, Benson Margulies wrote:

This isn't really a 'load' scenario. I've done more profiling since I
started that thread.


A process is creating new RDF on the fly. It just makes 'add' calls to
the model obtained from the TDB default graph.

There is, sadly, one case which I implemented with reification. When a
document wanders by which triggers thousands of these events, the code
bogs down. Not so much in adding the reifications, as in checking for
existing ones, which is what it has to do.

Maybe there is a better way - can you share the profiling? It may be better not to check ... and let TDB suppress duplicates.

TDB reification support is special - it's pure code and implements the contract but, being stateless, the DB knows nothing of reification. We have been thinking of making this the usual way because reification in RDF generally is nowadays for specialised use only.

I'm considering a scheme in which I feed the three URIs of the
statement into MD5, and 'reify' by adding statements like:

   urn:<md5>      HAS_PROVENANCE WHATEVER

instead of using the formal reification system.
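
The hashing scheme described above could be sketched in Java roughly as follows. This is only an illustration under stated assumptions, not code from the thread: the class name StatementKey, the helper statementUrn, the newline separator between the three URIs, and the "urn:md5:" prefix are all hypothetical choices. It uses only the JDK's MessageDigest, so no Jena calls are involved; the resulting string would then be used as the subject of the provenance statement.

```java
import java.math.BigInteger;
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

final class StatementKey {

    // Hypothetical helper: derive a stable URN for a statement from its three URIs.
    static String statementUrn(String subject, String predicate, String object) {
        try {
            MessageDigest md5 = MessageDigest.getInstance("MD5");
            // Assumption: join with a newline, which cannot occur inside a URI,
            // so that ("ab", "c", "d") and ("a", "bc", "d") hash differently.
            String joined = subject + "\n" + predicate + "\n" + object;
            byte[] digest = md5.digest(joined.getBytes(StandardCharsets.UTF_8));
            // Render the 16-byte digest as 32 lowercase hex digits, zero-padded.
            String hex = String.format("%032x", new BigInteger(1, digest));
            // Assumption: "urn:md5:" is an illustrative prefix, not a registered scheme.
            return "urn:md5:" + hex;
        } catch (NoSuchAlgorithmException e) {
            // MD5 is a required algorithm on every Java platform.
            throw new IllegalStateException("MD5 not available", e);
        }
    }

    public static void main(String[] args) {
        System.out.println(statementUrn(
                "http://example.org/s",
                "http://example.org/p",
                "http://example.org/o"));
    }
}
```

Because the same three URIs always produce the same URN, adding the provenance statement again is just a duplicate triple, so no lookup for an existing reification is needed before the add. Note that this sketch only covers statements whose object is a URI; literals and blank nodes would need their own encoding.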

Of course, in an imaginary perfect world, TDB would somehow know about
reification.


Been there, done that [not me personally] :-)

Once upon a time, RDB (the old relational DB engine) did reification. While it gets good compactness, the complexity of managing partial reifications is huge and the payback is small.


Named graphs can be used for keeping statements separated.

All that said, I'd like to do property tables for TDB (stringing a set of properties per subject together) but it's not a priority.


        Andy
