On 21/01/12 18:21, Benson Margulies wrote:
This isn't really a 'load' scenario. I've done more profiling since I
started that thread.

A process is creating new RDF on the fly. It just makes 'add' calls to
the model obtained from the TDB default graph.

There is, sadly, one case which I implemented with reification. When a
document wanders by which triggers thousands of these events, the code
bogs down. Not so much in adding the reifications, as in checking for
existing ones, which is what it has to do.

Maybe there is a better way - can you share the profiling? It may be better not to check at all and let TDB suppress duplicates.

TDB reification support is special - it's pure code that implements the contract but, being stateless, the DB itself knows nothing of reification. We have been thinking of making this the usual way, because reification in RDF is generally for specialised use only nowadays.

I'm considering a scheme in which I feed the three URIs of the
statement into MD5, and 'reify' by adding statements like:

   urn:<md5>      HAS_PROVENANCE WHATEVER

instead of using the formal reification system.
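
A minimal sketch of that hashing step, in plain Java with no Jena calls. The class name StatementKey, the method urnFor, the "urn:md5:" prefix, the zero-byte separator and the example.org URIs are all assumptions made for illustration; the scheme above only says that the three URIs go into MD5.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

// Hypothetical helper: derives a stable URN from the three URIs of a statement.
final class StatementKey {

    static String urnFor(String subject, String predicate, String object) {
        try {
            MessageDigest md5 = MessageDigest.getInstance("MD5");
            for (String part : new String[] { subject, predicate, object }) {
                md5.update(part.getBytes(StandardCharsets.UTF_8));
                // Assumed separator, so ("ab", "c") and ("a", "bc") hash differently.
                md5.update((byte) 0);
            }
            // Assumed URN layout: "urn:md5:" followed by 32 lowercase hex digits.
            StringBuilder urn = new StringBuilder("urn:md5:");
            for (byte b : md5.digest()) {
                urn.append(String.format("%02x", b));
            }
            return urn.toString();
        } catch (NoSuchAlgorithmException e) {
            // MD5 is one of the digests every Java platform must provide.
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        // Hypothetical URIs, for illustration only.
        String urn = urnFor("http://example.org/doc1",
                            "http://example.org/mentions",
                            "http://example.org/entity42");
        System.out.println(urn + "  HAS_PROVENANCE  WHATEVER");
    }
}
```

Because the URN is a pure function of the statement, the same statement always yields the same subject, so a provenance triple added twice is the same triple both times and no lookup for an existing reification is needed beforehand.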

Of course, in an imaginary perfect world, TDB would somehow know about
reification.

Been there, done that [not me personally] :-)

Once upon a time, RDB (the old relational DB engine) did reification. While it gets good compactness, the complexity of managing partial reifications is huge and the payback is small.

Named graphs can be used for keeping statements separated.

All that said, I'd like to do property tables for TDB (string a set of properties per subject together) but it's not a priority.

        Andy


