Re: Data partitioning dilemma (named graphs)

Dave Reynolds Wed, 04 May 2011 00:28:43 -0700

Hi Martynas,

On Tue, 2011-05-03 at 22:11 +0200, Martynas Jusevicius wrote: 
> Thanks Dave.
> 
> Right now I have a single named graph for all ontologies, but I guess
> a graph per ontology makes more sense.


I didn't mean to imply that was a requirement; it depends on what you
want to do.

> I need to iterate through all ontology classes however, so I still
> need a unified ontology model - how do I achieve that? I know of
> ModelFactory.createUnion(), but it only works on model pairs.

If that's a requirement then sticking to one graph for the combined
ontologies is just fine.

If you do want separate graphs but also want a union around then you can
create multi-way dynamic unions using OntModel.addSubModel.

> Speaking of your provenance work - how did you attach the UUID URI to
> the triple in the data graph without using reification?

In my case I had a provenance API to hide the details (I had both a
multiple graph and a reified-by-hash implementation, different
tradeoffs). For the UUID version I created a lexical form for the S, P,
O, did an MD5 digest of those and then wrapped that up as a urn:uuid
(i.e. a type 3 UUID). Then used that urn:uuid resource as the subject of
the provenance statements. That worked because (a) there were no bNodes
other than ones with stable internal anonIDs and (b) I only needed to go
from a statement to its provenance. The digest is one-way, so if you
need to retrieve the statements themselves starting from provenance
information then use the reification vocabulary or named graphs instead.
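
For illustration, here is a minimal Java sketch of the hash-based
identifier described above. The class name, the method names and the
space-separated lexical form are assumptions made for this example, not
the original implementation. java.util.UUID.nameUUIDFromBytes is the
standard-library call that builds a type 3 (MD5, name-based) UUID from a
byte array.

```java
import java.nio.charset.StandardCharsets;
import java.util.UUID;

// Hypothetical helper: derives a stable urn:uuid for a triple from its S, P, O.
final class TripleProvenanceId {

    // Assumed lexical form: the three terms, already serialised
    // (e.g. N-Triples style), joined by single spaces.
    static String lexicalForm(String s, String p, String o) {
        return s + " " + p + " " + o;
    }

    // MD5-digest the lexical form and wrap the result as a urn:uuid.
    static String tripleUrn(String s, String p, String o) {
        byte[] bytes = lexicalForm(s, p, o).getBytes(StandardCharsets.UTF_8);
        UUID id = UUID.nameUUIDFromBytes(bytes); // type 3 (MD5) UUID
        return "urn:uuid:" + id;
    }

    public static void main(String[] args) {
        // Example terms are made up for the demonstration.
        String urn = tripleUrn("<http://example.org/resourceY>",
                               "<http://example.org/value>",
                               "\"Z\"");
        System.out.println(urn);
    }
}
```

The same S, P, O always produce the same urn:uuid, so the provenance
statements in the metadata graph can use it as their subject without
storing a reified copy of the triple. This only stays stable if any
bNodes are serialised with stable identifiers, as noted above.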

Cheers,
Dave

> Martynas
> 
> On Tue, May 3, 2011 at 6:29 PM, Dave Reynolds wrote:
> > Hi Martynas,
> >
> > On Mon, 2011-05-02 at 10:51 +0200, Martynas Jusevicius wrote:
> >> Hey list,
> >>
> >> I want to improve provenance of RDF data in my app, and I'm mostly
> >> looking at named graphs since reification seems not to be used that much.
> >>
> >> One point of view is logical divisions:
> >> - read-only core ontologies
> >> - user ontologies
> >> - user instance data
> >> I could make a named graph for each of them.
> >>
> >> The other is that I'd like to have metadata about every added/updated
> >> triple so the app could say "User X updated resource Y with value of Z
> >> on date W". In this case basically every triple should have its own
> >> unique URI - i.e. be a named graph with a single statement?
> >>
> >> It seems that I could implement either the first case or the second
> >> with named graphs, but not both, which I would prefer.
> >> How would you go about it - has anyone worked on use cases like this?
> >> Should I still consider reification - and maybe use it together with
> >> named graphs?
> >
> > I guess it depends on how you want to manage the data, whether you need
> > to limit queries to particular sub-categories of data and just how much
> > data you are talking about.
> >
> > In principle you could have a separate named graph both for each
> > ontology and for each atomic addition of user triples plus a separate
> > metadata graph. If atomic additions are made one triple at a time that
> > would be a lot of named graphs but it is possible.
> >
> > If your updates include retractions then that gets messier in that you
> > have to remove the old graph as well as add to the new one, still
> > possible I guess.
> >
> > FWIW the last time I did serious work with triple level provenance
> > (which was before named graphs were so much in vogue) I worked it with
> > just two graphs - one for the asserted data and one for the metadata.
> > The metadata graph could have used the reification vocabulary but I
> > found it easier to generate a hash to identify the triple in the data
> > graph and then use the hash (as a UUID URI) as the subject of provenance
> > triples in the metadata graph. That's isomorphic to using reification
> > but is more compact and easier to query.
> >
> > Dave
> >
> >
> >
