Re: Controlling blank node IDs

Paolo Castagna Sun, 25 Mar 2012 05:43:47 -0700

Hi Martynas,
I ran into a similar problem in a different context, so I want to
share my solution (it might be useful to you or others).

My problem was how to deal with blank nodes in MapReduce jobs or in a
sequence of MapReduce jobs. The solution I found works for the
N-Triples and N-Quads formats, so it might not apply directly in your
case, since you want a single RDF/XML file.

The Utils.createParserProfile(...) method creates a ParserProfile:
https://github.com/castagna/tdbloader4/blob/f5363fa49d16a04a362898c1a5084ade620ee81b/src/main/java/org/apache/jena/tdbloader4/Utils.java

The ParserProfile uses a custom LabelToNode object:
https://github.com/castagna/tdbloader4/blob/f5363fa49d16a04a362898c1a5084ade620ee81b/src/main/java/org/apache/jena/tdbloader4/io/MapReduceLabelToNode.java
MapReduceLabelToNode extends LabelToNode and uses a MapReduceAllocator
which implements Allocator<String, Node>.
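To give a rough idea of what such an allocator has to guarantee, here is
a minimal sketch in plain Java. It is not the tdbloader4 code and it does
not use the Jena API: the class ScopedBlankNodeLabels, the method
globalLabel and the parameters runId and source are hypothetical names
chosen for illustration. The assumption is that a label only needs to be
unique per (job run, input source, original label), and that every task
must derive the same result for the same input without coordination.

```java
import java.nio.charset.StandardCharsets;
import java.util.UUID;

// Hypothetical helper for illustration; not part of Jena or tdbloader4.
final class ScopedBlankNodeLabels {

    private ScopedBlankNodeLabels() {}

    // Derives a label from (runId, source, label). The same inputs always
    // give the same output, so independent tasks agree on the result.
    static String globalLabel(String runId, String source, String label) {
        // Length-prefix each part so that ("ab", "c") and ("a", "bc")
        // cannot produce the same key.
        String key = runId.length() + ":" + runId
                + source.length() + ":" + source
                + label.length() + ":" + label;
        // Name-based UUID: deterministic for a given byte sequence.
        UUID uuid = UUID.nameUUIDFromBytes(key.getBytes(StandardCharsets.UTF_8));
        // Start with a letter and drop the hyphens, so the label consists
        // of a letter followed by letters and digits only.
        return "b" + uuid.toString().replace("-", "");
    }

    public static void main(String[] args) {
        // Same original label "A0", two different source documents.
        System.out.println(globalLabel("run-1", "doc1.nt", "A0"));
        System.out.println(globalLabel("run-1", "doc2.nt", "A0"));
    }
}
```

With this scheme the label "A0" read from two different documents ends up
as two different labels, while re-reading the same document in the same
run gives the same label again.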

This is not exactly what you want, but maybe you can do something
very similar. Have a look at LabelToNode and NodeToLabel in ARQ:
https://svn.apache.org/repos/asf/incubator/jena/Jena2/ARQ/tags/jena-arq-2.9.0-incubating/src/main/java/org/openjena/riot/lang/LabelToNode.java
https://svn.apache.org/repos/asf/incubator/jena/Jena2/ARQ/tags/jena-arq-2.9.0-incubating/src/main/java/org/openjena/riot/out/NodeToLabel.java
and how they are used in RIOT.
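For the output side, the idea behind a node-to-label mapping can be
sketched as follows. This is plain Java and only an illustration: the
class OutputLabels and the method labelFor are hypothetical and are not
the ARQ NodeToLabel API. The assumption is that a blank node in the store
can be identified by its named graph plus its stored ID, and that the
writer hands out a fresh label the first time it sees each such pair.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical helper for illustration; not the ARQ NodeToLabel class.
final class OutputLabels {

    private final Map<String, String> assigned = new HashMap<>();
    private final String prefix;

    OutputLabels(String prefix) {
        this.prefix = prefix;
    }

    // Returns the output label for a blank node identified by the named
    // graph it lives in and the ID it has in the store.
    String labelFor(String graphName, String storedId) {
        // Length-prefix the graph name so the combined key is unambiguous.
        String key = graphName.length() + ":" + graphName + storedId;
        String label = assigned.get(key);
        if (label == null) {
            // First time this node is seen: allocate the next label.
            label = prefix + assigned.size();
            assigned.put(key, label);
        }
        return label;
    }

    public static void main(String[] args) {
        OutputLabels labels = new OutputLabels("b");
        // The same stored ID "A0" in two named graphs gets two labels.
        System.out.println(labels.labelFor("http://example.org/g1", "A0"));
        System.out.println(labels.labelFor("http://example.org/g2", "A0"));
    }
}
```

One OutputLabels instance would be used per serialized document, so the
labels are unique within that document even when the stored IDs clash.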

Paolo

Martynas Jusevicius wrote:
> Hey all,
> 
> what would be the easiest way to control the generation of blank node IDs?
> What I need is to make them globally unique (I'm thinking UUIDs or
> something like that), because that's the approach taken by the triple store
> I'm using.
> Currently I load several documents into the store with the same
> Jena-generated IDs (A0, A1, etc), and they sit nicely in their
> separate named graphs. However when triples containing those bnodes
> get re-serialized into a single RDF/XML result, the IDs are matching
> where they shouldn't be.
> 
> Martynas
> graphity.org
