Re: [RDF] Local Scope and BlankNode internalIdentifier (was: github Commons RDF vs. Apache Commons Sandbox RDF)

Stian Soiland-Reyes Tue, 27 Jan 2015 09:12:45 -0800

I agree that "local scope" should be clarified - I have been just as confused!

By keeping the "internalIdentifier" property, an application is able
to talk about an existing BlankNode without having to keep track of
earlier BlankNode instances (e.g. without needing its own
Map<internalIdentifier,BlankNode>).

It also means that a streaming copy from one implementation to another
would work - even if there would be multiple JVM objects "on the line"
representing the same BlankNode - having the same internalIdentifier.

That said, nothing is preventing
RDFTermFactory.createBlankNode(internalIdentifier) from always
returning the same JVM object through some kind of lookup - as long as
that object is then able to live in multiple "local scopes", or
Graph.add() copies it to set the scope.
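
As a rough sketch of that lookup idea (the class names SketchBlankNode
and InterningFactory are hypothetical stand-ins, not the Commons RDF
interfaces), a factory could intern blank nodes by identifier like this:

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

/** Hypothetical stand-in for a blank node; not the Commons RDF interface. */
final class SketchBlankNode {
    private final String internalIdentifier;

    SketchBlankNode(String internalIdentifier) {
        this.internalIdentifier = internalIdentifier;
    }

    String internalIdentifier() {
        return internalIdentifier;
    }
}

/** Hypothetical factory that hands out one JVM object per identifier. */
final class InterningFactory {
    private final Map<String, SketchBlankNode> known = new ConcurrentHashMap<>();

    SketchBlankNode createBlankNode(String internalIdentifier) {
        // Creates the node on first request, returns the same instance afterwards.
        return known.computeIfAbsent(internalIdentifier, SketchBlankNode::new);
    }
}
```

Note that this sketch never evicts entries, so the lookup table grows
with every identifier seen; it also says nothing yet about which scope
the shared object belongs to.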

There's an open issue about what is the extent of this "local scope"
and how this affects equivalence.

https://github.com/commons-rdf/commons-rdf/issues/56

My attempt to confuse this further:

https://github.com/commons-rdf/commons-rdf/pull/48

Some earlier discussions about equals:

https://github.com/commons-rdf/commons-rdf/issues/45

I think your FAQ in
https://svn.apache.org/repos/asf/commons/sandbox/rdf/trunk/README.md

is great - but this seems to include some JVM-specific decisions that
might be easy to do in all the RDF implementations.

I think we have agreed that the localIdentifier doesn't have anything
to do with the ntriplesString, which I have reflected in the tests.
Thus a local identifier like "not: a URI or anything" is fine - all we
know is that two BlankNodes with the same local identifier in the same
Graph should be equal, and that their ntriplesString - whatever it is
(I do UUID v3 if the id doesn't work) - should also be equal.
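
To illustrate the fallback, here is a minimal sketch, assuming a
simplified ASCII-only check for what counts as a usable label (the
class BlankNodeLabels and the regular expression are assumptions for
illustration, not the code in the repository):

```java
import java.nio.charset.StandardCharsets;
import java.util.UUID;
import java.util.regex.Pattern;

/** Hypothetical helper; not part of Commons RDF. */
final class BlankNodeLabels {
    // Assumption: a conservative ASCII subset of the N-Triples blank node label syntax.
    private static final Pattern SAFE_LABEL =
            Pattern.compile("[A-Za-z0-9_]([A-Za-z0-9_.-]*[A-Za-z0-9_-])?");

    private BlankNodeLabels() {
    }

    static String ntriplesString(String localIdentifier) {
        if (SAFE_LABEL.matcher(localIdentifier).matches()) {
            // The identifier can be used directly as a label.
            return "_:" + localIdentifier;
        }
        // Otherwise derive a name-based (version 3) UUID, so the same
        // identifier always maps to the same label.
        UUID uuid = UUID.nameUUIDFromBytes(
                localIdentifier.getBytes(StandardCharsets.UTF_8));
        return "_:" + uuid;
    }
}
```

With this, ntriplesString("b1") gives "_:b1", while "not: a URI or
anything" gives "_:" followed by a UUID that is stable across calls.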

What is unclear is how this "local scope" propagates - as it's not
exposed anywhere in the current interfaces.

Perhaps blank nodes should only be possible to create from/with a Graph?

When you say a scope could be narrower.. what do you mean, narrower
than a Graph? I guess say from a SPARQL result set using the Commons
RDF API (but not Graph), the scope would be that particular result
set.

Andy has said he would like the ability to copy such a BlankNode to a
different graph, then back again to the first, and have it be equal to
the original BlankNode. (Not sure if this meant inserting the
same BlankNode object into two Graphs directly, or making a single
Triple instance that is added into two Graphs.)

It is unclear if that BlankNode object in graph 2 will have the same
"local scope" as the BlankNode in graph 1. Is a Triple added to two
graphs now in two local scopes?

In the 'simple' implementation I achieved the back-and-forth
equivalence by keeping a "local scope" as an Optional<Graph> within
the BlankNodeImpl, and using this as part of equivalence:

https://github.com/commons-rdf/commons-rdf/blob/master/simple/src/main/java/com/github/commonsrdf/simple/BlankNodeImpl.java#L40
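
A simplified sketch of scope-aware equality (ScopedBlankNode is a
hypothetical class, and Optional<Object> stands in for Optional<Graph>
so the example is self-contained; the linked BlankNodeImpl may differ
in detail):

```java
import java.util.Objects;
import java.util.Optional;

/** Hypothetical blank node that compares both identifier and scope. */
final class ScopedBlankNode {
    private final String localIdentifier;
    // Assumption: the scope is the owning graph, or empty if free-standing.
    private final Optional<Object> localScope;

    ScopedBlankNode(String localIdentifier, Optional<Object> localScope) {
        this.localIdentifier = Objects.requireNonNull(localIdentifier);
        this.localScope = Objects.requireNonNull(localScope);
    }

    @Override
    public boolean equals(Object obj) {
        if (this == obj) {
            return true;
        }
        if (!(obj instanceof ScopedBlankNode)) {
            return false;
        }
        ScopedBlankNode other = (ScopedBlankNode) obj;
        // Two empty scopes compare equal, so free-standing nodes with the
        // same identifier are treated as equal in this sketch.
        return localIdentifier.equals(other.localIdentifier)
                && localScope.equals(other.localScope);
    }

    @Override
    public int hashCode() {
        return Objects.hash(localIdentifier, localScope);
    }
}
```

Because Optional.equals delegates to the scope object's own equals,
this only behaves as intended if the graph used as scope compares by
identity rather than by content.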

Should a "free-standing" BlankNodeImpl (not inside a Triple) claim
to be equal to, or NOT equal to, another BlankNodeImpl with the same
localIdentifier if neither is in any scope?  Currently I think my
implementation does the first of these.

On Graph.add(Triple) I always make a clone of TripleImpl (to not
overwrite the localScope), which will call "inScope" to clone the
BlankNode with the new graph as scope.

https://github.com/commons-rdf/commons-rdf/blob/master/simple/src/main/java/com/github/commonsrdf/simple/TripleImpl.java#L63
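
The clone-on-add idea can be sketched roughly as follows. All class
names here (RescopableNode, MiniTriple, MiniGraph) are hypothetical,
predicate and object are plain strings to keep it short, and add()
returning the stored copy is an assumption made only so the result can
be inspected:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Objects;

/** Hypothetical blank node; scope is the owning graph, null if free-standing. */
final class RescopableNode {
    final String id;
    final MiniGraph scope;

    RescopableNode(String id, MiniGraph scope) {
        this.id = Objects.requireNonNull(id);
        this.scope = scope;
    }

    /** Returns a copy bound to the given graph; the original is untouched. */
    RescopableNode inScope(MiniGraph graph) {
        return new RescopableNode(id, graph);
    }

    @Override
    public boolean equals(Object o) {
        return o instanceof RescopableNode
                && id.equals(((RescopableNode) o).id)
                && scope == ((RescopableNode) o).scope; // scope compared by identity
    }

    @Override
    public int hashCode() {
        return Objects.hash(id, System.identityHashCode(scope));
    }
}

/** Hypothetical triple with a blank node subject. */
final class MiniTriple {
    final RescopableNode subject;
    final String predicate;
    final String object;

    MiniTriple(RescopableNode subject, String predicate, String object) {
        this.subject = subject;
        this.predicate = predicate;
        this.object = object;
    }

    MiniTriple inScope(MiniGraph graph) {
        return new MiniTriple(subject.inScope(graph), predicate, object);
    }
}

/** Hypothetical graph that re-scopes every triple it stores. */
final class MiniGraph {
    private final List<MiniTriple> triples = new ArrayList<>();

    MiniTriple add(MiniTriple triple) {
        // Clone rather than mutate, so the caller's triple keeps its old scope.
        MiniTriple copy = triple.inScope(this);
        triples.add(copy);
        return copy;
    }
}
```

In this sketch a triple added to g1, copied into g2 and then added back
to g1 ends up with a subject equal to the original one in g1, while the
copy held by g2 is not equal to either.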

But I see now that with the split Graph.add(s,p,o) form I don't
propagate the Graph localScope correctly and might even cause an NPE:

https://github.com/commons-rdf/commons-rdf/blob/master/simple/src/main/java/com/github/commonsrdf/simple/GraphImpl.java#L43
https://github.com/commons-rdf/commons-rdf/blob/master/simple/src/main/java/com/github/commonsrdf/simple/TripleImpl.java#L46

.. so this is tricky to get right!

<rant>
Whoever invented Blank Nodes... why not just
<urn:uuid:7096a534-d698-414c-87fa-4b09ca5d03f2> and be done with it.
If something exists, it exists.. just give it a name - anything! Names
come cheap - at least now that we got rid of LSID servers :)
</rant>

On 27 January 2015 at 13:39, Reto Gmür <r...@apache.org> wrote:
> On Fri, Jan 16, 2015 at 12:29 AM, Peter Ansell <ansell.pe...@gmail.com>
> wrote:
>
>> The only sticking point then and now IMO is the purely academic
>> distinction of opening up internal labels for blank nodes versus not
>> opening it up at all. Reto is against having the API allow access to
>> the identifiers on academic grounds, where other systems pragmatically
>> allow it with heavily worded javadoc contracts about their limited
>> usefulness, per the RDF specifications:
>>
>
> Hi Peter,
>
> Sorry for the late reply.
>
> I see that the javadoc for the internalIdentifier method has now become
> quite long.
>
> It says:
>
> * In particular, the existence of two objects of type {@link BlankNode}
> * with the same value returned from {@link #internalIdentifier()} are not
> * equivalent unless they are known to have been created in the same local
> * scope.
>
> It is however not so clear what such a local scope is. It says that such a
> local scope may be for example a JVM instance.  Can the scope also be
> narrower? To allow removing redundancies (as described in
> https://svn.apache.org/repos/asf/commons/sandbox/rdf/trunk/README.md) no
> promise should be made that a bnode with the same ID in the same JVM will
> denote the same node. On the other hand, how long is it guaranteed that if
> I have a BNode object I can add triples to a graph and this object will
> keep representing the same RDF Node? Does it make a difference if I keep
> the instance or if I create a new instance with the same internal
> identifier?
>
> Similarly: can I add bnodes I get from one graph from one implementation to
> another? If I get BNode :foo from G1 can I add the triple (:foo ex:p ex:o)
> to G2? When I later add (:foo ex:q ex:r) to G2 will the two
> triples have the same subject?
>
> I think these are important questions to allow generic interoperable
> implementations. I'm not saying that questions like the one I answer in the
> Readme of my draft cannot be satisfactorily answered when having such an
> internal identifier, but I think it might get more complicated and less
> intuitive for the user.
>
> Also, you're writing about "opening up" the labels. This makes sense from a
> triple store perspective where the BNodes have such an internal label.
> However I think this should not be the only use case scenario. One can very
> well use the RDF API to expose some arbitrary java (data) objects as RDF.
> I've illustrated such a scenario here:
>
> http://mail-archives.apache.org/mod_mbox/stanbol-dev/201211.mbox/%3CCALvhUEUfOd-mLBh-=xkwblajhbcboe963hdxv6g0jhnpj6c...@mail.gmail.com%3E
>
> I'm not sure if with the github API one could say "the scope is the node
> instance" and return a fixed identifier for all BNodes. If so the identifier
> is obviously pointless. If on the other hand one would have to assign an
> identifier to all the objects, this would make implementations more
> complex both in terms of code and in terms of memory usage.
>
> Again, it seems to make things more complex while I see no clear advantage
> compared with the toString() method the object has anyway.
>
> Cheers,
> Reto

-- 
Stian Soiland-Reyes
Apache Taverna (incubating)
http://orcid.org/0000-0001-9842-9718
