Based on your comments, I need round tripping.  I am assuming that the
SPARQL server will return the same ID for the same blank node on multiple
calls.  I understand that not all servers will meet that requirement.
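
To make that assumption concrete, this is the kind of round trip I depend
on.  A rough sketch only (the endpoint and the URIs are placeholders, and
the <_:label> form is ARQ's non-standard extension for addressing a blank
node by its label):

    import java.util.ArrayList;
    import java.util.List;
    import org.apache.jena.rdf.model.RDFNode;
    import org.apache.jena.rdfconnection.RDFConnection;
    import org.apache.jena.rdfconnection.RDFConnectionFactory;

    public class RoundTripSketch {
        public static void main(String[] args) {
            try (RDFConnection conn =
                    RDFConnectionFactory.connect("http://localhost:3030/ds")) {
                // First call: collect the blank node objects the server returns.
                List<String> labels = new ArrayList<>();
                conn.querySelect(
                    "SELECT ?o WHERE { <http://example.com/s> <http://example.com/p> ?o }",
                    row -> {
                        RDFNode o = row.get("o");
                        if (o.isAnon()) {
                            labels.add(o.asResource().getId().getLabelString());
                        }
                    });
                // Second call: address the same blank node by the label the
                // server returned on the first call.
                for (String label : labels) {
                    conn.querySelect(
                        "SELECT ?p ?v WHERE { <_:" + label + "> ?p ?v }",
                        row -> System.out.println(row));
                }
            }
        }
    }

If the server renumbers its blank nodes between the two calls, the second
query silently matches nothing, which is exactly the failure mode I have to
assume away.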

In my case I am using SPARQL.  I execute all queries via an RDFConnection
to simplify my code and allow it to work for both local and remote datasets.
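
For illustration, the connection handling amounts to roughly this (the
dataset and the endpoint URL are placeholders); the query code never needs
to know which kind of connection it was given:

    import org.apache.jena.query.Dataset;
    import org.apache.jena.query.DatasetFactory;
    import org.apache.jena.rdf.model.Model;
    import org.apache.jena.rdfconnection.RDFConnection;
    import org.apache.jena.rdfconnection.RDFConnectionFactory;

    public class LocalOrRemote {
        public static void main(String[] args) {
            // A local, in-memory dataset ...
            Dataset local = DatasetFactory.createTxnMem();
            try (RDFConnection conn = RDFConnectionFactory.connect(local)) {
                dump(conn);
            }
            // ... and a remote endpoint, through the same interface.
            try (RDFConnection conn =
                    RDFConnectionFactory.connect("http://localhost:3030/ds")) {
                dump(conn);
            }
        }

        // The shared code path: it only ever sees the RDFConnection interface.
        static void dump(RDFConnection conn) {
            Model m = conn.queryConstruct("CONSTRUCT WHERE { ?s ?p ?o }");
            m.write(System.out, "TURTLE");
        }
    }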

Cache invalidation:  Not fully baked yet.  I am doing this in the context
of PA4RDF, so there is some leeway with respect to synchronizing the local
data with the remote data.  Currently, I expire the contents when there are
no more references to the subject (i.e. no more POJOs for that subject).
There is also a PA4RDF entity manager call that forces a resync.  I plan
to add a background thread that will refresh the subjects.
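
The expiry side, as a simplified sketch (the class and method names here
are made up for illustration, not PA4RDF's actual API):

    import java.util.Map;
    import java.util.concurrent.ConcurrentHashMap;
    import org.apache.jena.graph.Node;
    import org.apache.jena.rdf.model.Model;

    // Expire a subject's cached model once the last POJO for that
    // subject has been released.
    public class SubjectCache {
        private static class Entry {
            Model model;
            int refCount;
        }

        private final Map<Node, Entry> cache = new ConcurrentHashMap<>();

        /** Called when a POJO is created for a subject. */
        public synchronized void retain(Node subject, Model model) {
            Entry e = cache.computeIfAbsent(subject, k -> new Entry());
            e.model = model;
            e.refCount++;
        }

        /** Called when a POJO for the subject is released. */
        public synchronized void release(Node subject) {
            Entry e = cache.get(subject);
            if (e != null && --e.refCount <= 0) {
                cache.remove(subject); // no more POJOs: expire the contents
            }
        }

        /** Forced resync: drop the model so the next access re-queries. */
        public synchronized void resync(Node subject) {
            cache.remove(subject);
        }
    }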

My implementation creates a graph for each concrete subject.  Any blank
nodes appearing as objects are queried for in turn, so that triples with
blank node subjects end up in the same graph as their concrete subject.
All subject nodes are added to an index that points to the model they are
defined in.
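
In outline it looks something like this.  Again a sketch with illustrative
names; it leans on the server round-tripping its blank node labels via the
<_:label> form shown earlier:

    import java.util.ArrayList;
    import java.util.HashMap;
    import java.util.List;
    import java.util.Map;
    import org.apache.jena.rdf.model.Model;
    import org.apache.jena.rdf.model.ModelFactory;
    import org.apache.jena.rdf.model.Property;
    import org.apache.jena.rdf.model.RDFNode;
    import org.apache.jena.rdf.model.Resource;
    import org.apache.jena.rdfconnection.RDFConnection;

    public class SubjectGraphBuilder {
        // Index: every subject node points to the model it is defined in.
        private final Map<Resource, Model> subjectIndex = new HashMap<>();

        /** Build the graph for one concrete subject, including its bnode closure. */
        public Model fetch(RDFConnection conn, Resource subject) {
            Model model = ModelFactory.createDefaultModel();
            addClosure(conn, subject, model);
            return model;
        }

        private void addClosure(RDFConnection conn, Resource subject, Model model) {
            if (subjectIndex.containsKey(subject)) {
                return; // already fetched; also guards against bnode cycles
            }
            subjectIndex.put(subject, model);
            List<RDFNode> objects = new ArrayList<>();
            conn.querySelect(
                "SELECT ?p ?o WHERE { " + asSparqlTerm(subject) + " ?p ?o }",
                row -> {
                    Property p = model.createProperty(row.getResource("p").getURI());
                    RDFNode o = row.get("o");
                    model.add(subject, p, o);
                    objects.add(o);
                });
            // Query blank node objects in turn so their triples land in the
            // same graph as the concrete subject.
            for (RDFNode o : objects) {
                if (o.isAnon()) {
                    addClosure(conn, o.asResource(), model);
                }
            }
        }

        private static String asSparqlTerm(Resource r) {
            // Only works when the server round-trips its blank node labels.
            return r.isAnon()
                ? "<_:" + r.getId().getLabelString() + ">"
                : "<" + r.getURI() + ">";
        }
    }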

I am certain that there are cases where this does not work as expected,
but the graph works and passes all of the graph tests when the blank nodes
are consistently identified.


Claude

On Sun, Dec 17, 2017 at 8:20 PM, Andy Seaborne <a...@apache.org> wrote:

> Same here - this is the current use for RDF Delta where $job is running
> caches of graphs across a number of Tomcat servers. Blank nodes are handled
> by system id.  There is a request for examples on
> https://afs.github.io/rdf-delta.  Caching is one of several use cases for
> the system and one that works.
>
> On 17/12/17 17:47, Claude Warren wrote:
>
>> My requirement centres around a caching graph.
>>
>> For my requirements I know that I am interested in subjects of triples
>> so I can cache based on the subject.  Long story short.
>>
>
> How do you do cache invalidation?
>
> RDF Delta has "delete triple" patches.
>
>> When I query the RDFConnection for the second time I need the blank node
>> values to be the same as the first time so that I can properly update the
>> cache.  I do the updates with blanks in a cleaner manner where I always
>> backtrack to a non-blank subject so I can build from that.
>>
>
> So do you need round tripping, not just client-side correlation of blank
> node ids?  If you fetch data based on internal id, you need the round
> tripping.
>
> Is the fetch side done by SPARQL or in code?
>
>     Andy
>
>
>
>> Claude
>>
>> On Sun, Dec 17, 2017 at 3:38 PM, Andy Seaborne <a...@apache.org> wrote:
>>
>>
>>>
>>> On 17/12/17 15:19, ajs6f wrote:
>>>
>>>> Claude-- I'm looking at RDFConnection, but it's an interface. I think you
>>>> mean around L220 of JSONInput itself, right?
>>>>
>>>> It looks like SyntaxLabels has some LabelToNode factory methods that
>>>> might fit the bill, like createNodeToLabelAsGiven(), but JSONInput
>>>> doesn't offer any way to select which method to use. At L195 it uses
>>>> SyntaxLabels.createLabelToNode().
>>>>
>>>> We could thread such a mapping choice all the way through the call
>>>> stack, but that seems a bit difficult to me. Maybe we could introduce a
>>>> Context setting for this purpose?
>>>>
>>>>
>>> They already exist!
>>>
>>>> ajs6f
>>>
>>>>
>>>> On Dec 17, 2017, at 9:28 AM, Claude Warren <cla...@xenei.com> wrote:
>>>>
>>>>>
>>>>> Greetings,
>>>>>
>>>>> I am looking at org.apache.jena.sparql.resultset.JSONInput and the way
>>>>> in which it parses blank nodes.
>>>>>
>>>>> I have a requirement for an application such that the same blank node
>>>>> returned on multiple queries returns the same blank node id.
>>>>>
>>>>>
>>> Claude - I have the same requirement, and was checking and working on it
>>> only yesterday.
>>>
>>> What is the requirement and use case here?
>>>
>>> Mine is for updates to RDF Delta.
>>>
>>> (1) Pushing patches involving blank nodes (one-way requirement)
>>> (2) Searching the graph and sending changes based on bnodes (round-trip
>>> requirement).
>>>
>>> There are bits and pieces in different places and it could do with
>>> pulling together.  It (they - there are two mechanisms) has been around
>>> for a long time, and there are a few things to sort out as the handling
>>> isn't consistent ATM and a couple of code paths have been lost/not
>>> written.
>>>
>>> XML results work if set up right.
>>>
>>> JSONInput on its own isn't sufficient.  What did you enable in Fuseki? It
>>> would be good to know what works already.
>>>
>>> BTW - this mode is dubious in terms of spec compliance.  It also happens
>>> to be very useful.
>>>
>>>      Andy
>>>
>>>
>>>
>>>>> I have verified that Fuseki does this (given that the data is only loaded
>>>>> once -- reloading the data can renumber the nodes).  In any case Fuseki
>>>>> seems to return the ids from the underlying data store.
>>>>>
>>>>> However, when RDFConnection is executing a query it remaps the ids
>>>>> during the query.
>>>>>
>>>>> Down around line 220 it uses a labelMap to construct a new value for
>>>>> the bnode.  My question is:
>>>>>
>>>>> Is there a simple way to have the LabelMap return the same value for
>>>>> the same blank node across multiple queries?  (Assuming the value does
>>>>> not change in the data store.)
>>>>>
>>>>> I know there was a discussion of using UUIDs or some such to generate
>>>>> the blank ids on the way into the graph, but I don't see any way for
>>>>> RDFConnection to return them consistently.
>>>>>
>>>>> Claude
>>>>>
>>>>>
>>>>>
>>>>>
>>>>>
>>>>> --
>>>>> I like: Like Like - The likeliest place on the web
>>>>> <http://like-like.xenei.com>
>>>>> LinkedIn: http://www.linkedin.com/in/claudewarren
>>>>>
>>>>>
>>>>
>>>>
>>
>>


-- 
I like: Like Like - The likeliest place on the web
<http://like-like.xenei.com>
LinkedIn: http://www.linkedin.com/in/claudewarren
