> Perhaps we should look at a design for distributed update notification?

If this is something that seems interesting, it's worth noting that there is a 
bunch of work on which to rely for this kind of problem. For example, 
colleagues of mine have had success with Apache Kafka. No need for Jena to 
reinvent any wheels.
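
As a rough illustration of the transaction-grained notification Andy describes 
in the quoted message below, here is a minimal sketch in Python. It is not Jena 
or Kafka code: `ChangeSet`, `InMemoryLog`, and `Transaction` are hypothetical 
names, and the in-memory log only stands in for whatever shared log or broker 
would really be used. The idea it shows is one small event per committed 
transaction, with listeners free to fetch the full change or ignore it.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Tuple

Triple = Tuple[str, str, str]


@dataclass
class ChangeSet:
    """All adds and deletes from one transaction (hypothetical structure)."""
    txn_id: int
    adds: List[Triple] = field(default_factory=list)
    deletes: List[Triple] = field(default_factory=list)


class InMemoryLog:
    """Stand-in for a shared log or broker; an assumption, not a real client."""

    def __init__(self) -> None:
        self._changes: Dict[int, ChangeSet] = {}
        self._subscribers: List[Callable[[int], None]] = []

    def subscribe(self, callback: Callable[[int], None]) -> None:
        self._subscribers.append(callback)

    def publish(self, change: ChangeSet) -> None:
        # Store the full change, but notify listeners with the id only.
        self._changes[change.txn_id] = change
        for callback in list(self._subscribers):
            callback(change.txn_id)

    def fetch(self, txn_id: int) -> ChangeSet:
        # Listeners that care about the content pull it on demand.
        return self._changes[txn_id]


class Transaction:
    """Collects changes locally and publishes them once, at commit."""

    def __init__(self, log: InMemoryLog, txn_id: int) -> None:
        self._log = log
        self._change = ChangeSet(txn_id)

    def add(self, triple: Triple) -> None:
        self._change.adds.append(triple)

    def delete(self, triple: Triple) -> None:
        self._change.deletes.append(triple)

    def commit(self) -> None:
        self._log.publish(self._change)


if __name__ == "__main__":
    log = InMemoryLog()
    notified: List[int] = []
    log.subscribe(notified.append)  # this listener only records "something changed"

    txn = Transaction(log, txn_id=1)
    for i in range(1000):
        txn.add(("ex:s", "ex:p", str(i)))
    txn.commit()

    # One notification for the whole transaction, not one per triple.
    print(len(notified), len(log.fetch(1).adds))
```

A listener like a reasoner could react to the notification alone, while a 
replica would call `fetch` to get the adds and deletes and apply them.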

---
A. Soroka
The University of Virginia Library

> On Jan 11, 2017, at 7:24 AM, Claude Warren <cla...@xenei.com> wrote:
> 
> Cassandra does not support transactions.  However, I can see the use for a
> bounding construct that indicates "all these triples were added/deleted."
> 
> The original driver for the Cassandra implementation was that we needed one
> for a use case at work.  However, Cassandra is generally about speed of
> writes and scale. So expect that the Cassandra implementation will be
> viewed as large scale.  This leads me to some concerns about hash joins but
> that is a side issue to be addressed in the Op stuff for the Cassandra
> implementation.
> 
> Perhaps we should look at a design for distributed update notification?
> 
> Claude
> 
> On Wed, Jan 11, 2017 at 11:53 AM, Andy Seaborne <a...@apache.org> wrote:
> 
>> Hi Claude,
>> 
>> On 09/01/17 19:22, Claude Warren wrote:
>> 
>>> Greetings,
>>> 
>>> Given that the Cassandra server can host multiple clients and those
>>> clients can open the same graph on the server simultaneously, we have
>>> basically two updatable synchronized views on one data set.
>>> 
>>> Assume graph A is opened on client X and client Y and applications at X
>>> and
>>> Y both register listeners on the graph.
>>> 
>>> If application at X deletes a triple should the listener at Y be notified?
>>> 
>>> I have been thinking about adding a queue-based (JMS 1.1?) listener
>>> implementation so that distributed systems would be notified of changes
>>> from remote systems.
>>> 
>> 
>> 
>>> The second question deals with reasoners.  If the reasoners are using the
>>> distributed graph store then I don't think there is an issue so perhaps
>>> this one goes away but.....
>>> 
>>> Reasoners do not like (don't respond to) data that is written into the
>>> graph behind the scenes.  In a distributed environment does it make sense
>>> to somehow utilize the graph listener messages (as noted above) to fire
>>> update rules?
>>> 
>> 
>> The current reasoners are written assuming changes happen via their API.
>> 
>> 
>>> Are there other issues that I have missed?
>>> 
>> 
>> The current event mechanism provides synchronous events - when the "add"
>> has returned the event has been fired.
>> 
>> 
>> The Cassandra graph is about scale?
>> 
>> Per-triple events are very fine-grained - what if a million triples are
>> added - will there be a million events?  And the whole update is sent to
>> all listeners by JMS?
>> 
>> 100 million triples?
>> 
>> 
>> Where I'm going with Delta is that changes get recorded (RDF Patch) so
>> that they can be applied elsewhere but also inspected. The granularity is
>> different - the grain size is the transaction - a bunch of adds and deletes.
>> 
>> So a different design is that an event fires on change at the end of the
>> transaction, and that event is just a notification that something has
>> happened; the event listeners can choose to inspect the change or not.
>> 
>> I've also done designs where the patch drives sending a SPARQL Update to a remote
>> copy - generally pure-push systems rather ones that pull the bulk changes
>> because of issues like whether the client is ready to process the event, is
>> actually running, there is a comms glitch or whether the client is actually
>> interested in the content - they may, like the reasoners, be interested
>> that a change has happened but their action is not simply on the content of
>> just the change.
>> 
>>        Andy
>> 
> 
> 
> 
> -- 
> I like: Like Like - The likeliest place on the web
> <http://like-like.xenei.com>
> LinkedIn: http://www.linkedin.com/in/claudewarren
