Re: Listeners and remote Graphs.

2017-01-12 Thread Claude Warren
Yeah, but I am not certain that Cassandra will return them in the "right
order".  I have to delve into that a bit more.

On Thu, Jan 12, 2017 at 10:48 AM, Andy Seaborne  wrote:

>
>
> On 11/01/17 12:24, Claude Warren wrote:
>
>> Cassandra does not support transactions.  However, I can see the use for a
>> bounding construct that indicates "all these triples were added/deleted."
>>
>> The original driver for the Cassandra implementation was that we needed
>> one for a use case at work.  However, Cassandra is generally about speed
>> of writes and scale. So expect that the Cassandra implementation will be
>> viewed as large scale.  This leads me to some concerns about hash joins,
>> but that is a side issue to be addressed in the Op stuff for the
>> Cassandra implementation.
>>
>
> IF the results come back (1) streaming and (2) in the right order
> THEN
>   Merge join.
> FI
>


-- 
I like: Like Like - The likeliest place on the web

LinkedIn: http://www.linkedin.com/in/claudewarren


Re: Listeners and remote Graphs.

2017-01-12 Thread Andy Seaborne



On 11/01/17 12:24, Claude Warren wrote:

> Cassandra does not support transactions.  However, I can see the use for a
> bounding construct that indicates "all these triples were added/deleted."
>
> The original driver for the Cassandra implementation was that we needed one
> for a use case at work.  However, Cassandra is generally about speed of
> writes and scale. So expect that the Cassandra implementation will be
> viewed as large scale.  This leads me to some concerns about hash joins but
> that is a side issue to be addressed in the Op stuff for the Cassandra
> implementation.

IF the results come back (1) streaming and (2) in the right order
THEN
  Merge join.
FI
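
To make the merge-join condition concrete, here is a minimal sketch in plain Python. It is not Jena or Cassandra code: the name merge_join and the (key, value) row shape are assumptions made for illustration. It only works when both inputs arrive already sorted on the join key, which is exactly the "right order" condition above.

```python
from itertools import groupby
from operator import itemgetter
from typing import Iterable, Iterator, Tuple

Row = Tuple[str, str]  # (join key, payload); an illustrative row shape


def merge_join(left: Iterable[Row], right: Iterable[Row]) -> Iterator[Tuple[str, str, str]]:
    """Join two streams that are BOTH already sorted by key.

    Each input is read once, front to back, so results can be streamed.
    Only the right-hand rows for the current key are buffered.
    """
    end = (None, None)
    lgroups = groupby(left, key=itemgetter(0))
    rgroups = groupby(right, key=itemgetter(0))
    lkey, lrows = next(lgroups, end)
    rkey, rrows = next(rgroups, end)
    while lrows is not None and rrows is not None:
        if lkey < rkey:
            # Left key has no partner; advance the left stream.
            lkey, lrows = next(lgroups, end)
        elif lkey > rkey:
            # Right key has no partner; advance the right stream.
            rkey, rrows = next(rgroups, end)
        else:
            # Same key on both sides: emit every left/right combination.
            rbuf = [value for _, value in rrows]
            for _, lvalue in lrows:
                for rvalue in rbuf:
                    yield (lkey, lvalue, rvalue)
            lkey, lrows = next(lgroups, end)
            rkey, rrows = next(rgroups, end)
```

If either input is not sorted on the key, this silently drops matches, which is why the ordering guarantee from the store matters before choosing this over a hash join.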






Re: Listeners and remote Graphs.

2017-01-11 Thread Claude Warren
To be honest, I was thinking that if the graph notification for the
Cassandra graph needed to notify all the attached graph implementations of
changes, I would use Kafka or another similar pluggable queue to do the
notification.

Claude

On Wed, Jan 11, 2017 at 2:48 PM, A. Soroka  wrote:

> > Perhaps we should look at a design for distributed update notification?
>
> If this is something that seems interesting, it's worth noting that there
> is a bunch of work on which to rely for this kind of problem. For example,
> colleagues of mine have had success with Apache Kafka. No need for Jena to
> reinvent any wheels.
>
> ---
> A. Soroka
> The University of Virginia Library
>


Re: Listeners and remote Graphs.

2017-01-11 Thread Claude Warren
Cassandra does not support transactions.  However, I can see the use for a
bounding construct that indicates "all these triples were added/deleted."
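
As a rough illustration of such a bounding construct, the sketch below groups a flat message stream into batches using explicit begin and end markers. The message kinds ("BEGIN", "ADD", "DELETE", "END") and the function name group_batches are invented for this example; nothing here is an existing Jena or Cassandra API, and the markers give grouping only, not transactional isolation.

```python
from typing import Iterable, Iterator, List, Tuple

Triple = Tuple[str, str, str]


def group_batches(messages: Iterable[tuple]) -> Iterator[Tuple[str, List[Triple], List[Triple]]]:
    """Yield (batch_id, adds, deletes) each time a closing END marker is seen.

    A batch is only handed to the consumer once it is complete, so a reader
    never acts on a half-delivered set of changes.
    """
    batch_id = None
    adds: List[Triple] = []
    deletes: List[Triple] = []
    for kind, value in messages:
        if kind == "BEGIN":
            batch_id, adds, deletes = value, [], []
        elif kind in ("ADD", "DELETE"):
            if batch_id is None:
                raise ValueError("change received outside BEGIN/END markers")
            (adds if kind == "ADD" else deletes).append(value)
        elif kind == "END":
            if value != batch_id:
                raise ValueError("END marker does not match the open batch")
            yield batch_id, adds, deletes
            batch_id = None
        else:
            raise ValueError(f"unknown message kind: {kind!r}")
```

A batch whose END marker never arrives is simply never yielded, so the consumer has to decide separately how long to wait before treating it as lost.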

The original driver for the Cassandra implementation was that we needed one
for a use case at work.  However, Cassandra is generally about speed of
writes and scale. So expect that the Cassandra implementation will be
viewed as large scale.  This leads me to some concerns about hash joins but
that is a side issue to be addressed in the Op stuff for the Cassandra
implementation.

Perhaps we should look at a design for distributed update notification?

Claude



Re: Listeners and remote Graphs.

2017-01-11 Thread Andy Seaborne

Hi Claude,

On 09/01/17 19:22, Claude Warren wrote:

> Greetings,
>
> The Cassandra server can host multiple clients, and those clients can
> open the same graph on the server simultaneously: basically, two
> updatable synchronized views on one data set.
>
> Assume graph A is opened on client X and client Y, and applications at X
> and Y both register listeners on the graph.
>
> If the application at X deletes a triple, should the listener at Y be
> notified?
>
> I have been thinking about adding a queue-based (JMS 1.1?) listener
> implementation so that distributed systems would be notified of changes
> from remote systems.
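
The X and Y scenario can be sketched roughly as follows. This is an in-memory stand-in only: Broker and GraphClient are hypothetical names, queue.Queue takes the place of a real JMS or Kafka topic, and the write to the shared store is left as a comment.

```python
import queue
from typing import Callable, Dict, Tuple

Triple = Tuple[str, str, str]
Listener = Callable[[str, Triple], None]  # called with ("add" | "delete", triple)


class Broker:
    """In-memory stand-in for a broker topic (hypothetical, not JMS or Kafka)."""

    def __init__(self) -> None:
        self._queues: Dict[str, queue.Queue] = {}

    def subscribe(self, client_id: str) -> queue.Queue:
        q: queue.Queue = queue.Queue()
        self._queues[client_id] = q
        return q

    def publish(self, event: tuple) -> None:
        # Fan the event out to every subscriber, including the sender.
        for q in self._queues.values():
            q.put(event)


class GraphClient:
    """One client's view of the shared graph, with a single local listener."""

    def __init__(self, client_id: str, broker: Broker, listener: Listener) -> None:
        self.client_id = client_id
        self._broker = broker
        self._inbox = broker.subscribe(client_id)
        self._listener = listener

    def delete(self, triple: Triple) -> None:
        # The delete against the shared store would happen here.
        self._listener("delete", triple)  # local, synchronous notification
        self._broker.publish((self.client_id, "delete", triple))

    def poll(self) -> int:
        """Deliver pending remote events locally; skip events we sent ourselves."""
        delivered = 0
        while True:
            try:
                origin, kind, triple = self._inbox.get_nowait()
            except queue.Empty:
                return delivered
            if origin != self.client_id:
                self._listener(kind, triple)
                delivered += 1
```

Note that the listener at Y hears about the delete only when Y polls, so remote notification in this shape is asynchronous, unlike the local one.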




> The second question deals with reasoners.  If the reasoners are using the
> distributed graph store then I don't think there is an issue, so perhaps
> this one goes away, but:
>
> Reasoners do not respond to data that is written into the graph behind
> the scenes.  In a distributed environment, does it make sense to somehow
> use the graph listener messages (as noted above) to fire update rules?


The current reasoners are written assuming changes happen via their API.



> Are there other issues that I have missed?


The current event mechanism provides synchronous events - when the "add" 
has returned the event has been fired.
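
A tiny sketch of what "synchronous" means here, using a made-up ListeningGraph class rather than the real Jena event classes: the listener runs inside add, so by the time add returns the event has already been handled.

```python
from typing import Callable, List, Set, Tuple

Triple = Tuple[str, str, str]


class ListeningGraph:
    """Toy graph whose listeners are called inline, before add() returns."""

    def __init__(self) -> None:
        self._triples: Set[Triple] = set()
        self._listeners: List[Callable[[Triple], None]] = []

    def register(self, listener: Callable[[Triple], None]) -> None:
        self._listeners.append(listener)

    def add(self, triple: Triple) -> None:
        self._triples.add(triple)
        for listener in self._listeners:
            listener(triple)  # runs on the caller's thread, inside add()
```

A queue-based remote listener cannot keep this guarantee without blocking the writer until every remote client has acknowledged the event.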



The Cassandra graph is about scale?

Per-triple events are very fine-grained - what if a million triples are
added - will there be a million events?  And is the whole update sent to
all listeners by JMS?


100 million triples?


Where I'm going with Delta is that changes get recorded (RDF Patch) so 
that they can be applied elsewhere but also inspected. The granularity 
is different - the grain size is the transaction - a bunch of adds and 
deletes.
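
As a rough sketch of recording changes at transaction grain, assuming a toy in-memory graph: RecordingGraph and apply_patch are hypothetical names, and the ("A", triple) / ("D", triple) records are a simplification for illustration, not the RDF Patch syntax.

```python
from typing import List, Optional, Set, Tuple

Triple = Tuple[str, str, str]
Change = Tuple[str, Triple]  # ("A", triple) for add, ("D", triple) for delete


class RecordingGraph:
    """Toy graph that records each transaction's changes as one patch."""

    def __init__(self) -> None:
        self.triples: Set[Triple] = set()
        self.patches: List[Tuple[Change, ...]] = []
        self._open: Optional[List[Change]] = None

    def begin(self) -> None:
        self._open = []

    def _record(self, change: Change) -> None:
        if self._open is None:
            raise RuntimeError("no transaction in progress")
        self._open.append(change)

    def add(self, triple: Triple) -> None:
        self._record(("A", triple))
        self.triples.add(triple)

    def delete(self, triple: Triple) -> None:
        self._record(("D", triple))
        self.triples.discard(triple)

    def commit(self) -> Tuple[Change, ...]:
        if self._open is None:
            raise RuntimeError("no transaction in progress")
        patch = tuple(self._open)
        self.patches.append(patch)
        self._open = None
        return patch


def apply_patch(triples: Set[Triple], patch: Tuple[Change, ...]) -> None:
    """Replay a recorded patch, in order, onto another copy of the data."""
    for op, triple in patch:
        if op == "A":
            triples.add(triple)
        else:
            triples.discard(triple)
```

Because the patch is plain data, it can be inspected (for example, counting adds and deletes) as well as replayed elsewhere.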


So a different design is an event on change at the end of the
transaction; that event is just a notification that something has happened,
and the event listeners can choose to inspect the change or not.
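
A minimal sketch of that design, with invented names (TxnNotifier, fetch): the notification carries only a transaction id, and a listener that cares about the content asks for it separately.

```python
from typing import Callable, Dict, Iterable, List, Tuple

Triple = Tuple[str, str, str]
Change = Tuple[str, Triple]  # ("A" | "D", triple); illustrative record shape


class TxnNotifier:
    """Fires one small event per committed transaction, not one per triple."""

    def __init__(self) -> None:
        self._listeners: List[Callable[[int], None]] = []
        self._changes: Dict[int, Tuple[Change, ...]] = {}
        self._next_id = 1

    def register(self, listener: Callable[[int], None]) -> None:
        self._listeners.append(listener)

    def commit(self, changes: Iterable[Change]) -> int:
        txn_id = self._next_id
        self._next_id += 1
        self._changes[txn_id] = tuple(changes)
        for listener in self._listeners:
            listener(txn_id)  # the event carries only the id, never the content
        return txn_id

    def fetch(self, txn_id: int) -> Tuple[Change, ...]:
        """Let an interested listener pull the recorded change on demand."""
        return self._changes[txn_id]
```

With this shape, a million added triples in one transaction produce one event, and only the listeners that call fetch pay for moving the data.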


I've also done setups where the patch drives sending a SPARQL Update to a
remote copy - generally pure-push systems rather than ones that pull the
bulk changes, because of issues like whether the client is ready to process
the event, is actually running, whether there is a comms glitch, or whether
the client is actually interested in the content - they may, like the
reasoners, be interested that a change has happened but their action is
not simply on the content of just the change.
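
To illustrate a consumer that cares that a change happened rather than what it was, here is a sketch of a reasoner-like derived view. DerivedView and transitive_closure are hypothetical names, and the "subClassOf" closure is a stand-in for real inference, not how the Jena reasoners work.

```python
from typing import Callable, Iterable, Optional, Set, Tuple

Triple = Tuple[str, str, str]


def transitive_closure(triples: Iterable[Triple]) -> Set[Tuple[str, str]]:
    """Naive closure over 'subClassOf' edges; fine for a toy example only."""
    pairs = {(s, o) for s, p, o in triples if p == "subClassOf"}
    changed = True
    while changed:
        changed = False
        for a, b in list(pairs):
            for c, d in list(pairs):
                if b == c and (a, d) not in pairs:
                    pairs.add((a, d))
                    changed = True
    return pairs


class DerivedView:
    """Caches data derived from the whole graph; a change event marks it stale."""

    def __init__(self, read_triples: Callable[[], Iterable[Triple]]) -> None:
        self._read = read_triples
        self._cache: Optional[Set[Tuple[str, str]]] = None
        self._stale = True
        self.recomputations = 0

    def on_change(self, txn_id: Optional[int] = None) -> None:
        # The content of the change is ignored; only the fact of it matters.
        self._stale = True

    def value(self) -> Set[Tuple[str, str]]:
        if self._stale:
            self._cache = transitive_closure(self._read())
            self._stale = False
            self.recomputations += 1
        return self._cache
```

Without the on_change call, a write made behind the scenes leaves the cached closure out of date, which is the behaviour described for the reasoners above.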


Andy